In this article we consider an exploratory graphical approach to multivariate outlier detection based on Kohonen networks (Kohonen, 1982, 1995). These networks, generally known as self-organising maps (SOM), are able to find interesting low-dimensional projections of high-dimensional data. The utility of the SOM-based strategy for controlling data quality and finding multidimensional outliers, especially for Statistical Offices, arises from a number of reasons: it is an easy-to-interpret tool for routine exploration of large data sets, it can be used in any context without specifying an underlying model, and it has a very low computational cost. An example on a real data set shows that the SOM can be expected to work reasonably well in visualising multivariate outliers. In particular, the outliers identified are in general agreement with those detected by other well-known statistical procedures such as factor analysis and k-means cluster analysis. The SOM is also shown to be a robust method, since no substantial difference in the qualitative behaviour of the algorithm is empirically observed when either alternative neighbourhood functions or differently sized maps are chosen.
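As a rough illustration of the idea summarised above, the following minimal sketch trains a small SOM in plain Python and ranks observations by their distance to the best-matching unit. It is not the paper's implementation: the synthetic data, the 4x4 map, the Gaussian neighbourhood, the linear decay schedules and the function names (`train_som`, `best_matching_unit`, `quantization_errors`) are all assumptions made for the example, and the quantisation-error ranking is a simple numerical stand-in for the visual inspection of the map that the abstract describes.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda unit: math.dist(weights[unit], x))


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.3, sigma0=2.0, seed=0):
    """Train a rectangular SOM with a Gaussian neighbourhood (illustrative settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Start every unit near the coordinate-wise mean of the data, with small noise.
    mean = [sum(col) / len(data) for col in zip(*data)]
    weights = {
        (r, c): [m + rng.uniform(-0.1, 0.1) for m in mean]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for j in range(dim):
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights


def quantization_errors(weights, data):
    """Distance from each observation to its best-matching unit."""
    return [math.dist(weights[best_matching_unit(weights, x)], x) for x in data]


# Synthetic data (an assumption): 200 standard-normal points in 3-D
# plus one planted outlier appended as the last observation.
rng = random.Random(1)
data = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
data.append([10.0, 10.0, 10.0])

weights = train_som(data)
errors = quantization_errors(weights, data)
suspect = max(range(len(data)), key=errors.__getitem__)
print("most suspicious observation:", suspect, "error:", round(errors[suspect], 2))
```

Observations that the map represents poorly receive the largest errors, so sorting by `errors` gives a shortlist to inspect. Changing `rows`, `cols` or the neighbourhood function `h` is one way to check informally whether the shortlist is stable across map configurations.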
Multivariate outliers detection with Kohonen networks: an useful tool for routine exploration of large data sets / Morlini, Isabella. - Print. - 2:(1998), pp. 345-350. (Paper presented at the Seminar on New Techniques and Technologies for Statistics, held in Sorrento, Italy, 4-6 November 1998).
Multivariate outliers detection with Kohonen networks: an useful tool for routine exploration of large data sets
MORLINI, Isabella
1998
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal licence, while the publication files are released under the Attribution 4.0 International licence (CC BY 4.0), unless otherwise indicated.