In the literature there are only a few papers concerned with classification methods for multi-way arrays. The most common procedure, by far, is to unfold the multi-way data array into an ordinary matrix and then to apply the traditional multivariate tools for classification. As opposed to unfolding the data, several possibilities exist for building classification models more directly based on the multi-way structure of the data. As an example, multi-way partial least squares discriminant analysis has been used as a supervised classification method; another alternative that has been investigated is to perform classification using Fisher's LDA or SIMCA on the score matrix from, e.g., a PARAFAC or a Tucker model. Despite a few attempts at applying such multi-way classification approaches, no one has looked into how such models are best built and implemented.

In this work, the SIMCA method is extended to three-way arrays. Also included is actual code that works on general multi-way arrays rather than just three-way arrays. In analogy with two-way SIMCA, a decomposition model is built separately for the multi-way data of each class, using a multi-way decomposition method such as PARAFAC or Tucker3. In choosing the best class dimensionality, i.e. the number of latent factors, both the results of cross-validation and, mainly, the sensitivity/specificity values are evaluated. To estimate the class limits for each class model, orthogonal and score distances are considered, and different statistics are implemented and tested to set confidence limits for these two parameters. Classification performance is compared using different definitions of class boundaries and classification rules, including the use of cross-validated residuals and scores.

The proposed N-SIMCA methodology and code have been tested on simulated data sets of varying dimensionality and on two case studies concerning food authentication tasks for typical food products.
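The class-modelling idea in the abstract can be illustrated with a minimal Python (NumPy) sketch. This is not the authors' N-SIMCA code. It shows only the unfolding baseline that the abstract calls the most common procedure, with ordinary PCA swapped in for PARAFAC or Tucker3. It also uses an empirical 95th-percentile limit as a stand-in for the statistics evaluated in the paper. All function names and the simulated data are hypothetical.

```python
import numpy as np


def unfold(X):
    """Unfold an (I x J x K x ...) array into an I x (J*K*...) matrix, samples as rows."""
    return X.reshape(X.shape[0], -1)


def fit_class_model(X_class, n_components):
    """Fit a PCA model to the unfolded data of ONE class (SIMCA-style class model)."""
    A = unfold(X_class)
    mean = A.mean(axis=0)
    _, s, Vt = np.linalg.svd(A - mean, full_matrices=False)
    loadings = Vt[:n_components].T
    # Variance of each score column over the training samples.
    score_var = s[:n_components] ** 2 / (A.shape[0] - 1)
    return {"mean": mean, "loadings": loadings, "score_var": score_var}


def distances(model, X_new):
    """Return (score distance, orthogonal distance) for every sample in X_new."""
    A = unfold(X_new) - model["mean"]
    T = A @ model["loadings"]                 # scores inside the model space
    residual = A - T @ model["loadings"].T    # part left outside the model space
    sd = np.sqrt(np.sum(T ** 2 / model["score_var"], axis=1))
    od = np.sqrt(np.sum(residual ** 2, axis=1))
    return sd, od


def accept(model, limits, X_new):
    """Accept a sample only if both distances fall within the class limits."""
    sd, od = distances(model, X_new)
    return (sd <= limits[0]) & (od <= limits[1])


# Simulated three-way data (assumption): two trilinear factors plus noise.
rng = np.random.default_rng(0)
B = rng.normal(size=(2, 4))
C = rng.normal(size=(2, 5))


def simulate(n_samples, offset):
    scores = rng.normal(size=(n_samples, 2))
    X = np.einsum("ir,rj,rk->ijk", scores, B, C)
    return X + offset + 0.1 * rng.normal(size=X.shape)


X_a = simulate(30, offset=0.0)   # target class
X_b = simulate(30, offset=5.0)   # other class, shifted away from the target

model_a = fit_class_model(X_a, n_components=2)
sd_train, od_train = distances(model_a, X_a)
# Empirical 95th-percentile limits (assumption, not the paper's statistics).
limits_a = (np.quantile(sd_train, 0.95), np.quantile(od_train, 0.95))

sensitivity = np.mean(accept(model_a, limits_a, X_a))        # own samples accepted
specificity = 1.0 - np.mean(accept(model_a, limits_a, X_b))  # other samples rejected
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In a full SIMCA workflow one such model would be fitted for each class, and the number of components would be chosen by comparing cross-validation results and sensitivity/specificity, as the abstract describes. Here the sensitivity is computed on the training samples themselves, so it is optimistic.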
A classification tool for N-way array based on SIMCA methodology / Durante, Caterina; Bro, Rasmus; Cocchi, Marina. - In: CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS. - ISSN 0169-7439. - Print. - 106:1(2011), pp. 73-85. [10.1016/j.chemolab.2010.09.004]
A classification tool for N-way array based on SIMCA methodology
DURANTE, Caterina; COCCHI, Marina
2011
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.