Large volumes of data are routinely collected during bioprocess operations and, more recently, in basic biological research using genomics-based technologies. While these data often lack sufficient detail to be used for mechanism identification, it is possible that the underlying mechanisms affecting cell phenotype or process outcome are reflected as specific patterns in the overall or temporal sensor logs. This raises the possibility of identifying outcome-specific fingerprints that can be used for process or phenotype classification and the identification of discriminating characteristics, such as specific genes or process variables. The aim of this work is to provide a systematic approach to identifying and modeling patterns in historical records and using this information for process classification. This approach differs from others in that emphasis is placed on analyzing the data structure first and thereby extracting potentially relevant features prior to model creation. The initial step in this overall approach is to first identify the discriminating features of the relevant measurements and time windows, which can then be subsequently used to discriminate among different classes of process behavior. This is achieved via a mean hypothesis testing algorithm. Next, the homogeneity of the multivariate data in each class is explored via a novel cluster analysis technique called PC1 Time Series Clustering to ensure that the data subsets used accurately reflect the variability displayed in the historical records. This will be the topic of the second paper in this series. We present here the method for identifying discriminating features in data via mean hypothesis testing along with results from the analysis of case studies from industrial fermentations.
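As a rough illustration of what testing for a difference in means across time windows could look like, the sketch below compares two classes of runs window by window using Welch's t statistic. This is not the algorithm from the paper, whose details are not given in the abstract. The function names, the fixed non-overlapping windows, the toy data, and the cutoff `t_crit` are all assumptions made for the example.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: the statistic is 0 or unbounded.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def discriminating_windows(class_a, class_b, window, t_crit):
    """Return (start, end, t) for each non-overlapping time window in which
    the window-averaged reading differs between classes with |t| > t_crit.

    class_a, class_b: lists of runs; each run is a list of readings of one
    process variable taken at the same time points (hypothetical layout).
    """
    n_points = len(class_a[0])
    hits = []
    for start in range(0, n_points - window + 1, window):
        end = start + window
        a = [mean(run[start:end]) for run in class_a]
        b = [mean(run[start:end]) for run in class_b]
        t = welch_t(a, b)
        if abs(t) > t_crit:
            hits.append((start, end, t))
    return hits


# Toy data (invented for illustration): the two classes agree early in the
# run and diverge in the second half.
high_yield = [
    [1.0, 1.1, 0.9, 2.0, 2.1, 2.2],
    [1.1, 1.0, 1.0, 2.2, 2.0, 2.1],
    [0.9, 1.0, 1.1, 2.1, 2.2, 2.0],
]
low_yield = [
    [1.0, 0.9, 1.1, 1.0, 1.1, 0.9],
    [1.1, 1.0, 0.9, 1.1, 1.0, 1.0],
    [1.0, 1.1, 1.0, 0.9, 1.0, 1.1],
]

# t_crit is an arbitrary illustrative cutoff, not a value from the paper.
hits = discriminating_windows(high_yield, low_yield, window=3, t_crit=4.3)
for start, end, t in hits:
    print(f"window [{start}, {end}) separates the classes, t = {t:.1f}")
```

In practice the same loop would run over every measured variable, and the cutoff would be chosen with the number of tests in mind, since many variable and window combinations are examined at once.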

Mining of Biological Data I: Identifying Discriminating Features via Mean Hypothesis Testing / Kamimura, R. T.; Bicciato, Silvio; Shimizu, H.; Alford, J.; Stephanopoulos, G. N. - In: METABOLIC ENGINEERING. - ISSN 1096-7176. - Print. - 2(3):(2000), pp. 218-227. [10.1006/mben.2000.0154]

Mining of Biological Data I: Identifying Discriminating Features via Mean Hypothesis Testing

BICCIATO, Silvio;
2000-01-01

Kamimura, R. T.; Bicciato, Silvio; Shimizu, H.; Alford, J.; Stephanopoulos, G. N.
Use this identifier to cite or link to this document: https://hdl.handle.net/11380/421609