The spectrum of modern molecular high-throughput assaying includes diverse technologies such as microarray gene expression, miRNA expression, proteomics, and DNA methylation, among many others. Now that these technologies have matured and become increasingly accessible, the next frontier is to collect "multi-modal" data for the same set of subjects and conduct integrative, multi-level analyses. While multi-modal data does contain distinct biological information that can be useful for answering complex biology questions, its value for predicting clinical phenotypes and the contribution of each type of input remain unknown. We obtained 47 datasets/predictive tasks that in total span over 9 data modalities and executed analytic experiments for predicting various clinical phenotypes and outcomes. First, we analyzed each modality separately using uni-modal approaches based on several state-of-the-art supervised classification and feature selection methods. Then, we applied integrative multi-modal classification techniques. We found that gene expression is the most predictively informative modality. Other modalities such as protein expression, miRNA expression, and DNA methylation also provide highly predictive results, which are often statistically comparable but not superior to those from gene expression data. Integrative multi-modal analyses generally do not increase predictive signal compared to gene expression data alone.

Information content and analysis methods for multi-modal high-throughput biomedical data / Ray, B; Henaff, M; Ma, S; Efstathiadis, E; Peskin, Er; Picone, Marco; Poli, Tito; Aliferis, Cf; Statnikov, A.. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 4:(2014), pp. 1-10. [10.1038/srep04411]

Information content and analysis methods for multi-modal high-throughput biomedical data

PICONE, Marco
2014

Files in this record:
Nature Scientific Reports 2014.pdf
Access: Open access
Type: Publisher's published version
Size: 882.42 kB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1198854