Gene fusions play a very important role in the study of cancer development. In this regard, predicting the probability that a protein fusion transcript leads to cancer is a very challenging and not yet fully explored research problem. To date, all the approaches available in the literature try to explain the oncogenic potential of gene fusions through protein domain analysis, which is cancer-specific and not easy to adapt to newly available information. In our work, we take the raw protein sequences as the input baseline and propose the use of deep learning, and more specifically Convolutional Neural Networks, to infer the oncogenicity probability score of gene fusion transcripts and to group them into a number of categories (e.g., oncogenic/not oncogenic). This is an inherently flexible methodology that, unlike previous approaches, can be re-trained with very little effort on newly available data (for example, from a different cancer). Based on experimental results on a large dataset of pre-annotated gene fusions, our method predicts the oncogenic potential of gene fusion transcripts with an accuracy of about 72%, which increases to 86% if we consider only the instances that are classified with a high confidence level.
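As a rough illustration of two ideas mentioned in the abstract, the sketch below shows one common way to turn a raw protein sequence into a numeric matrix that a convolutional network could consume, and one way to compute accuracy only over predictions made with high confidence. It is not the authors' pipeline: the 20-letter alphabet, the fixed-length zero padding, the function names, and the definition of confidence as distance from 0.5 are all assumptions made for this example.

```python
# Illustrative sketch only; every name and design choice here is an assumption,
# not taken from the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_encode(sequence, length):
    """Encode a protein sequence as a length x 20 matrix.

    Sequences longer than `length` are truncated, shorter ones are
    zero-padded, and symbols outside the alphabet become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, residue in enumerate(sequence[:length]):
        col = index.get(residue)
        if col is not None:
            matrix[pos][col] = 1
    return matrix


def confident_accuracy(probabilities, labels, margin):
    """Accuracy over predictions whose score is at least `margin` from 0.5.

    Returns (accuracy, instances_kept); accuracy is None if nothing is kept.
    """
    kept = [(p, y) for p, y in zip(probabilities, labels)
            if abs(p - 0.5) >= margin]
    if not kept:
        return None, 0
    correct = sum(1 for p, y in kept if (p >= 0.5) == bool(y))
    return correct / len(kept), len(kept)
```

Calling `confident_accuracy` with `margin=0` scores every instance, while a larger margin discards the uncertain ones; this is the kind of trade-off behind reporting one accuracy for all instances and a higher one for the confidently classified subset.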

A Deep Learning Approach to the Screening of Oncogenic Gene Fusions in Humans / Lovino, Marta; Urgese, Gianvito; Macii, Enrico; Di Cataldo, Santa; Ficarra, Elisa. - In: INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES. - ISSN 1422-0067. - 20:7(2019), pp. 1-13. [10.3390/ijms20071645]

A Deep Learning Approach to the Screening of Oncogenic Gene Fusions in Humans

Lovino, Marta; Ficarra, Elisa
2019

Files in this item:
ijms-20-01645-v2.pdf (open access)
Type: Publisher's published version
Size: 393.16 kB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1240337
Citations
  • PMC ND
  • Scopus 11
  • ISI 9