Hi-C analysis: from data generation to integration / Pal, Koustav; Forcato, Mattia; Ferrari, Francesco. - In: BIOPHYSICAL REVIEWS. - ISSN 1867-2450. - 11:1(2019), pp. 67-78. [10.1007/s12551-018-0489-1]
Hi-C analysis: from data generation to integration
Forcato, Mattia;
2019
Abstract
In the epigenetics field, large-scale functional genomics datasets of ever-increasing size and complexity have been produced using experimental techniques based on high-throughput sequencing. In particular, the study of the 3D organization of chromatin has raised increasing interest, thanks to the development of advanced experimental techniques. In this context, Hi-C has been widely adopted as a high-throughput method to measure pairwise contacts between virtually any pair of genomic loci, thus yielding unprecedented challenges for analyzing and handling the resulting complex datasets. In this review, we focus on the increasing complexity of available Hi-C datasets, which parallels the adoption of novel protocol variants. We also review the complexity of the multiple data analysis steps required to preprocess Hi-C sequencing reads and extract biologically meaningful information. Finally, we discuss solutions for handling and visualizing such large genomics datasets.

File | Type | Size | Format
---|---|---|---
Pal_etal_BiophysicalReviews_pre-print.pdf | Author's original version submitted for publication (open access) | 319.81 kB | Adobe PDF
s12551-018-0489-1.pdf | Publisher's published version (open access) | 639.98 kB | Adobe PDF
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.
In case of copyright infringement, contact Supporto Iris.