HiC-TE: a computational pipeline for Hi-C data analysis to study the role of repeat family interactions in the genome 3D organization

Abstract

The role of repetitive DNA in the 3D organization of the interphase nucleus is a subject of intensive study. In studies of 3D nucleus organization, mutual contacts of various loci can be identified by Hi-C sequencing. Typical analyses bin read pairs by genomic location to reduce noise. We instead bin read pairs by repeat family, which allows similar conclusions to be drawn about repeat regions. To achieve this, we combined Hi-C data, reference genome data and tools for repeat analysis into a Nextflow pipeline that identifies and quantifies the contacts of specific repeat families. As output, the pipeline produces heatmaps showing contact frequency and circular diagrams visualizing repeat contact localization. Applying the pipeline to tomato data, we revealed preferential homotypic interactions of ribosomal DNA, centromeric satellites and some LTR retrotransposon families and, as expected, little contact between organellar and nuclear DNA elements. While the pipeline can be applied to any eukaryotic genome, results in plants provide better coverage, since the built-in TE-greedy-nester software detects only tandem repeats and LTR retrotransposons. Other repeats can be supplied via GFF3 files. This pipeline represents a novel and reproducible way to analyze the role of repetitive elements in the 3D organization of genomes.
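
The idea of binning by repeat family rather than by genomic position can be illustrated with a minimal Python sketch. This is not the pipeline's actual code (the pipeline itself is built in Nextflow). The function names, the tuple layout of the annotation and read-pair records, and the toy coordinates are all assumptions made for illustration. The sketch also assumes non-overlapping annotations with inclusive 1-based coordinates, as in GFF3, and it omits any normalization of the raw counts.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def build_index(annotations):
    """Index hypothetical (chrom, start, end, family) records per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end, family in annotations:
        by_chrom[chrom].append((start, end, family))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [iv[0] for iv in intervals]
        index[chrom] = (starts, intervals)
    return index


def family_at(index, chrom, pos):
    """Return the repeat family covering pos, or None if pos is unannotated."""
    if chrom not in index:
        return None
    starts, intervals = index[chrom]
    # Rightmost interval whose start is <= pos (intervals assumed non-overlapping).
    i = bisect_right(starts, pos) - 1
    if i >= 0:
        start, end, family = intervals[i]
        if start <= pos <= end:
            return family
    return None


def count_family_contacts(index, read_pairs):
    """Count contacts per unordered family pair; the bin is the family, not the locus."""
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in read_pairs:
        fam1 = family_at(index, chrom1, pos1)
        fam2 = family_at(index, chrom2, pos2)
        if fam1 is None or fam2 is None:
            continue  # at least one read end lies outside annotated repeats
        counts[tuple(sorted((fam1, fam2)))] += 1
    return counts


# Toy data, invented for illustration only.
annotations = [
    ("chr1", 100, 200, "rDNA"),
    ("chr1", 500, 900, "LTR_A"),
    ("chr2", 50, 150, "rDNA"),
]
read_pairs = [
    ("chr1", 150, "chr2", 100),  # rDNA with rDNA (homotypic)
    ("chr1", 600, "chr1", 120),  # LTR_A with rDNA (heterotypic)
    ("chr1", 300, "chr2", 60),   # first end unannotated, skipped
    ("chr1", 700, "chr1", 800),  # LTR_A with LTR_A (homotypic)
]
counts = count_family_contacts(build_index(annotations), read_pairs)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

A symmetric family-by-family matrix built from such counts is the kind of table a contact-frequency heatmap could be drawn from. Entries on the diagonal correspond to homotypic contacts.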

Publication
genes


Monika Cechova

