Monika Cechova


Monika Cechova received her Ph.D. in Biology from Penn State, where she studied Y chromosome evolution in great apes and applied long reads to explore great ape heterochromatin under the supervision of Prof. Kateryna Makova. She is interested in sex chromosomes, satellite biology, non-B DNA, and reproductive biology, as well as advances in long-read sequencing and assembly. Her background in computer science positions her research at the interface of biology and bioinformatics.


  • Satellite Biology and Heterochromatin
  • Sex Chromosomes
  • Long reads and complete genomes
  • Early Embryonic Development
  • Reproductive Biology


  • PhD Major in Biology, Minor in Statistics, 2020

    Penn State, USA

  • MS in Bioinformatics, 2013

    Masaryk University, Brno

  • BS in Applied Informatics, 2011

    Masaryk University, Brno










Department of Biomolecular Engineering, University of California, Santa Cruz

Aug 2022 – Present Santa Cruz
  • T2T and HPRC consortia
  • Gaining new understanding of the satellite DNA


Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University

Jun 2021 – Jul 2022 Brno
  • Developing new algorithms, tools, and methods for bioinformatics
  • Gaining new understanding of the repetitive DNA


Institute of Animal Physiology and Genetics CAS, v. v. i. Central European Institute of Technology; Department of Genetics and Reproduction, Veterinary Research Institute

Mar 2020 – Dec 2020 Brno
  • Early Embryonic Development
  • Spindle Assembly Checkpoint

Graduate Student

Penn State

Aug 2013 – May 2020 State College, PA, USA
  • Studied driving forces of Y chromosome evolution in great apes (Cechova, Vegesna et al. 2020)
  • Explored evolution of heterochromatin in great apes (Cechova et al., 2019)
  • Characterized genome-wide effects of non-B DNA on polymerization speed and error rate (Guiblet et al., 2018)
  • Developed algorithms for the Y chromosome assembly (Rangavittal et al., 2018)
  • Characterized genes, repeats and palindromes on gorilla Y chromosome (Tomaszkiewicz, Rangavittal, Cechova et al. 2016)
  • Developed hybrid genome assembly algorithms for combining short and long reads (Tomaszkiewicz, Rangavittal, Cechova et al. 2016)


Institute of Biophysics, Academy of Sciences of the Czech Republic

Apr 2011 – Aug 2013 Brno
  • Developed pipelines for detection of sex-linked genes from NGS data (Cechova et al., 2015)
  • Characterized nupts and numts in 6 plant species, their age, length, distribution, consequences of insertions, etc. (Cechova et al., 2013)
  • Explored microsatellites-TEs association and microsatellite periodicity (Kejnovsky et al., 2013)


The program of support for promising postdoctoral students (PPLZ)

Mohnkern scholarship

Hill-Hill Memorial Fund Fellowship

Troxell Memorial Scholarship in Biology

CBIOS (NIH T32 Predoctoral Training Grant)

Braddock Scholarship

Best Poster Award, Genetics Conference, Lednice

Recent Publications


HiC TE, a computational pipeline for HiC data analysis to study the role of repeat family interactions in the genome 3D organization

The role of repetitive DNA in the 3D organization of the interphase nucleus is a subject of intensive study. In studies of 3D nucleus organization, mutual contacts between loci can be identified by HiC sequencing. Typical analyses bin read pairs by genomic location to reduce noise. We instead bin by repeat family, which lets us draw analogous conclusions about repeat regions. To achieve this, we combined HiC data, reference genome data and tools for repeat analysis into a Nextflow pipeline that identifies and quantifies the contacts of specific repeat families. As output, the pipeline produces heatmaps showing contact frequency and circular diagrams visualizing where repeat contacts are located. Applying the pipeline to tomato data, we revealed preferential homotypic interactions of ribosomal DNA, centromeric satellites and some LTR retrotransposon families and, as expected, little contact between organellar and nuclear DNA elements. While the pipeline can be applied to any eukaryotic genome, results in plants provide better coverage, since the built-in TE greedy nester software only detects tandems and LTR retrotransposons. Other repeats can be supplied via GFF3 files. This pipeline represents a novel and reproducible way to analyze the role of repetitive elements in the 3D organization of genomes.
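The idea of binning by repeat family rather than by genomic location can be illustrated with a minimal Python sketch. It is not the pipeline's code: the function names, the tuple-based input format, and the toy annotations are all assumptions made for illustration, and the repeat intervals are assumed to be half-open and non-overlapping within a chromosome.

```python
from bisect import bisect_right
from collections import Counter


def build_index(annotations):
    """Group (chrom, start, end, family) records by chromosome, sorted by start.

    Hypothetical input format: half-open intervals that do not overlap
    within a chromosome.
    """
    index = {}
    for chrom, start, end, family in sorted(annotations):
        starts, intervals = index.setdefault(chrom, ([], []))
        starts.append(start)
        intervals.append((start, end, family))
    return index


def family_at(index, chrom, pos):
    """Return the repeat family covering pos, or None if pos is not in a repeat."""
    if chrom not in index:
        return None
    starts, intervals = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    if i >= 0:
        start, end, family = intervals[i]
        if start <= pos < end:
            return family
    return None


def count_family_contacts(index, read_pairs):
    """Count contacts per unordered pair of repeat families.

    read_pairs holds ((chrom_a, pos_a), (chrom_b, pos_b)) tuples; pairs with
    either end outside an annotated repeat are skipped.
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        fam_a = family_at(index, chrom_a, pos_a)
        fam_b = family_at(index, chrom_b, pos_b)
        if fam_a is None or fam_b is None:
            continue
        counts[tuple(sorted((fam_a, fam_b)))] += 1
    return counts


# Toy data, invented for this example only.
annotations = [
    ("chr1", 100, 200, "rDNA"),
    ("chr1", 500, 600, "CEN_sat"),
    ("chr2", 50, 150, "rDNA"),
]
read_pairs = [
    (("chr1", 120), ("chr2", 60)),   # rDNA with rDNA (homotypic)
    (("chr1", 510), ("chr1", 150)),  # CEN_sat with rDNA
    (("chr1", 300), ("chr2", 60)),   # one end outside any repeat: skipped
]
contacts = count_family_contacts(build_index(annotations), read_pairs)
print(contacts[("rDNA", "rDNA")], contacts[("CEN_sat", "rDNA")])
```

The resulting family-by-family counts are the kind of matrix that could be drawn as a contact heatmap; a real analysis would also need normalization for family abundance, which this sketch omits.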

Probably Correct -- Rescuing Repeats with Short and Long Reads

Ever since the introduction of high-throughput sequencing following the Human Genome Project, assembling short reads into a reference of sufficient quality has posed a significant problem, as a large portion of the human genome (an estimated 50–69%) is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters that determine whether a read is multi-mapping are read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from telomere to telomere. Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, which impair both the accuracy and the speed of aligners and assemblers. Here, I review the proposed and implemented solutions to repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect a shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
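The dependence of multi-mapping on read length can be shown with a toy Python sketch. It is an illustration under strong simplifying assumptions, not a method from the review: reads are error-free exact substrings, only the forward strand is considered, and the function name and example sequence are invented.

```python
from collections import Counter


def multimapping_fraction(genome, read_length):
    """Fraction of read start positions whose read occurs more than once.

    Assumes error-free reads taken from the forward strand only, so a read
    is multi-mapping exactly when its sequence appears at two or more
    positions in the genome string.
    """
    n_reads = len(genome) - read_length + 1
    if read_length <= 0 or n_reads <= 0:
        raise ValueError("read_length must be between 1 and len(genome)")
    occurrences = Counter(
        genome[i:i + read_length] for i in range(n_reads)
    )
    multi = sum(count for count in occurrences.values() if count > 1)
    return multi / n_reads


# Invented toy sequence in which "ACG" occurs twice.
toy_genome = "ACGTACGA"
for length in (3, 4):
    print(length, multimapping_fraction(toy_genome, length))
```

In this toy sequence, reads of length 3 that fall on the repeated "ACG" cannot be placed uniquely, while every read of length 4 can. Real genomes add sequencing errors, reverse complements, and near-identical rather than exact repeats, all of which this sketch ignores.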


  • Brno, Czechia 61500