Monika Cechova

Biography

I am interested in the most complex parts of the human genome. I believe complete telomere-to-telomere (T2T) assemblies are the future of genomics, and that future is happening now.

Interests

  • Long reads and complete T2T genomes
  • Y chromosome, centromeres, telomeres, acrocentric chromosomes
  • Satellite Biology and Heterochromatin
  • Aneuploidies

Education

  • PhD Major in Biology, Minor in Statistics, 2020

    Penn State, USA

  • MS in Bioinformatics, 2013

    Masaryk University, Brno

  • BS in Applied Informatics, 2011

    Masaryk University, Brno

Skills

  • R
  • Statistics
  • Python
  • Nanopore: 6 years
  • PacBio: 8 years
  • Illumina: 12 years

Experience

Assistant Professor

Faculty of Informatics, Masaryk University, Brno

Feb 2025 – Present, Brno
  • Complex parts of human genomes: Y chromosome, acrocentric chromosomes, centromeres, telomeres
  • Aneuploidies, machine learning for genomics

Postdoc

Department of Biomolecular Engineering, University of California, Santa Cruz

Aug 2022 – Nov 2024, Santa Cruz
  • T2T and HPRC consortia
  • Gaining a new understanding of satellite DNA

Postdoc

Faculty of Informatics, Masaryk University

Jun 2021 – Jul 2022, Brno
  • Developing new algorithms, tools, and methods for bioinformatics
  • Gaining a new understanding of repetitive DNA

Postdoc

Institute of Animal Physiology and Genetics CAS, v. v. i.; Central European Institute of Technology; Department of Genetics and Reproduction, Veterinary Research Institute

Mar 2020 – Dec 2020, Brno
  • Early Embryonic Development
  • Spindle Assembly Checkpoint

Graduate Student

Penn State

Aug 2013 – May 2020, State College, PA, USA
  • Studied the driving forces of Y chromosome evolution in great apes (Cechova, Vegesna et al., 2020)
  • Explored the evolution of heterochromatin in great apes (Cechova et al., 2019)
  • Characterized genome-wide effects of non-B DNA on polymerization speed and error rate (Guiblet et al., 2018)
  • Developed algorithms for Y chromosome assembly (Rangavittal et al., 2018)
  • Characterized genes, repeats, and palindromes on the gorilla Y chromosome (Tomaszkiewicz, Rangavittal, Cechova et al., 2016)
  • Developed hybrid genome assembly algorithms for combining short and long reads (Tomaszkiewicz, Rangavittal, Cechova et al., 2016)

Bioinformatician

Institute of Biophysics, Academy of Sciences of the Czech Republic

Apr 2011 – Aug 2013, Brno
  • Developed pipelines for the detection of sex-linked genes from NGS data (Cechova et al., 2015)
  • Characterized nupts and numts in six plant species, including their age, length, distribution, and the consequences of their insertions (Cechova et al., 2013)
  • Explored microsatellite-TE associations and microsatellite periodicity (Kejnovsky et al., 2013)

Awards

The program of support for promising postdoctoral students (PPLZ)

Mohnkern scholarship

Hill-Hill Memorial Fund Fellowship

Troxell Memorial Scholarship in Biology

CBIOS (NIH T32 Predoctoral Training Grant)

Braddock Scholarship

Best Poster Award, Genetics Conference, Lednice

Recent Publications

The complete sequence and comparative analysis of ape sex chromosomes

Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.

Accurate sequencing of DNA motifs able to form alternative (non-B) structures

Approximately 13% of the human genome at certain motifs have the potential to form noncanonical (non-B) DNA structures (e.g., G-quadruplexes, cruciforms, and Z-DNA), which regulate many cellular processes but also affect the activity of polymerases and helicases. Because sequencing technologies use these enzymes, they might possess increased errors at non-B structures. To evaluate this, we analyzed error rates, read depth, and base quality of Illumina, Pacific Biosciences (PacBio) HiFi, and Oxford Nanopore Technologies (ONT) sequencing at non-B motifs. All technologies showed altered sequencing success for most non-B motif types, although this could be owing to several factors, including structure formation, biased GC content, and the presence of homopolymers. Single-nucleotide mismatch errors had low biases in HiFi and ONT for all non-B motif types but were increased for G-quadruplexes and Z-DNA in all three technologies. Deletion errors were increased for all non-B types but Z-DNA in Illumina and HiFi, as well as only for G-quadruplexes in ONT. Insertion errors for non-B motifs were highly, moderately, and slightly elevated in Illumina, HiFi, and ONT, respectively. Additionally, we developed a probabilistic approach to determine the number of false positives at non-B motifs depending on sample size and variant frequency, and applied it to publicly available data sets (1000 Genomes, Simons Genome Diversity Project, and gnomAD). We conclude that elevated sequencing errors at non-B DNA motifs should be considered in low-read-depth studies (single-cell, ancient DNA, and pooled-sample population sequencing) and in scoring rare variants. Combining technologies should maximize sequencing accuracy in future studies of non-B DNA.

HiC-TE: a computational pipeline for HiC data analysis to study the role of repeat family interactions in the genome 3D organization

The role of repetitive DNA in the 3D organization of the interphase nucleus is a subject of intensive study. In studies of 3D nucleus organization, mutual contacts of various loci can be identified by HiC sequencing. Typical analyses use binning of read pairs by location to reduce noise. We use binning by repeat families instead to make similar conclusions about repeat regions. To achieve this, we combined HiC data, reference genome data and tools for repeat analysis into a Nextflow pipeline identifying and quantifying the contacts of specific repeat families. As an output, our pipeline produces heatmaps showing contact frequency and circular diagrams visualizing repeat contact localization. Using our pipeline with tomato data, we revealed the preferential homotypic interactions of ribosomal DNA, centromeric satellites and some LTR retrotransposon families and, as expected, little contact between organellar and nuclear DNA elements. While the pipeline can be applied to any eukaryotic genome, results in plants provide better coverage, since the built-in TE-greedy-nester software only detects tandems and LTR retrotransposons. Other repeats can be fed via GFF3 files. This pipeline represents a novel and reproducible way to analyze the role of repetitive elements in the 3D organization of genomes.

The complete sequence of a human Y chromosome

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1,2,3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.

Contact

  • cechova.biomonika@gmail.com
  • Brno, Czechia 61500