High-quality Genome Assemblies and Biosynthetic Gene Clusters Annotation from Laboratory Reference Fungal Strains

Poster

World Microbe Forum

Virtual Event

June 21, 2021

Abstract

Fungi are a diverse group of eukaryotes that play an active role in human health and disease and provide an essential source of bioactive compounds with potential therapeutic or industrial applications. While technical advancements and cost reductions for whole-genome sequencing have increased the availability of fungal reference genomes, there remains a significant gap in the availability of fungal genomics data when compared to other microbial groups. A comparative evaluation of fungal genome assemblies produced by ATCC with publicly available genomes for the same organisms shows that sequences in public databases are often incomplete, highly fragmented, and/or missing critical metadata.

In this study, we applied dual-platform whole-genome sequencing and hybrid assembly to produce high-fidelity, authenticated reference genomes for 74 fungi held within ATCC’s fungal collection. Our bioinformatics pipeline also included end-to-end whole-genome annotation designed to identify and catalog biosynthetic gene clusters (BGCs) linked to the production of secondary metabolites and antimicrobial compounds. This initial foray into fungal genomics covers 74 fungal strains spanning 54 species and 21 families. All of these results are available via the ATCC Genome Portal (https://genomes.atcc.org) for use by the research community.

Our sequencing and bioinformatics strategy consistently produced assemblies superior in contiguity and completeness to fungal genomes reported in public databases. Data quality is reported for each strain through sequencing and assembly metrics, which include coverage, Q-score, assembly level, length, number of scaffolds, N50, GC bias, ploidy, and functional gene annotations. Genomic annotation and data mining yielded known BGCs involved in the biosynthesis of primary and secondary metabolites, including those with predicted antibiotic potential. In addition, whole-genome comparison of the number and types of BGCs found in the genomes of closely related strains revealed a diverse and understudied repertoire of analog pathways and mutations.
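As an illustration of the contiguity metrics named above, the sketch below computes total length, number of scaffolds, and N50 from a list of scaffold lengths. It uses the standard definition of N50 (the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly length). The function names `n50` and `assembly_summary` are hypothetical and are not taken from ATCC's pipeline, which the abstract does not describe at this level.

```python
from typing import Dict, Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold lengths.

    Scaffolds are visited from longest to shortest; N50 is the length of
    the scaffold at which the running total first reaches at least half
    of the total assembly length.
    """
    ordered: List[int] = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid floating-point halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


def assembly_summary(lengths: Iterable[int]) -> Dict[str, int]:
    """Summarize an assembly (hypothetical helper, illustrative only)."""
    lengths = list(lengths)
    return {
        "total_length": sum(lengths),
        "num_scaffolds": len(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Toy scaffold lengths, not real data from any ATCC strain.
    print(assembly_summary([80, 70, 50, 40, 30, 20]))
```

For the same total length, a higher N50 together with fewer scaffolds indicates a more contiguous assembly, which is the sense in which assemblies are compared here.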

In the future, BGC annotations and predicted functional classifications will be integrated into the ATCC Genome Portal’s complement of existing genome references and metadata, further increasing the utility of this platform for the broader fungal functional genomics and natural products research community.

Download the poster to explore the generation of high-quality genome assemblies from fungal strains

Watch the poster presentation

Presenter

Ford Combs, MS

Bioinformatician, Sequencing and Bioinformatics Center, ATCC

Ford Combs is a new member of ATCC's Sequencing and Bioinformatics Center, having joined in January 2021. As a bioinformatician, he primarily works on ATCC's internal sequencing projects by either assembling and analyzing data or testing and improving bioinformatics pipelines. He holds an MS in bioinformatics and computational biology from George Mason University and is currently completing his PhD there. His dissertation is on protein secondary structure assignment using topology and machine learning.

Reference-quality sequences

Through the ATCC Genome Portal, you can easily search, access, and analyze hundreds of reference-quality genome sequences. Our optimized methodology is designed to achieve complete, circularized (when biologically appropriate), and contiguous genomic elements by using short-read (viruses) and hybrid (bacteria and fungi) assembly techniques. We take our workflow one step further by accompanying each stage of the process with rigorous quality control analyses to ensure our data are of the highest possible quality. Only data that pass all quality control criteria are published to the ATCC Genome Portal. Visit the portal today to find the high-quality data you need for your research.

Visit the portal