High-quality Genome Assemblies and Biosynthetic Gene Clusters Annotation from Laboratory Reference Fungal Strains
World Microbe Forum
Virtual Event
June 21, 2021
Abstract
Fungi are a diverse group of eukaryotes that play an active role in human health and disease and provide an essential source of bioactive compounds with potential therapeutic or industrial applications. While technical advancements and cost reductions for whole-genome sequencing have increased the availability of fungal reference genomes, there remains a significant gap in the availability of fungal genomics data when compared to other microbial groups. A comparative evaluation of fungal genome assemblies produced by ATCC with publicly available genomes for the same organisms shows that sequences in public databases are often incomplete, highly fragmented, and/or missing critical metadata.
In this study, we applied dual-platform whole-genome sequencing and hybrid assembly to produce high-fidelity, authenticated reference genomes for 74 fungi held within ATCC’s fungal collection. Our bioinformatics pipeline also included end-to-end whole-genome annotation designed to identify and catalog biosynthetic gene clusters (BGCs) linked to the production of secondary metabolites and antimicrobial compounds. This initial foray into fungal genomics covers 74 fungal strains spanning 54 species and 21 families. All of these results are available via the ATCC Genome Portal (https://genomes.atcc.org) for use by the research community.
Our sequencing and bioinformatics strategy consistently produced assemblies superior in contiguity and completeness to fungal genomes reported in public databases. Data quality is reported for each strain through sequencing quality metrics, which include coverage, Q-score, assembly level, length, number of scaffolds, N50, GC bias, ploidy, and functional gene annotations. Genomic annotation and data mining yielded known BGCs involved in biosynthesis of primary and secondary metabolites, including those with predicted antibiotic potential. In addition, whole-genome comparison of the number and types of BGCs found in the genomes of closely related strains revealed a diverse and understudied repertoire of analog pathways and mutations.
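Of the contiguity metrics listed above, N50 is the one most often used to compare assemblies: it is the scaffold length at which scaffolds of that length or longer account for at least half of the total assembly length. The following is a minimal sketch of that standard definition, not ATCC's pipeline code; the function name `n50` and the example scaffold lengths are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")

    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest scaffold down, accumulating length until
    # at least half of the assembly has been covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


# Hypothetical scaffold lengths (arbitrary units), not real assembly data.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))  # prints 70
```

A higher N50 at a similar total length and a lower scaffold count generally indicates a less fragmented assembly, which is why N50 is reported alongside length and number of scaffolds rather than on its own.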
In the future, BGC annotations and predicted functional classifications will be integrated into the ATCC Genome Portal’s complement of existing genome references and metadata, thereby further increasing the utility of this platform for the broader fungal functional genomics and natural products research community.
Download the poster to explore the generation of high-quality genome assemblies from fungal strains
Download
Watch the poster presentation
Presenter
Ford Combs, PhD
Bioinformatician, Sequencing and Bioinformatics Center, ATCC
Ford Combs is a new member of ATCC's Sequencing and Bioinformatics Center, having joined in January 2021. As a bioinformatician, he primarily works on ATCC's internal sequencing projects by either assembling and analyzing data or testing and improving bioinformatics pipelines. As the Audio Engineer on ATCC's Podcast, Behind the Biology, Ford performs sound design and audio editing. He holds an MS and PhD in bioinformatics and computational biology from George Mason University. His dissertation focused on topological and machine learning-based approaches to protein secondary structure assignment.
Reference-quality sequences
Through the ATCC Genome Portal, you can easily search, access, and analyze thousands of reference-quality genome sequences. Our optimized methodology is designed to achieve complete, circularized (when biologically appropriate), and contiguous genomic elements by using short-read (virology collection) and hybrid (bacteriology, mycology, and protistology collections) assembly techniques. We then take our workflow one step further by accompanying each stage of the process with rigorous quality control analyses that ensure the highest quality data. Only data that pass all quality control criteria are published to the ATCC Genome Portal. Visit the portal today to find the high-quality data you need for your research.
Visit the portal