Cataloging of Antimicrobial Resistance Genes and Biosynthetic Gene Clusters in the ATCC Global Priority Superbugs Collection


Background: Microorganisms are an important source of bioactive metabolites essential for drug development and industrial processes, but as antimicrobial-resistant (AMR) strains continue to spread, they also remain a major concern for global public health. Advances in DNA sequencing technologies and bioinformatics have enabled in-depth studies of these organisms by providing powerful tools for rapidly identifying genes essential to biosynthesis and resistance. To correctly identify these genes, researchers require access to rigorously validated reference genomes. However, publicly available genomic databases frequently contain sequence and metadata errors, are not easy to use, or lack the capability to search for and identify authenticated microbial strains containing relevant AMR markers and biosynthetic gene clusters (BGCs).

Methods: We are addressing these challenges by generating gold-standard reference genomes from our extensive portfolio of microbial strains and making them available through a user-friendly genome portal. Here, we sequenced the complete genomes of 148 novel multidrug-resistant bacterial strains from clinical cases and determined their minimum inhibitory concentrations against a panel of 20 antibiotics.

Results: The resulting data were used to create an atlas of AMR genes, including several extended-spectrum beta-lactamases, which are associated with higher-generation carbapenem and cephalosporin resistance mechanisms, as well as several BGCs, such as nonribosomal peptide synthetases, aryl polyenes, and β-lactones, which are present in pathogenic bacteria and are associated with antibiotic activity. By mining the genomic data, we were able to classify AMR genes and BGCs and connect them to pathways previously characterized in external antimicrobial and secondary metabolite databases. Known and unknown resistance mechanisms and compound types associated with specific pathways were then evaluated and organized by source, collection, taxonomy, gene name, cluster type, gene activity, and mutations. To ensure that these results meet the FAIR Data Principles (Findable, Accessible, Interoperable, Reusable), we are integrating them into the existing ATCC Genome Portal.

Conclusions: Overall, these reference-quality genomes and the associated antibiotic susceptibility testing data will enable researchers to identify and compare strains containing AMR markers and BGCs of interest, while maintaining a high level of confidence in the authenticity of the data and their connection to the physical isolates from which they were derived.


Ford Combs, MS

Bioinformatician, Sequencing and Bioinformatics Center, ATCC

Ford Combs joined ATCC's Sequencing and Bioinformatics Center in January 2021. As a bioinformatician, he works primarily on ATCC's internal sequencing projects, assembling and analyzing data and testing and improving bioinformatics pipelines. He holds an MS in bioinformatics and computational biology from George Mason University and is currently completing his PhD there. His dissertation is on protein secondary structure assignment using topology and machine learning.