Basecalling – How Good is Good Enough?

Abstract

With the release of the R10 flow cells in 2019 and the Dorado basecaller in 2023, Oxford Nanopore Technologies® long-read sequencing has shown a significant improvement in accuracy. Since then, studies have investigated basecalling runtime, output read quality, methylation detection accuracy, and nanopore-only assembly quality. However, these studies have been limited to small numbers of organisms and have not explored the general use cases that apply to many laboratories. Across as many diverse organism types as possible, how much better does Dorado perform than Guppy at each accuracy mode, relative to the time spent basecalling? Although the analysis is ongoing at the time of writing, the study described here will evaluate 283 Oxford Nanopore Technologies® GridION® sequencing samples across bacterial, fungal, and viral species in terms of basecalling speed and average read quality for both Guppy and Dorado, each run in the fast, hac, and sup accuracy modes. Further, both nanopore-only and Illumina®/ONT® hybrid assemblies from the six resulting data sets (two basecallers, three accuracy modes each) will be evaluated for completeness with respect to taxonomic domain, GC content, assembly N50, genome size, replicon count, small variants, and gene content. To date, this is the first such broad, multi-method study, a point underscored by the fact that all sequencing was performed with the same user, instrument, chemistry, flow cell, laboratory, and physical basecalling hardware. This study aims to guide sequencing laboratories on the effects of the options available.
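Several of the metrics named in the abstract (average read quality, GC content, and assembly N50) have simple standard definitions. The Python sketch below illustrates those definitions only; it is not the study's pipeline, and the function names `n50`, `gc_content`, and `mean_read_quality` are hypothetical. It assumes read quality is averaged in error-probability space rather than by taking the arithmetic mean of Phred scores, which is a common convention but not one confirmed by the abstract.

```python
import math


def n50(contig_lengths):
    """Return the assembly N50: the length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def gc_content(sequence):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_read_quality(phred_scores):
    """Return the mean quality of a read as a Phred score.

    Assumption: scores are converted to error probabilities, averaged,
    and converted back, so low-quality bases are not masked by
    high-quality ones.
    """
    if not phred_scores:
        return 0.0
    mean_error = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_error)


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
    print(gc_content("ATGC"))
    print(round(mean_read_quality([10, 30]), 2))
```

Averaging in error-probability space gives a lower value than the arithmetic mean of the Phred scores whenever base qualities vary, which matters when comparing accuracy modes on reads of uneven quality.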

Download the presentation to learn how ATCC's collection offers unparalleled insight into Nanopore sequencing efficiency.

Presenter

David Yarmosh, MS

Lead Bioinformatician, ATCC

David Yarmosh is a lead bioinformatician in ATCC’s Sequencing and Bioinformatics Center. He is a graduate of New York University’s Tandon School of Engineering. He has worked in large-scale data aggregation and analysis since 2013 and in microbial genomics, with a focus on biosurveillance R&D efforts, since 2016. David has led international training exercises in Peru and Senegal, sharing metagenomic analytical capabilities. His interests include genomics database construction, metadata collection, drug resistance mechanisms, bioinformatics standards, and machine learning. Since joining ATCC in 2020, David has worked extensively on SARS-CoV-2 classification, epidemiology, and genomics evaluation, including enhanced and uniform variant reporting. He has contributed more broadly to genomics reporting and analytical standardization, and he helped develop the podcast Behind the Biology, which he now hosts.

Reference-quality sequences

Through the ATCC Genome Portal, you can easily search, access, and analyze thousands of reference-quality genome sequences. Our optimized methodology is designed to achieve complete, circularized (when biologically appropriate), and contiguous genomic elements by using short-read (virology collection) and hybrid (bacteriology, mycology, and protistology collections) assembly techniques. We then take our workflow one step further by accompanying each stage of the process with rigorous quality control analyses. Only data that pass all quality control criteria are published to the ATCC Genome Portal. Visit the portal today to find the high-quality data you need for your research.
