
Pitfalls of bacterial pan-genome analysis approaches: a case study of Mycobacterium tuberculosis and two less clonal bacterial species

  • MG Marin
  • N Quinones-Olvera
  • C Wippel
  • M Behruznia
  • BM Jeffrey
  • M Harris
  • BC Mann
  • A Rosenthal
  • KR Jacobson
  • RM Warren
  • H Li
  • CJ Meehan
  • MR Farhat

Research output: Contribution to journal; A1: Peer-reviewed journal articles; peer-review

Abstract

SUMMARY: Pan-genome analysis is a fundamental tool for studying bacterial genome evolution; however, the variety of methods used to define and measure the pan-genome poses challenges to the interpretation and reliability of results. Using Mycobacterium tuberculosis, a clonally evolving bacterium with a small accessory genome, as a model system, we systematically evaluated sources of variability in pan-genome estimates. Our analysis revealed that differences in assembly type (short-read versus hybrid), annotation pipeline, and pan-genome software significantly impact predictions of core and accessory genome size. Extending our analysis to two additional bacterial species, Escherichia coli and Staphylococcus aureus, we observed consistent tool-dependent biases but species-specific patterns in pan-genome variability. Our findings highlight the importance of integrating nucleotide- and protein-level analyses to improve the reliability and reproducibility of pan-genome studies across diverse bacterial populations.

AVAILABILITY AND IMPLEMENTATION: Panqc is freely available under an MIT license at https://github.com/maxgmarin/panqc.

Original language: English
Article number: btaf219
Journal: Bioinformatics
Volume: 41
Issue number: 5
Number of pages: 13
ISSN: 1367-4803
DOIs
Publication status: Published - 6-May-2025

Keywords

  • Escherichia coli/genetics
  • Genome, Bacterial
  • Genomics/methods
  • Mycobacterium tuberculosis/genetics
  • Software
  • Staphylococcus aureus/genetics
