An improved data analysis method is described for rapid identification of intact microorganisms from MALDI-TOF-MS data. The method makes no use of mass spectral fingerprints. Instead, a microorganism database is automatically generated that contains biomarker masses derived from ribosomal protein sequences and a model of N-terminal Met loss. We quantitatively validate the method via a blind study that seeks to identify microorganisms with known ribosomal protein sequences. To test the specificity of the method, we also include in the database microorganisms with incompletely known sets of ribosomal proteins. With an optimal MALDI protocol, and at the 95% confidence level, microorganisms represented in the database with 20 or more biomarkers (i.e., those with completely or nearly completely sequenced genomes) are correctly identified from their spectra 100% of the time, with no incorrect identifications. Microorganisms with seven or fewer biomarkers (i.e., those with incompletely sequenced genomes) are either not identified or misidentified. Robustness with respect to variations in sample preparation protocol and mass analysis protocol is demonstrated by collecting data with two different matrixes and under two different ion-mode configurations. Statistical analysis suggests that, even without further improvement, the method described here would successfully scale up to databases containing roughly 1000 microorganisms. The results demonstrate that microorganism identification based on proteome data and modeling can perform as well as methods based on mass spectral fingerprinting.
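As a rough illustration of how a biomarker mass might be derived from a ribosomal protein sequence, the following minimal Python sketch computes the average mass of a protein and applies a simple N-terminal Met loss rule. The abstract does not specify the model used. The rule here (Met is removed when the second residue is Gly, Ala, Ser, Pro, Val, Thr, or Cys) is an assumption based on the commonly cited specificity of methionine aminopeptidase, and the names `average_mass` and `biomarker_mass` are hypothetical. The residue masses are standard average values, not values taken from the paper.

```python
# Standard average residue masses in daltons (not taken from the paper).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain accounts for the free N- and C-termini

# ASSUMPTION: penultimate residues that permit removal of the initiator Met.
# This is a common rule of thumb, not necessarily the model used in the paper.
SMALL_RESIDUES = set("GASPVTC")


def average_mass(sequence):
    """Average neutral mass (Da) of an unmodified protein sequence."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER


def biomarker_mass(sequence):
    """Average mass after applying the assumed N-terminal Met loss rule."""
    if len(sequence) > 1 and sequence[0] == "M" and sequence[1] in SMALL_RESIDUES:
        sequence = sequence[1:]  # drop the initiator Met
    return average_mass(sequence)


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not real ribosomal proteins.
    for seq in ("MAKQTR", "MKAQTR"):
        print(seq, round(biomarker_mass(seq), 2))
```

A database entry for an organism would then be the set of such masses over its known ribosomal proteins. Post-translational modifications other than Met loss are ignored in this sketch.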
ASJC Scopus subject areas
- Analytical Chemistry