Testing the significance of microorganism identification by mass spectrometry and proteome database search

Fernando J. Pineda, Jeffrey S. Lin, Catherine Fenselau, Plamen A. Demirev

Research output: Contribution to journal › Article

Abstract

We derive and validate a simple statistical model that predicts the distribution of false matches between peaks in matrix-assisted laser desorption/ionization mass spectrometry data and proteins in proteome databases. The model allows us to calculate the significance of previously reported microorganism identification results. In particular, for Δm = ±1.5 Da, we find that the computed significance levels are sufficient to demonstrate the ability to identify microorganisms, provided the number of candidate microorganisms is limited to roughly three Escherichia coli-like or roughly 10 Bacillus subtilis-like microorganisms (in the sense of having roughly the same number of proteins per unit-mass interval). We conclude that, given the cluttered and incomplete nature of the data, it is likely that neither simple ranking nor simple hypothesis testing will be sufficient for truly robust microorganism identification over a large number of candidate microorganisms.
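The role of protein density and candidate count in the abstract can be illustrated with a minimal sketch. It is not the model derived in the paper. It assumes that database protein masses are scattered like a Poisson process with a constant density (proteins per Da), that peaks match independently, and that candidate proteomes are independent. All function names and the numbers in the usage lines are hypothetical.

```python
import math


def false_match_probability(density_per_da: float, delta_m: float) -> float:
    """Chance that a single peak lies within +/- delta_m of at least one
    database protein, ASSUMING Poisson-distributed protein masses."""
    return 1.0 - math.exp(-2.0 * delta_m * density_per_da)


def p_value(k_matches: int, n_peaks: int, p_false: float) -> float:
    """Probability of k_matches or more chance matches among n_peaks,
    ASSUMING each peak matches independently (binomial upper tail)."""
    return sum(
        math.comb(n_peaks, k) * p_false**k * (1.0 - p_false) ** (n_peaks - k)
        for k in range(k_matches, n_peaks + 1)
    )


def corrected_p_value(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of n_candidates proteomes reaches the
    same score by chance, ASSUMING the candidates are independent."""
    return 1.0 - (1.0 - p_single) ** n_candidates


# Hypothetical numbers, chosen only to show how the pieces fit together.
p_false = false_match_probability(density_per_da=0.05, delta_m=1.5)
p_one = p_value(k_matches=10, n_peaks=20, p_false=p_false)
for n in (1, 3, 10, 100):
    print(n, corrected_p_value(p_one, n))
```

Under these assumptions a denser proteome raises the per-peak false-match probability, and the corrected p-value grows with the number of candidates, which is consistent with the abstract's point that significance holds only for a limited candidate set.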

Original language: English (US)
Pages (from-to): 3739-3744
Number of pages: 6
Journal: Analytical Chemistry
Volume: 72
Issue number: 16
DOI: 10.1021/ac000130q
State: Published - Aug 15 2000

Fingerprint

Proteome
Microorganisms
Mass spectrometry
Testing
Bacilli
Escherichia coli
Ionization
Desorption
Proteins
Lasers

ASJC Scopus subject areas

  • Analytical Chemistry

Cite this

Testing the significance of microorganism identification by mass spectrometry and proteome database search. / Pineda, Fernando J; Lin, Jeffrey S.; Fenselau, Catherine; Demirev, Plamen A.

In: Analytical Chemistry, Vol. 72, No. 16, 15.08.2000, p. 3739-3744.

@article{ff5d7bd3824f4d84a8b503b116531e87,
title = "Testing the significance of microorganism identification by mass spectrometry and proteome database search",
author = "Pineda, {Fernando J} and Lin, {Jeffrey S.} and Catherine Fenselau and Demirev, {Plamen A.}",
year = "2000",
month = "8",
day = "15",
doi = "10.1021/ac000130q",
language = "English (US)",
volume = "72",
pages = "3739--3744",
journal = "Analytical Chemistry",
issn = "0003-2700",
publisher = "American Chemical Society",
number = "16",

}

TY - JOUR

T1 - Testing the significance of microorganism identification by mass spectrometry and proteome database search

AU - Pineda, Fernando J

AU - Lin, Jeffrey S.

AU - Fenselau, Catherine

AU - Demirev, Plamen A.

PY - 2000/8/15

Y1 - 2000/8/15

UR - http://www.scopus.com/inward/record.url?scp=0034663240&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034663240&partnerID=8YFLogxK

U2 - 10.1021/ac000130q

DO - 10.1021/ac000130q

M3 - Article

VL - 72

SP - 3739

EP - 3744

JO - Analytical Chemistry

JF - Analytical Chemistry

SN - 0003-2700

IS - 16

ER -