TY - JOUR
T1 - Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics
AU - Myint, Leslie
AU - Kleensang, Andre
AU - Zhao, Liang
AU - Hartung, Thomas
AU - Hansen, Kasper D.
N1 - Funding Information:
Research reported in this publication was supported by National Institute of Environmental Health Sciences of the National Institutes of Health under Award Number R01ES020750 and the National Cancer Institute of the National Institutes of Health under Award Number U24CA180996. This research was supported by a Johns Hopkins Bloomberg School of Public Health Faculty Innovation Fund award. The EARLI study was funded by Grant R01ES016443 and Autism Speaks Grant 9502. Some EARLI participants were recruited with the assistance of the Interactive Autism Network (IAN) database at the Kennedy Krieger Institute, Baltimore MD.
Publisher Copyright:
© 2017 American Chemical Society.
PY - 2017/3/21
Y1 - 2017/3/21
N2 - As mass spectrometry-based metabolomics becomes more widely used in biomedical research, it is important to revisit existing data analysis paradigms. Existing data preprocessing efforts have largely focused on methods which start by extracting features separately from each sample, followed by a subsequent attempt to group features across samples to facilitate comparisons. We show that this preprocessing approach leads to unnecessary variability in peak quantifications that adversely impacts downstream analysis. We present a new method, bakedpi, for the preprocessing of both centroid and profile mode metabolomics data that relies on an intensity-weighted bivariate kernel density estimation on a pooling of all samples to detect peaks. This new method reduces this unnecessary quantification variability and increases power in downstream differential analysis.
AB - As mass spectrometry-based metabolomics becomes more widely used in biomedical research, it is important to revisit existing data analysis paradigms. Existing data preprocessing efforts have largely focused on methods which start by extracting features separately from each sample, followed by a subsequent attempt to group features across samples to facilitate comparisons. We show that this preprocessing approach leads to unnecessary variability in peak quantifications that adversely impacts downstream analysis. We present a new method, bakedpi, for the preprocessing of both centroid and profile mode metabolomics data that relies on an intensity-weighted bivariate kernel density estimation on a pooling of all samples to detect peaks. This new method reduces this unnecessary quantification variability and increases power in downstream differential analysis.
UR - http://www.scopus.com/inward/record.url?scp=85018765158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85018765158&partnerID=8YFLogxK
U2 - 10.1021/acs.analchem.6b04719
DO - 10.1021/acs.analchem.6b04719
M3 - Article
C2 - 28221771
AN - SCOPUS:85018765158
VL - 89
SP - 3517
EP - 3523
JO - Analytical Chemistry
JF - Analytical Chemistry
SN - 0003-2700
IS - 6
ER -