PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF

Genevieve L. Stein-O'Brien, Jacob L. Carey, Wai Shing Lee, Michael Considine, Alexander Favorov, Emily Flam, Theresa Guo, Sijia Li, Luigi Marchionni, Thomas Sherman, Shawn Sivy, Daria Gaykalova, Ronald D. McKay, Michael F. Ochs, Carlo Colantuoni, Elana Fertig

Research output: Contribution to journal (Article)

Abstract

Non-negative Matrix Factorization (NMF) algorithms associate gene expression with biological processes (e.g. time-course dynamics or disease subtypes). Compared with univariate associations, the relative weights of NMF solutions can obscure biomarkers. Therefore, we developed a novel patternMarkers statistic to extract genes for biological validation and enhanced visualization of NMF results. Finding novel and unbiased gene markers with patternMarkers requires whole-genome data. Therefore, we also developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole genome Bayesian NMF using the sparse, MCMC algorithm, CoGAPS. Additionally, a manual version of the GWCoGAPS algorithm contains analytic and visualization tools including patternMatcher, a Shiny web application. The decomposition in the manual pipeline can be replaced with any NMF algorithm, for further generalization of the software. Using these tools, we find granular brain-region and cell-type specific signatures with corresponding biomarkers in GTEx data, illustrating GWCoGAPS and patternMarkers ascertainment of data-driven biomarkers from whole-genome data. Availability and Implementation: PatternMarkers & GWCoGAPS are in the CoGAPS Bioconductor package (3.5) under the GPL license.
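
The abstract does not spell out how the patternMarkers statistic is computed. As a rough illustration only, the sketch below assumes one plausible form: each gene's pattern weights are scaled by the gene's maximum weight, the gene is assigned to the pattern whose unit vector is nearest in Euclidean distance, and genes are ranked within each pattern by that distance. The function name `pattern_marker_sketch`, the toy weights, and the gene names are hypothetical; this is not the CoGAPS implementation.

```python
import math


def pattern_marker_sketch(amplitudes):
    """Assign each gene to the pattern nearest its max-scaled weight vector.

    amplitudes: dict mapping gene name -> list of non-negative pattern weights
                (one row of a hypothetical NMF amplitude matrix).
    Returns a dict mapping pattern index -> list of (gene, distance) pairs,
    sorted so the most pattern-specific genes (smallest distance) come first.
    """
    n_patterns = len(next(iter(amplitudes.values())))
    markers = {p: [] for p in range(n_patterns)}
    for gene, weights in amplitudes.items():
        peak = max(weights)
        if peak == 0:
            continue  # no signal in any pattern, so nothing to rank
        scaled = [w / peak for w in weights]
        # Distance from the scaled weights to each pattern's unit vector.
        distances = [
            math.sqrt(sum((s - (1.0 if q == p else 0.0)) ** 2
                          for q, s in enumerate(scaled)))
            for p in range(n_patterns)
        ]
        best = min(range(n_patterns), key=distances.__getitem__)
        markers[best].append((gene, distances[best]))
    for p in markers:
        markers[p].sort(key=lambda pair: pair[1])
    return markers


# Toy weights (assumed values, three patterns).
toy = {
    "geneA": [9.0, 0.0, 0.0],
    "geneB": [5.0, 4.0, 0.0],
    "geneC": [0.0, 1.0, 6.0],
    "geneD": [0.2, 3.0, 0.1],
}
result = pattern_marker_sketch(toy)
for pattern, genes in result.items():
    print(pattern, genes)
```

In this toy case geneB carries substantial weight in two patterns, so it ranks below geneA as a marker of the first pattern even though both have their largest weight there. This is the kind of distinction that ranking on raw relative weights alone could obscure.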

Original language: English (US)
Pages (from-to): 1892-1894
Number of pages: 3
Journal: Bioinformatics
Volume: 33
Issue number: 12
DOI: https://doi.org/10.1093/bioinformatics/btx058
State: Published - Jun 15 2017

Fingerprint

Matrix Factorization
Biomarkers
Factorization
Transcriptome
Data-driven
Genome
Genes
Visualization
MCMC Algorithm
Gene
Biological Phenomena
Non-negative Matrix Factorization
Licensure
Datasets
Web Application
Gene Expression
Univariate
Statistic
Signature
Software

ASJC Scopus subject areas

  • Statistics and Probability
  • Medicine (all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF. / Stein-O'Brien, Genevieve L.; Carey, Jacob L.; Lee, Wai Shing; Considine, Michael; Favorov, Alexander; Flam, Emily; Guo, Theresa; Li, Sijia; Marchionni, Luigi; Sherman, Thomas; Sivy, Shawn; Gaykalova, Daria; McKay, Ronald D.; Ochs, Michael F.; Colantuoni, Carlo; Fertig, Elana.

In: Bioinformatics, Vol. 33, No. 12, 15.06.2017, p. 1892-1894.

Stein-O'Brien, GL, Carey, JL, Lee, WS, Considine, M, Favorov, A, Flam, E, Guo, T, Li, S, Marchionni, L, Sherman, T, Sivy, S, Gaykalova, D, McKay, RD, Ochs, MF, Colantuoni, C & Fertig, E 2017, 'PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF', Bioinformatics, vol. 33, no. 12, pp. 1892-1894. https://doi.org/10.1093/bioinformatics/btx058

Stein-O'Brien, Genevieve L.; Carey, Jacob L.; Lee, Wai Shing; Considine, Michael; Favorov, Alexander; Flam, Emily; Guo, Theresa; Li, Sijia; Marchionni, Luigi; Sherman, Thomas; Sivy, Shawn; Gaykalova, Daria; McKay, Ronald D.; Ochs, Michael F.; Colantuoni, Carlo; Fertig, Elana. / PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF. In: Bioinformatics. 2017; Vol. 33, No. 12. pp. 1892-1894.

@article{a7eb94c3b6624b7eaf53bfc4062724d8,
title = "PatternMarkers \& GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF",
author = "Stein-O'Brien, {Genevieve L.} and Carey, {Jacob L.} and Lee, {Wai Shing} and Michael Considine and Alexander Favorov and Emily Flam and Theresa Guo and Sijia Li and Luigi Marchionni and Thomas Sherman and Shawn Sivy and Daria Gaykalova and McKay, {Ronald D.} and Ochs, {Michael F.} and Carlo Colantuoni and Elana Fertig",
year = "2017",
month = "6",
day = "15",
doi = "10.1093/bioinformatics/btx058",
language = "English (US)",
volume = "33",
pages = "1892--1894",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",
}

TY - JOUR

T1 - PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF

AU - Stein-O'Brien, Genevieve L.

AU - Carey, Jacob L.

AU - Lee, Wai Shing

AU - Considine, Michael

AU - Favorov, Alexander

AU - Flam, Emily

AU - Guo, Theresa

AU - Li, Sijia

AU - Marchionni, Luigi

AU - Sherman, Thomas

AU - Sivy, Shawn

AU - Gaykalova, Daria

AU - McKay, Ronald D.

AU - Ochs, Michael F.

AU - Colantuoni, Carlo

AU - Fertig, Elana

PY - 2017/6/15

Y1 - 2017/6/15

UR - http://www.scopus.com/inward/record.url?scp=85021369455&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021369455&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btx058

DO - 10.1093/bioinformatics/btx058

M3 - Article

C2 - 28174896

AN - SCOPUS:85021369455

VL - 33

SP - 1892

EP - 1894

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

ER -