A statistical framework for expression-based molecular classification in cancer

Giovanni Parmigiani, Elizabeth S. Garrett, Ramaswamy Anbazhagan, Edward Gabrielson

Research output: Contribution to journal › Article › peer-review


Genome-wide measurement of gene expression is a promising approach to the identification of subclasses of cancer that are currently not differentiable, but potentially biologically heterogeneous. This type of molecular classification gives hope for highly individualized and more effective prognosis and treatment of cancer. Statistically, the analysis of gene expression data from unclassified tumours is a complex hypothesis-generating activity, involving data exploration, modelling and expert elicitation. We propose a modelling framework that can be used to inform and organize the development of exploratory tools for classification. Our framework uses latent categories to provide both a statistical definition of differential expression and a precise, experiment-independent definition of a molecular profile. It also generates natural similarity measures for traditional clustering and gives probabilistic statements about the assignment of tumours to molecular profiles.
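
The idea of latent expression categories with probabilistic assignment can be illustrated with a minimal sketch. The abstract does not specify the model, so everything concrete here is an assumption made for illustration only: three latent categories (under-expressed, baseline, over-expressed), Gaussian component densities, the fixed parameter values, and the hypothetical function names `normal_pdf` and `latent_category_posteriors`. The sketch applies Bayes' rule to turn one gene's expression values across tumours into posterior probabilities over the latent categories.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z * z) / (sd * np.sqrt(2.0 * np.pi))


def latent_category_posteriors(x, weights, means, sds):
    """Posterior probability of each latent category for each value in x.

    Assumption: each latent category has a Gaussian density. This is an
    illustrative choice, not a model taken from the abstract.
    Returns an array of shape (len(x), number of categories).
    """
    x = np.asarray(x, dtype=float)
    # Weighted component densities: prior weight times likelihood.
    dens = np.stack(
        [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)],
        axis=-1,
    )
    # Bayes' rule: normalize so each row sums to one.
    return dens / dens.sum(axis=-1, keepdims=True)


# Hypothetical expression values for one gene across five tumours.
expression = np.array([-2.1, -0.2, 0.1, 0.3, 2.4])

# Hypothetical parameters for (under, baseline, over) categories.
post = latent_category_posteriors(
    expression,
    weights=(0.1, 0.8, 0.1),
    means=(-2.0, 0.0, 2.0),
    sds=(0.5, 0.5, 0.5),
)

# Most probable latent category per tumour, coded as -1, 0, +1.
labels = post.argmax(axis=1) - 1
```

Each row of `post` is a probabilistic statement about one tumour's latent category for this gene. In a full analysis the mixture parameters would be estimated from the data rather than fixed by hand.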

Original language: English (US)
Pages (from-to): 717-736
Number of pages: 20
Journal: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Issue number: 4
State: Published - 2002


Keywords

  • Microarray data analysis
  • Mixture distributions
  • Molecular classification of cancer

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

