Complex Sources of Variation in Tissue Expression Data: Analysis of the GTEx Lung Transcriptome

Matthew N. McCall, Peter B. Illei, Marc K. Halushka

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

The sources of gene expression variability in human tissues are thought to be a complex interplay of technical, compositional, and disease-related factors. To better understand these contributions, we investigated expression variability in a relatively homogeneous tissue expression dataset from the Genotype-Tissue Expression (GTEx) resource. In addition to identifying technical sources, such as sequencing date and post-mortem interval, we also identified several biological sources of variation. An in-depth analysis of the 175 genes with the greatest variation among 133 lung tissue samples identified five distinct clusters of highly correlated genes. One large cluster included surfactant genes (SFTPA1, SFTPA2, and SFTPC), which are expressed exclusively in type II pneumocytes, cells that proliferate in ventilator-associated lung injury. High surfactant expression was strongly associated with death on a ventilator and type II pneumocyte hyperplasia. A second large cluster included dynein (DNAH9 and DNAH12) and mucin (MUC5B and MUC16) genes, which are exclusive to the respiratory epithelium and goblet cells of bronchial structures. This indicates heterogeneous bronchiole sampling due to the harvesting location in the lung. A small cluster included acute-phase reactant genes (SAA1, SAA2, and SAA2–SAA4). The final two small clusters were technical and gender-related. To summarize, in a collection of normal lung samples, we found that tissue heterogeneity caused by harvesting location (medial or lateral lung) and late therapeutic intervention (mechanical ventilation) were major contributors to expression variation. These unexpected sources of variation were the result of altered cell ratios in the tissue samples, an underappreciated source of expression variation.
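
The two analysis steps named in the abstract, ranking genes by variance and then grouping highly correlated genes, can be illustrated with a minimal sketch. The abstract does not state which clustering method was used, so this example substitutes a simple stand-in: connected components of a thresholded Pearson correlation graph. The function names, the `min_r` threshold, and the toy expression values are all hypothetical and are not taken from the paper or from GTEx.

```python
import math
from statistics import mean, variance


def top_variable_genes(expr, n_top):
    """Return the n_top gene names with the largest sample variance.

    expr maps gene name -> list of expression values (one per sample).
    """
    ranked = sorted(expr, key=lambda g: variance(expr[g]), reverse=True)
    return ranked[:n_top]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def correlated_clusters(expr, genes, min_r=0.8):
    """Group genes into connected components of the graph r >= min_r.

    This is an illustrative stand-in, not the authors' clustering method;
    min_r is an assumed threshold.
    """
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(expr[a], expr[b]) >= min_r:
                parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Toy, made-up values for six samples; not GTEx data.
toy_expr = {
    "SFTPA1": [1.0, 2.0, 8.0, 9.0, 1.5, 7.5],
    "SFTPC": [1.2, 2.1, 7.9, 9.3, 1.4, 7.7],
    "MUC5B": [9.0, 1.0, 2.0, 8.5, 9.5, 1.5],
    "DNAH9": [8.8, 1.2, 2.2, 8.1, 9.9, 1.1],
    "FLAT": [5.0, 5.1, 5.0, 4.9, 5.0, 5.1],
}
selected = top_variable_genes(toy_expr, n_top=4)
clusters = correlated_clusters(toy_expr, selected)
```

On the toy input, the near-constant gene is dropped by the variance filter and the remaining genes fall into a surfactant-like pair and a mucin/dynein-like pair. On data shaped like the study's, the analogous call would use `n_top=175` over 133 samples.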

Original language: English (US)
Pages (from-to): 624-635
Number of pages: 12
Journal: American Journal of Human Genetics
Volume: 99
Issue number: 3
State: Published - Sep 1 2016

Keywords

  • GTEx
  • gene expression
  • heterogeneity
  • lung
  • type II pneumocytes
  • variance

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
