Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Cole Trapnell, Brian A. Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J. Van Baren, Steven L. Salzberg, Barbara J. Wold, Lior Pachter

Research output: Contribution to journalArticlepeer-review

Abstract

High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed <430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Original languageEnglish (US)
Pages (from-to)511-515
Number of pages5
JournalNature biotechnology
Volume28
Issue number5
DOIs
StatePublished - May 2010
Externally publishedYes

ASJC Scopus subject areas

  • Biotechnology
  • Bioengineering
  • Applied Microbiology and Biotechnology
  • Molecular Medicine
  • Biomedical Engineering

Fingerprint

Dive into the research topics of 'Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation'. Together they form a unique fingerprint.

Cite this