Polyester: Simulating RNA-seq datasets with differential transcript expression

Alyssa C. Frazee, Andrew E. Jaffe, Ben Langmead, Jeffrey T. Leek

Research output: Contribution to journal, Article, peer-review

Abstract

Motivation: Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data.

Results: Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester is a reasonable approximation to real RNA-seq data and standard differential expression workflows can recover differential expression set in the simulation by the user.

Original language: English (US)
Pages (from-to): 2778-2784
Number of pages: 7
Journal: Bioinformatics
Volume: 31
Issue number: 17
State: Published - Feb 6 2015

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
