A multi-sample approach increases the accuracy of transcript assembly

Li Song, Sarven Sabunciyan, Guangyu Yang, Liliana Florea

Research output: Contribution to journal › Article

Abstract

Transcript assembly from RNA-seq reads is a critical step in gene expression and subsequent functional analyses. Here we present PsiCLASS, an accurate and efficient transcript assembler based on an approach that simultaneously analyzes multiple RNA-seq samples. PsiCLASS combines mixture statistical models for exonic feature selection across multiple samples with splice graph-based dynamic programming algorithms and a weighted voting scheme for transcript selection. PsiCLASS achieves a significantly better sensitivity-precision tradeoff, and renders precision up to 2-3 fold higher than the StringTie system and Scallop plus TACO, the two best current approaches. PsiCLASS is efficient and scalable, assembling 667 GEUVADIS samples in 9 h, and has robust accuracy with large numbers of samples.

Original language: English (US)
Article number: 5000
Journal: Nature Communications
Volume: 10
Issue number: 1
DOIs
State: Published - Dec 1 2019

ASJC Scopus subject areas

  • Chemistry (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Physics and Astronomy (all)
