Using the transcriptome to annotate the genome

Saurabh Saha, Andrew B. Sparks, Carlo Rago, Viatcheslav Akmaev, Clarence J. Wang, Bert Vogelstein, Kenneth W. Kinzler, Victor E. Velculescu

Research output: Contribution to journal › Article › peer-review

476 Scopus citations


A remaining challenge for the human genome project involves the identification and annotation of expressed genes. The public and private sequencing efforts have identified ∼15,000 sequences that meet stringent criteria for genes, such as correspondence with known genes from humans or other species, and have made another ∼10,000-20,000 gene predictions of lower confidence, supported by various types of in silico evidence, including homology studies, domain searches, and ab initio gene predictions. These computational methods have limitations, both because they are unable to identify a significant fraction of genes and exons and because they are unable to provide definitive evidence about whether a hypothetical gene is actually expressed. As the in silico approaches identified a smaller number of genes than anticipated, we wondered whether high-throughput experimental analyses could be used to provide evidence for the expression of hypothetical genes and to reveal previously undiscovered genes. We describe here the development of such a method, called long serial analysis of gene expression (LongSAGE), an adaptation of the original SAGE approach, that can be used to rapidly identify novel genes and exons.

Original language: English (US)
Pages (from-to): 508-512
Number of pages: 5
Journal: Nature Biotechnology
Issue number: 5
State: Published - 2002

ASJC Scopus subject areas

  • Biotechnology
  • Bioengineering
  • Biomedical Engineering
  • Applied Microbiology and Biotechnology
  • Molecular Medicine

