Genome sequence assembly

Algorithms and issues

Mihai Pop, Steven L Salzberg, Martin Shumway

Research output: Contribution to journal (Article)

Abstract

Ultimately, genome sequencing seeks to provide an organism's complete DNA sequence. Automation of DNA sequencing allowed scientists to decode entire genomes and gave birth to genomics, the analytic and comparative study of genomes. Although genomes can include billions of nucleotides, the chemical reactions researchers use to decode the DNA are accurate for only about 600 to 700 nucleotides at a time. The DNA reads that sequencing produces must then be assembled into a complete picture of the genome. Errors and certain DNA characteristics complicate assembly. Resolving these problems entails an additional and costly finishing phase that involves extensive human intervention. Assembly programs can dramatically reduce this cost by taking into account additional information obtained during finishing. Algorithms that can assemble millions of DNA fragments into gene sequences underlie the current revolution in biotechnology, helping researchers build the growing database of complete genomes.

Original language: English (US)
Journal: Computer
Volume: 35
Issue number: 7
DOI: 10.1109/MC.2002.1016901
State: Published - Jul 2002
Externally published: Yes

Fingerprint

Genes
DNA
Nucleotides
Program assemblers
DNA sequences
Biotechnology
Chemical reactions
Automation
Costs

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Genome sequence assembly: Algorithms and issues. / Pop, Mihai; Salzberg, Steven L; Shumway, Martin. In: Computer, Vol. 35, No. 7, 07.2002.

Pop, Mihai; Salzberg, Steven L; Shumway, Martin. / Genome sequence assembly: Algorithms and issues. In: Computer. 2002; Vol. 35, No. 7.
@article{1c4eae57a29d4964b79122ce5692a6f0,
title = "Genome sequence assembly: Algorithms and issues",
abstract = "Ultimately, genome sequencing seeks to provide an organism's complete DNA sequence. Automation of DNA sequencing allowed scientists to decode entire genomes and gave birth to genomics, the analytic and comparative study of genomes. Although genomes can include billions of nucleotides, the chemical reactions researchers use to decode the DNA are accurate for only about 600 to 700 nucleotides at a time. The DNA reads that sequencing produces must then be assembled into a complete picture of the genome. Errors and certain DNA characteristics complicate assembly. Resolving these problems entails an additional and costly finishing phase that involves extensive human intervention. Assembly programs can dramatically reduce this cost by taking into account additional information obtained during finishing. Algorithms that can assemble millions of DNA fragments into gene sequences underlie the current revolution in biotechnology, helping researchers build the growing database of complete genomes.",
author = "Mihai Pop and Salzberg, {Steven L} and Martin Shumway",
year = "2002",
month = "7",
doi = "10.1109/MC.2002.1016901",
language = "English (US)",
volume = "35",
journal = "Computer",
issn = "0018-9162",
publisher = "IEEE Computer Society",
number = "7",
}

TY - JOUR

T1 - Genome sequence assembly

T2 - Algorithms and issues

AU - Pop, Mihai

AU - Salzberg, Steven L

AU - Shumway, Martin

PY - 2002/7

Y1 - 2002/7

N2 - Ultimately, genome sequencing seeks to provide an organism's complete DNA sequence. Automation of DNA sequencing allowed scientists to decode entire genomes and gave birth to genomics, the analytic and comparative study of genomes. Although genomes can include billions of nucleotides, the chemical reactions researchers use to decode the DNA are accurate for only about 600 to 700 nucleotides at a time. The DNA reads that sequencing produces must then be assembled into a complete picture of the genome. Errors and certain DNA characteristics complicate assembly. Resolving these problems entails an additional and costly finishing phase that involves extensive human intervention. Assembly programs can dramatically reduce this cost by taking into account additional information obtained during finishing. Algorithms that can assemble millions of DNA fragments into gene sequences underlie the current revolution in biotechnology, helping researchers build the growing database of complete genomes.

UR - http://www.scopus.com/inward/record.url?scp=0036644865&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036644865&partnerID=8YFLogxK

U2 - 10.1109/MC.2002.1016901

DO - 10.1109/MC.2002.1016901

M3 - Article

VL - 35

JO - Computer

JF - Computer

SN - 0018-9162

IS - 7

ER -