Alignment of whole genomes

Arthur L. Delcher; Simon Kasif; Robert D. Fleischmann; Jeremy Peterson; Owen White; Steven L. Salzberg

doi:10.1093/nar/27.11.2369

Research output: Contribution to journal › Article › peer-review

566 Scopus citations

Abstract

A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycoplasma tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.

Original language	English (US)
Pages (from-to)	2369-2376
Number of pages	8
Journal	Nucleic Acids Research
Volume	27
Issue number	11
DOIs	https://doi.org/10.1093/nar/27.11.2369
State	Published - Jun 1 1999
Externally published	Yes

ASJC Scopus subject areas

Genetics

Cite this

@article{08cd567fc9204d42adeca2f46f843049,
  title = "Alignment of whole genomes",
  abstract = "A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycoplasma tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.",
  author = "Delcher, {Arthur L.} and Kasif, Simon and Fleischmann, {Robert D.} and Peterson, Jeremy and White, Owen and Salzberg, {Steven L.}",
  note = "Funding Information: Thanks to Edward Arnold for graphical assistance. S.L.S. is supported in part by the National Human Genome Research Institute at NIH under Grant no. K01-HG00022-1. S.L.S. and A.L.D. are supported in part by the National Science Foundation under Grant no. IRI-9530462 and S.K. is supported by NSF IRI-9529227. R.D.F. is supported in part by National Institute of Allergy and Infectious Diseases at NIH under Grant no. R01-AI40125-01.",
  year = "1999",
  month = jun,
  day = "1",
  doi = "10.1093/nar/27.11.2369",
  language = "English (US)",
  volume = "27",
  pages = "2369--2376",
  journal = "Nucleic Acids Research",
  issn = "0305-1048",
  publisher = "Oxford University Press",
  number = "11",
}

TY - JOUR
T1 - Alignment of whole genomes
AU - Delcher, Arthur L.
AU - Kasif, Simon
AU - Fleischmann, Robert D.
AU - Peterson, Jeremy
AU - White, Owen
AU - Salzberg, Steven L.
N1 - Funding Information: Thanks to Edward Arnold for graphical assistance. S.L.S. is supported in part by the National Human Genome Research Institute at NIH under Grant no. K01-HG00022-1. S.L.S. and A.L.D. are supported in part by the National Science Foundation under Grant no. IRI-9530462 and S.K. is supported by NSF IRI-9529227. R.D.F. is supported in part by National Institute of Allergy and Infectious Diseases at NIH under Grant no. R01-AI40125-01.
PY - 1999/6/1
Y1 - 1999/6/1
AB - A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycoplasma tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
UR - http://www.scopus.com/inward/record.url?scp=0033153375&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033153375&partnerID=8YFLogxK
U2 - 10.1093/nar/27.11.2369
DO - 10.1093/nar/27.11.2369
M3 - Article
C2 - 10325427
AN - SCOPUS:0033153375
SN - 0305-1048
VL - 27
SP - 2369
EP - 2376
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 11
ER -
