TY - JOUR
T1 - Alignment of whole genomes
AU - Delcher, Arthur L.
AU - Kasif, Simon
AU - Fleischmann, Robert D.
AU - Peterson, Jeremy
AU - White, Owen
AU - Salzberg, Steven L.
N1 - Funding Information: Thanks to Edward Arnold for graphical assistance. S.L.S. is supported in part by the National Human Genome Research Institute at NIH under Grant no. K01-HG00022-1. S.L.S. and A.L.D. are supported in part by the National Science Foundation under Grant no. IRI-9530462, and S.K. is supported by NSF Grant no. IRI-9529227. R.D.F. is supported in part by the National Institute of Allergy and Infectious Diseases at NIH under Grant no. R01-AI40125-01.
PY - 1999/6/1
Y1 - 1999/6/1
N2 - A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
AB - A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
UR - http://www.scopus.com/inward/record.url?scp=0033153375&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033153375&partnerID=8YFLogxK
U2 - 10.1093/nar/27.11.2369
DO - 10.1093/nar/27.11.2369
M3 - Article
C2 - 10325427
AN - SCOPUS:0033153375
SN - 0305-1048
VL - 27
SP - 2369
EP - 2376
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 11
ER -