Effective cluster-based seed design for cross-species sequence comparisons

Leming Zhou, Ingrid Mihai, Liliana D Florea

Research output: Contribution to journal › Article

Abstract

To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve the accuracy of alignment, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for groups instead of individual comparisons. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed.
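To illustrate what a spaced seed is, the following minimal Python sketch finds seed hits between two sequences. It is an illustration only, not the authors' method: the function names `spaced_key` and `seed_hits`, the toy sequences, and the seed pattern `1101` are all hypothetical. A seed is written as a string of `1` (position must match) and `0` (position is ignored).

```python
from collections import defaultdict


def spaced_key(seq, start, seed):
    """Return the characters of seq at the '1' positions of seed, from start."""
    return "".join(seq[start + k] for k, c in enumerate(seed) if c == "1")


def seed_hits(query, target, seed):
    """Return sorted (query_pos, target_pos) pairs where the seed matches.

    A hit means every '1' position of the seed sees identical characters
    in both sequences; '0' positions are free to differ.
    """
    span = len(seed)

    # Index every window of the target by its spaced key.
    index = defaultdict(list)
    for j in range(len(target) - span + 1):
        index[spaced_key(target, j, seed)].append(j)

    # Look up every window of the query in that index.
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(spaced_key(query, i, seed), []):
            hits.append((i, j))
    return sorted(hits)


if __name__ == "__main__":
    query, target = "ACGTAC", "ACTTAC"      # toy sequences, one mismatch
    print(seed_hits(query, target, "1101"))  # spaced seed, weight 3
    print(seed_hits(query, target, "1111"))  # contiguous seed, weight 4
```

In this toy case the spaced seed `1101` reports a hit at offset 0 because its ignored position falls on the mismatch, while the contiguous seed `1111` of the same span reports none. Which seed pattern works best depends on the pair of sequences being compared, which is why the abstract speaks of optimizing seeds per comparison or per group of comparisons.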

Original language: English (US)
Pages (from-to): 2926-2927
Number of pages: 2
Journal: Bioinformatics
Volume: 24
Issue number: 24
DOI: 10.1093/bioinformatics/btn547
State: Published - Dec 2008
Externally published: Yes

Fingerprint

Sequence Comparison
Seed
Seeds
Genes
Genome
Pairwise Comparisons
Alignment
Gene
Design
Alternatives

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Effective cluster-based seed design for cross-species sequence comparisons. / Zhou, Leming; Mihai, Ingrid; Florea, Liliana D.

In: Bioinformatics, Vol. 24, No. 24, 12.2008, p. 2926-2927.

Zhou, Leming ; Mihai, Ingrid ; Florea, Liliana D. / Effective cluster-based seed design for cross-species sequence comparisons. In: Bioinformatics. 2008 ; Vol. 24, No. 24. pp. 2926-2927.
@article{8450184e2fab4f609b9ab14e0ef1d79d,
title = "Effective cluster-based seed design for cross-species sequence comparisons",
abstract = "To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve the accuracy of alignment, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for groups instead of individual comparisons. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed.",
author = "Zhou, Leming and Mihai, Ingrid and Florea, Liliana D",
year = "2008",
month = "12",
doi = "10.1093/bioinformatics/btn547",
language = "English (US)",
volume = "24",
pages = "2926--2927",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "24",
}

TY - JOUR

T1 - Effective cluster-based seed design for cross-species sequence comparisons

AU - Zhou, Leming

AU - Mihai, Ingrid

AU - Florea, Liliana D

PY - 2008/12

Y1 - 2008/12

AB - To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve the accuracy of alignment, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for groups instead of individual comparisons. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed.

UR - http://www.scopus.com/inward/record.url?scp=57249083964&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57249083964&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btn547

DO - 10.1093/bioinformatics/btn547

M3 - Article

C2 - 18940827

AN - SCOPUS:57249083964

VL - 24

SP - 2926

EP - 2927

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 24

ER -