An algorithm for identification of regulatory signals in unaligned DNA sequences, its testing and parallel implementation

Liudmila V Danilova, Vassily A. Lyubetsky, Mikhail S. Gelfand

Research output: Contribution to journal › Article

Abstract

We describe an algorithm (IRSA) for identification of common regulatory signals in samples of unaligned DNA sequences. The algorithm was tested on randomly generated sequences of fixed length with implanted signal of length 15 with 4 mutations, and on natural upstream regions of bacterial genes regulated by PurR, ArgR and CRP. Then it was applied to upstream regions of orthologous genes from Escherichia coli and related genomes. Some new palindromic binding and direct repeats signals were identified. Finally we present a parallel version suitable for computers supporting the MPI protocol. This implementation is not strictly bounded by the number of available processors. The computation speed linearly depends on the number of processors.
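
The synthetic test described in the abstract (random sequences of fixed length, each carrying an implanted signal of length 15 with 4 mutations) can be illustrated with a minimal sketch. This is not the IRSA algorithm or the authors' test harness; it only generates such a sample. The function names `mutate` and `implant`, the uniform base frequencies, the sample size and sequence length, and the choice of exactly 4 substitutions per implanted copy are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def mutate(signal, n_mutations, rng):
    """Return a copy of `signal` with exactly `n_mutations` substituted positions."""
    chars = list(signal)
    # Choose distinct positions so the Hamming distance is exactly n_mutations.
    for pos in rng.sample(range(len(signal)), n_mutations):
        chars[pos] = rng.choice([b for b in ALPHABET if b != chars[pos]])
    return "".join(chars)


def implant(n_sequences, seq_length, signal_length=15, n_mutations=4, seed=0):
    """Build a sample of random sequences, each with one mutated signal copy.

    Assumptions (not taken from the paper): uniform base frequencies,
    one implanted copy per sequence, uniformly random start position.
    Returns the consensus signal and a list of (sequence, start) pairs.
    """
    rng = random.Random(seed)
    signal = "".join(rng.choice(ALPHABET) for _ in range(signal_length))
    sample = []
    for _ in range(n_sequences):
        background = [rng.choice(ALPHABET) for _ in range(seq_length)]
        start = rng.randrange(seq_length - signal_length + 1)
        # Overwrite a window of the background with a mutated signal copy.
        background[start:start + signal_length] = mutate(signal, n_mutations, rng)
        sample.append(("".join(background), start))
    return signal, sample


if __name__ == "__main__":
    consensus, sequences = implant(n_sequences=20, seq_length=600)
    print("consensus:", consensus)
    for seq, start in sequences[:3]:
        print(start, seq[start:start + len(consensus)])
```

A signal-finding method can then be scored by how often it recovers the recorded start positions, which is one plausible reading of the test setup the abstract mentions.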

Original language: English (US)
Pages (from-to): 33-47
Number of pages: 15
Journal: In Silico Biology
Volume: 3
Issue number: 1-2
State: Published - 2003
Externally published: Yes

Keywords

  • Bacterial genomes
  • Bioinformatics
  • MPI protocol
  • Orthologous genes
  • Parallel computing
  • Regulatory signals

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics

Cite this

An algorithm for identification of regulatory signals in unaligned DNA sequences, its testing and parallel implementation. / Danilova, Liudmila V; Lyubetsky, Vassily A.; Gelfand, Mikhail S.

In: In Silico Biology, Vol. 3, No. 1-2, 2003, p. 33-47.

@article{84991d4e1c8d4cd280e2d16d2da400d1,
title = "An algorithm for identification of regulatory signals in unaligned DNA sequences, its testing and parallel implementation",
abstract = "We describe an algorithm (IRSA) for identification of common regulatory signals in samples of unaligned DNA sequences. The algorithm was tested on randomly generated sequences of fixed length with implanted signal of length 15 with 4 mutations, and on natural upstream regions of bacterial genes regulated by PurR, ArgR and CRP. Then it was applied to upstream regions of orthologous genes from Escherichia coli and related genomes. Some new palindromic binding and direct repeats signals were identified. Finally we present a parallel version suitable for computers supporting the MPI protocol. This implementation is not strictly bounded by the number of available processors. The computation speed linearly depends on the number of processors.",
keywords = "Bacterial genomes, Bioinformatics, MPI protocol, Orthologous genes, Parallel computing, Regulatory signals",
author = "Danilova, {Liudmila V} and Lyubetsky, {Vassily A.} and Gelfand, {Mikhail S.}",
year = "2003",
language = "English (US)",
volume = "3",
pages = "33--47",
journal = "In Silico Biology",
issn = "1386-6338",
publisher = "IOS Press",
number = "1-2",
}

TY - JOUR

T1 - An algorithm for identification of regulatory signals in unaligned DNA sequences, its testing and parallel implementation

AU - Danilova, Liudmila V

AU - Lyubetsky, Vassily A.

AU - Gelfand, Mikhail S.

PY - 2003

Y1 - 2003

N2 - We describe an algorithm (IRSA) for identification of common regulatory signals in samples of unaligned DNA sequences. The algorithm was tested on randomly generated sequences of fixed length with implanted signal of length 15 with 4 mutations, and on natural upstream regions of bacterial genes regulated by PurR, ArgR and CRP. Then it was applied to upstream regions of orthologous genes from Escherichia coli and related genomes. Some new palindromic binding and direct repeats signals were identified. Finally we present a parallel version suitable for computers supporting the MPI protocol. This implementation is not strictly bounded by the number of available processors. The computation speed linearly depends on the number of processors.

KW - Bacterial genomes

KW - Bioinformatics

KW - MPI protocol

KW - Orthologous genes

KW - Parallel computing

KW - Regulatory signals

UR - http://www.scopus.com/inward/record.url?scp=0042388309&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0042388309&partnerID=8YFLogxK

M3 - Article

C2 - 12762844

AN - SCOPUS:0042388309

VL - 3

SP - 33

EP - 47

JO - In Silico Biology

JF - In Silico Biology

SN - 1386-6338

IS - 1-2

ER -