kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

Christopher Fletez-Brant, Dongwon Lee, Andrew S. McCallion, Michael A. Beer

Research output: Contribution to journal › Article › peer-review

74 Scopus citations

Abstract

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, chromatin immunoprecipitation followed by sequencing (ChIP-seq) assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server that allows the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic data sets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
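
The idea of describing a sequence by its kmer content can be illustrated with a minimal sketch. It assumes plain kmer counting over the ACGT alphabet with L2 normalization; the names `kmer_features` and `spectrum_kernel` are hypothetical, and the published kmer-SVM may differ in its details (for example, in how reverse complements are handled). An SVM could use a similarity of this kind as its kernel.

```python
from collections import Counter
from itertools import product
import math


def kmer_features(seq, k=3):
    """Return an L2-normalized count vector over all 4**k DNA kmers.

    Illustrative only: kmers containing characters other than A, C, G, T
    (such as N) are ignored because they are not in the vocabulary.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed, lexicographically ordered vocabulary of all possible kmers.
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    vec = [counts.get(kmer, 0) for kmer in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    # Leave an all-zero vector unchanged (sequence shorter than k).
    return [v / norm for v in vec] if norm else vec


def spectrum_kernel(a, b, k=3):
    """Dot product of the normalized kmer vectors of two sequences."""
    fa, fb = kmer_features(a, k), kmer_features(b, k)
    return sum(x * y for x, y in zip(fa, fb))
```

With this normalization, a sequence compared with itself scores 1, and two sequences that share no kmers score 0.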

Original language: English (US)
Pages (from-to): W544-556
Journal: Unknown Journal
Volume: 41
Issue number: Web Server issue
State: Published - Jul 2013

ASJC Scopus subject areas

  • Genetics
