Hierarchical scaffolding with Bambus

Mihai Pop, Daniel S. Kosack, Steven L. Salzberg

Research output: Contribution to journalArticle

Abstract

The output of a genome assembler generally comprises a collection of contiguous DNA sequences (contigs) whose relative placement along the genome is not defined. A procedure called scaffolding is commonly used to order and orient these contigs using paired read information. This ordering of contigs is an essential step when finishing and analyzing the data from a whole-genome shotgun project. Most recent assemblers include a scaffolding module; however, users have little control over the scaffolding algorithm or the information produced. We thus developed a general-purpose scaffolder, called Bambus, which affords users significant flexibility in controlling the scaffolding parameters. Bambus was used recently to scaffold the low-coverage draft dog genome data. Most significantly, Bambus enables the use of linking data other than that inferred from mate-pair information. For example, the sequence of a completed genome can be used to guide the scaffolding of a related organism. We present several applications of Bambus: support for finishing, comparative genomics, analysis of the haplotype structure of genomes, and scaffolding of a mammalian genome at low coverage. Bambus is available as an open-source package from our Web site.

Original languageEnglish (US)
Pages (from-to)149-159
Number of pages11
JournalGenome research
Volume14
Issue number1
DOIs
StatePublished - Jan 1 2004

ASJC Scopus subject areas

  • Genetics
  • Genetics(clinical)

Fingerprint Dive into the research topics of 'Hierarchical scaffolding with Bambus'. Together they form a unique fingerprint.

  • Cite this