Estimation and assessment of raw copy numbers at the single locus level

H. Bengtsson, R. Irizarry, B. Carvalho, T. P. Speed

Research output: Contribution to journal (Article)

Abstract

Motivation: Although copy-number aberrations are known to contribute to the diversity of the human DNA and cause various diseases, many aberrations and their phenotypes are still to be explored. The recent development of single-nucleotide polymorphism (SNP) arrays provides researchers with tools for calling genotypes and identifying chromosomal aberrations at an order-of-magnitude greater resolution than possible a few years ago. The fundamental problem in array-based copy-number (CN) analysis is to obtain CN estimates at a single-locus resolution with high accuracy and precision such that downstream segmentation methods are more likely to succeed.

Results: We propose a preprocessing method for estimating raw CNs from Affymetrix SNP arrays. Its core utilizes a multichip probe-level model analogous to that for high-density oligonucleotide expression arrays. We extend this model by adding an adjustment for sequence-specific allelic imbalances such as cross-hybridization between allele A and allele B probes. We focus on total CN estimates, which allows us to further constrain the probe-level model to increase the signal-to-noise ratio of CN estimates. Further improvement is obtained by controlling for PCR effects. Each part of the model is fitted robustly. The performance is assessed by quantifying how well raw CNs alone differentiate between one and two copies on Chromosome X (ChrX) at a single-locus resolution (27 kb) up to a 200 kb resolution. The evaluation is done with publicly available HapMap data.
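The probe-level step described in the abstract can be illustrated with a minimal sketch. It assumes a log-additive model of the kind used for expression arrays, log2 y[i, k] = overall + chip[i] + probe[k] + residual, and fits it with Tukey's median polish as one possible robust fit. The abstract does not say which robust estimator or constraints the authors use, and the sketch leaves out the allelic cross-hybridization and PCR adjustments. The function name `median_polish`, the toy intensities, and the conversion to raw CNs (which assumes the median chip is diploid) are illustrative and not taken from the paper.

```python
import numpy as np


def median_polish(y, n_iter=10, tol=1e-8):
    """Fit y[i, k] = overall + chip[i] + probe[k] + resid[i, k] by median polish.

    y is a (chips x probes) matrix of log2 intensities for one locus.
    Medians are used instead of means, so single outlying probes have
    little influence on the chip effects.
    """
    resid = np.array(y, dtype=float)
    overall = 0.0
    chip = np.zeros(resid.shape[0])
    probe = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        # Sweep row (chip) medians out of the residuals.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        chip += row_med
        # Keep the probe effects centred; move their median to 'overall'.
        delta = np.median(probe)
        probe -= delta
        overall += delta
        # Sweep column (probe) medians out of the residuals.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        probe += col_med
        # Keep the chip effects centred as well.
        delta = np.median(chip)
        chip -= delta
        overall += delta
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, chip, probe, resid


# Toy data (made up for illustration): 3 chips x 5 probes on the log2 scale.
true_chip = np.array([0.0, 1.0, -1.0])
true_probe = np.array([-0.5, 0.0, 0.5, 0.2, -0.2])
y = 10.0 + true_chip[:, None] + true_probe[None, :]
y[0, 2] += 5.0  # one outlying probe signal on the first chip

overall, chip, probe, resid = median_polish(y)

# Assumption: the median chip is diploid, so a chip effect of 0 means 2 copies.
raw_cn = 2.0 * 2.0 ** chip
print(np.round(chip, 3), np.round(raw_cn, 3))
```

In this toy case the outlying probe ends up in the residuals rather than in the chip effect, which is the behaviour a robust fit is meant to provide. A least-squares fit of the same model would spread the outlier over the chip and probe effects.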

Original language: English (US)
Pages (from-to): 759-767
Number of pages: 9
Journal: Bioinformatics
Volume: 24
Issue number: 6
DOI: https://doi.org/10.1093/bioinformatics/btn016
State: Published - Mar 2008

Fingerprint

Single Nucleotide Polymorphism
Locus
Aberrations
Alleles
Allelic Imbalance
HapMap Project
X Chromosome
Signal-To-Noise Ratio
Nucleotides
Oligonucleotide Array Sequence Analysis
Polymorphism
Chromosome Aberrations
Probe
Genotype
Research Personnel
Phenotype
Polymerase Chain Reaction
Oligonucleotides

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

Estimation and assessment of raw copy numbers at the single locus level. / Bengtsson, H.; Irizarry, R.; Carvalho, B.; Speed, T. P.

In: Bioinformatics, Vol. 24, No. 6, 03.2008, p. 759-767.

Bengtsson, H, Irizarry, R, Carvalho, B & Speed, TP 2008, 'Estimation and assessment of raw copy numbers at the single locus level', Bioinformatics, vol. 24, no. 6, pp. 759-767. https://doi.org/10.1093/bioinformatics/btn016
@article{990ae5bcaee74c0a8b87e36d8dcdf85f,
title = "Estimation and assessment of raw copy numbers at the single locus level",
author = "H. Bengtsson and R. Irizarry and B. Carvalho and Speed, {T. P.}",
year = "2008",
month = "3",
doi = "10.1093/bioinformatics/btn016",
language = "English (US)",
volume = "24",
pages = "759--767",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",
}

TY - JOUR
T1 - Estimation and assessment of raw copy numbers at the single locus level
AU - Bengtsson, H.
AU - Irizarry, R.
AU - Carvalho, B.
AU - Speed, T. P.
PY - 2008/3
Y1 - 2008/3
UR - http://www.scopus.com/inward/record.url?scp=40749162839&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40749162839&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btn016
DO - 10.1093/bioinformatics/btn016
M3 - Article
C2 - 18204055
AN - SCOPUS:40749162839
VL - 24
SP - 759
EP - 767
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 6
ER -