A comparison of normalization methods for high density oligonucleotide array data based on variance and bias

B. M. Bolstad; R. A. Irizarry; M. Åstrand; T. P. Speed

doi:10.1093/bioinformatics/19.2.185

Research output: Contribution to journal › Article › peer-review

6026 Scopus citations

Abstract

Motivation: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations.

Results: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably.
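The abstract names the two families of methods but does not describe the algorithms. As a rough illustration only, the sketch below contrasts a one number scaling against a baseline array with quantile normalization, which is the complete data method this paper is commonly cited for. The function names, the use of a plain mean as the scaling statistic, the tie handling, and the toy intensities are all assumptions made for this example; they are not taken from the paper.

```python
from statistics import mean


def scale_to_baseline(arrays, baseline_index=0):
    """One number scaling: rescale each array so its mean matches the baseline's.

    Assumption: a plain mean is used here for simplicity; other summary
    statistics (for example a trimmed mean) could be substituted.
    """
    target = mean(arrays[baseline_index])
    return [[x * target / mean(a) for x in a] for a in arrays]


def quantile_normalize(arrays):
    """Give every array the same empirical distribution of intensities.

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    # Mean of the k-th smallest value across arrays, for each rank k.
    rank_means = [mean(col) for col in zip(*(sorted(a) for a in arrays))]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by ascending value
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]
        result.append(out)
    return result


# Toy probe intensities for three arrays (hypothetical values).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
scaled = scale_to_baseline(arrays)
normalized = quantile_normalize(arrays)
```

The scaling variant depends on the choice of baseline array and can only correct a linear difference, whereas the quantile variant uses all arrays at once and needs no baseline, which is what makes it a complete data method in the abstract's sense.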

Original language	English (US)
Pages (from-to)	185-193
Number of pages	9
Journal	Bioinformatics
Volume	19
Issue number	2
DOIs	https://doi.org/10.1093/bioinformatics/19.2.185
State	Published - Feb 1 2003
Externally published	Yes

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{e137ae4f9ab04c1ea365fa226f289aa8,

title = "A comparison of normalization methods for high density oligonucleotide array data based on variance and bias",

abstract = "Motivation: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations. Results: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably.",

author = "Bolstad, {B. M.} and Irizarry, {R. A.} and {\AA}strand, M. and Speed, {T. P.}",

year = "2003",

month = feb,

day = "1",

doi = "10.1093/bioinformatics/19.2.185",

language = "English (US)",

volume = "19",

pages = "185--193",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "2",

}

TY - JOUR

T1 - A comparison of normalization methods for high density oligonucleotide array data based on variance and bias

AU - Bolstad, B. M.

AU - Irizarry, R. A.

AU - Åstrand, M.

AU - Speed, T. P.

PY - 2003/2/1

Y1 - 2003/2/1

AB - Motivation: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations. Results: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably.

UR - http://www.scopus.com/inward/record.url?scp=0037316303&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037316303&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/19.2.185

DO - 10.1093/bioinformatics/19.2.185

M3 - Article

C2 - 12538238

AN - SCOPUS:0037316303

SN - 1367-4803

VL - 19

SP - 185

EP - 193

JO - Bioinformatics

JF - Bioinformatics

IS - 2

ER -
