SOM-based class discovery exploring the ICA-reduced features of microarray expression profiles

Andrei Dragomir; Seferina Mavroudi; Anastasios Bezerianos

doi:10.1002/cfg.444

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Gene expression datasets are large and complex, with many variables and unknown internal structure. We apply independent component analysis (ICA) to derive a less redundant representation of the expression data. The decomposition produces components with minimal statistical dependence and reveals biologically relevant information. We then apply cluster analysis, a popular tool for gaining an initial understanding of data and commonly used for class discovery, to the transformed data. The proposed self-organizing map (SOM)-based clustering algorithm automatically determines the number of 'natural' subgroups in the data, aided in this task by available prior knowledge of the functional categories of genes. An entropy criterion allows each gene to be assigned to multiple classes, which is closer to the biological representation. These features do not come at the cost of algorithmic simplicity: the map grows on a simple grid structure and the learning algorithm remains the same as Kohonen's.
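The abstract names two building blocks that are easy to illustrate: Kohonen's learning rule on a grid, and an entropy test that lets a sample belong to more than one class. The sketch below is only an illustration under stated assumptions, not the authors' algorithm. It assumes the input vectors are already ICA-reduced, uses a fixed-size grid rather than a growing map, ignores prior knowledge of functional categories, and invents a softmax membership with two hypothetical thresholds (`entropy_threshold`, `min_share`) because the abstract does not specify the entropy criterion.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: sq_dist(weights[node], x))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a fixed rows x cols SOM with the classic Kohonen update.

    Assumption: learning rate and neighbourhood width decay linearly.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised from random samples.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.01)
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Kohonen rule: move the node towards x, scaled by lr and h.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def memberships(weights, x, temperature=1.0):
    """Hypothetical soft membership: softmax over negative squared distances."""
    d2 = {node: sq_dist(w, x) for node, w in weights.items()}
    d_min = min(d2.values())
    scores = {n: math.exp(-(d - d_min) / temperature) for n, d in d2.items()}
    z = sum(scores.values())
    return {n: s / z for n, s in scores.items()}


def normalized_entropy(probs):
    """Shannon entropy of probs divided by log(k), so the range is [0, 1]."""
    probs = list(probs)
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def assign(weights, x, entropy_threshold=0.3, min_share=0.25):
    """Return one node when membership is sharp, several when it is diffuse.

    Both thresholds are illustrative values, not taken from the paper.
    """
    p = memberships(weights, x)
    best = max(p, key=p.get)
    if normalized_entropy(p.values()) <= entropy_threshold:
        return [best]
    return sorted(n for n, v in p.items() if v >= min_share) or [best]
```

Calling `train_som(vectors, rows, cols)` on a list of equal-length numeric vectors returns a dictionary from grid coordinates to weight vectors, and `assign(weights, x)` then returns the list of nodes a new vector is attributed to. Low normalized entropy means one node clearly dominates, so a single class is returned; high entropy means the vector sits between nodes, so every node holding a sufficient share is returned.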

Original language	English (US)
Pages (from-to)	596-616
Number of pages	21
Journal	Comparative and Functional Genomics
Volume	5
Issue number	8
DOIs	https://doi.org/10.1002/cfg.444
State	Published - Dec 2004
Externally published	Yes

Keywords

Class discovery
Clustering
Independent component analysis
Microarrays
Self-organizing maps

ASJC Scopus subject areas

Biotechnology
Molecular Biology
Genetics

Cite this

@article{ff416c1a6be04559875be8e52e8e3d8f,
  title = "SOM-based class discovery exploring the ICA-reduced features of microarray expression profiles",
  abstract = "Gene expression datasets are large and complex, having many variables and unknown internal structure. We apply independent component analysis (ICA) to derive a less redundant representation of the expression data. The decomposition produces components with minimal statistical dependence and reveals biologically relevant information. Consequently, to the transformed data, we apply cluster analysis (an important and popular analysis tool for obtaining an initial understanding of the data, usually employed for class discovery). The proposed self-organizing map (SOM)-based clustering algorithm automatically determines the number of 'natural' subgroups of the data, being aided at this task by the available prior knowledge of the functional categories of genes. An entropy criterion allows each gene to be assigned to multiple classes, which is closer to the biological representation. These features, however, are not achieved at the cost of the simplicity of the algorithm, since the map grows on a simple grid structure and the learning algorithm remains equal to Kohonen's one.",
  keywords = "Class discovery, Clustering, Independent component analysis, Microarrays, Self-organizing maps",
  author = "Andrei Dragomir and Seferina Mavroudi and Anastasios Bezerianos",
  year = "2004",
  month = dec,
  doi = "10.1002/cfg.444",
  language = "English (US)",
  volume = "5",
  pages = "596--616",
  journal = "Comparative and Functional Genomics",
  issn = "1531-6912",
  publisher = "Hindawi Publishing Corporation",
  number = "8",
}

TY  - JOUR
T1  - SOM-based class discovery exploring the ICA-reduced features of microarray expression profiles
AU  - Dragomir, Andrei
AU  - Mavroudi, Seferina
AU  - Bezerianos, Anastasios
PY  - 2004/12
Y1  - 2004/12
N2  - Gene expression datasets are large and complex, having many variables and unknown internal structure. We apply independent component analysis (ICA) to derive a less redundant representation of the expression data. The decomposition produces components with minimal statistical dependence and reveals biologically relevant information. Consequently, to the transformed data, we apply cluster analysis (an important and popular analysis tool for obtaining an initial understanding of the data, usually employed for class discovery). The proposed self-organizing map (SOM)-based clustering algorithm automatically determines the number of 'natural' subgroups of the data, being aided at this task by the available prior knowledge of the functional categories of genes. An entropy criterion allows each gene to be assigned to multiple classes, which is closer to the biological representation. These features, however, are not achieved at the cost of the simplicity of the algorithm, since the map grows on a simple grid structure and the learning algorithm remains equal to Kohonen's one.
AB  - Gene expression datasets are large and complex, having many variables and unknown internal structure. We apply independent component analysis (ICA) to derive a less redundant representation of the expression data. The decomposition produces components with minimal statistical dependence and reveals biologically relevant information. Consequently, to the transformed data, we apply cluster analysis (an important and popular analysis tool for obtaining an initial understanding of the data, usually employed for class discovery). The proposed self-organizing map (SOM)-based clustering algorithm automatically determines the number of 'natural' subgroups of the data, being aided at this task by the available prior knowledge of the functional categories of genes. An entropy criterion allows each gene to be assigned to multiple classes, which is closer to the biological representation. These features, however, are not achieved at the cost of the simplicity of the algorithm, since the map grows on a simple grid structure and the learning algorithm remains equal to Kohonen's one.
KW  - Class discovery
KW  - Clustering
KW  - Independent component analysis
KW  - Microarrays
KW  - Self-organizing maps
UR  - http://www.scopus.com/inward/record.url?scp=15544390257&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=15544390257&partnerID=8YFLogxK
U2  - 10.1002/cfg.444
DO  - 10.1002/cfg.444
M3  - Article
C2  - 18629176
AN  - SCOPUS:15544390257
SN  - 1531-6912
VL  - 5
SP  - 596
EP  - 616
JO  - Comparative and Functional Genomics
JF  - Comparative and Functional Genomics
IS  - 8
ER  -
