Gene expression data analysis with a dynamically extended self-organized map that exploits class information

Seferina Mavroudi, Stergios Papadimitriou, Anastasios Bezerianos

Research output: Contribution to journal (Article)

Abstract

Motivation: Clustering is currently the most popular approach to analyzing genome-wide expression data. A major drawback of most existing clustering methods is that the number of clusters has to be specified a priori. Furthermore, purely unsupervised algorithms ignore prior biological knowledge entirely. Moreover, most current tools lack an effective framework for tightly integrating unsupervised and supervised learning in the analysis of high-dimensional expression data, and very few multi-class supervised approaches are designed to make effective use of multiple functional class labels.

Results: The paper adapts a novel self-organizing map, the supervised Network Self-Organized Map (sNet-SOM), to the peculiarities of multi-labeled gene expression data. The sNet-SOM adaptively determines the number of clusters through a dynamic extension process. This process is driven by an inhomogeneous measure that tries to balance unsupervised, supervised and model complexity criteria. Nodes within a rectangular grid are grown at the boundary nodes, weights are rippled from the internal nodes towards the outer nodes of the grid, and whole columns are inserted within the map. The appropriate level of expansion is determined automatically. Multiple sNet-SOM models are constructed dynamically, each for a different unsupervised/supervised balance, and model selection criteria are used to select the optimal one. The results indicate that sNet-SOM performs competitively with other recently proposed approaches for supervised classification at a significantly reduced computational cost, and that it offers extensive potential for exploratory analysis within the same framework. Furthermore, it explores simple design decisions that are easy to comprehend and computationally efficient.
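The abstract does not state the expansion measure itself, so the following is only an illustrative sketch of how a criterion that balances an unsupervised term against a supervised term might pick the map node to expand. Everything in it is an assumption rather than the authors' formulation: the function names `node_scores` and `pick_node_to_expand`, the mixing weight `alpha`, the use of mean quantization error as the unsupervised term and label entropy as the supervised term, and the simplification to a single class label per gene (the paper addresses multi-labeled data). Model complexity is not modeled here.

```python
import numpy as np

def node_scores(data, labels, weights, alpha=0.5):
    """Score each node by a blend of quantization error and label entropy.

    Hypothetical criterion for illustration only; not the sNet-SOM measure.
    data:    (n_samples, n_features) expression profiles
    labels:  (n_samples,) one class label per profile (simplification)
    weights: (n_nodes, n_features) node weight vectors
    alpha:   0 = purely unsupervised, 1 = purely supervised
    """
    # Distance of every profile to every node weight vector.
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    # Best-matching node for each profile.
    bmu = dists.argmin(axis=1)
    classes = np.unique(labels)
    scores = np.zeros(weights.shape[0])
    for k in range(weights.shape[0]):
        members = bmu == k
        if not members.any():
            continue  # empty nodes keep a score of zero
        # Unsupervised term: mean distance of the node's members to its weight.
        q_err = dists[members, k].mean()
        # Supervised term: Shannon entropy of the labels mapped to the node.
        counts = np.array([(labels[members] == c).sum() for c in classes])
        p = counts[counts > 0] / counts.sum()
        entropy = -(p * np.log2(p)).sum()
        scores[k] = (1 - alpha) * q_err + alpha * entropy
    return scores, bmu

def pick_node_to_expand(data, labels, weights, alpha=0.5):
    """Return the index of the node with the highest blended score."""
    scores, _ = node_scores(data, labels, weights, alpha)
    return int(scores.argmax())
```

Under these assumptions, sweeping `alpha` over several values would yield one candidate map per unsupervised/supervised balance, which loosely mirrors the abstract's description of building multiple models and selecting among them.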

Original language: English (US)
Pages (from-to): 1446-1453
Number of pages: 8
Journal: Bioinformatics
Volume: 18
Issue number: 11
State: Published - Nov 1 2002
Externally published: Yes

Fingerprint

Gene Expression Data
Gene expression
Cluster Analysis
Data analysis
Gene Expression
Patient Selection
Number of Clusters
Vertex of a graph
Learning
Genome
Weights and Measures
Costs and Cost Analysis
Grid
Exploratory Analysis
Model Selection Criteria
Unsupervised learning
Model Complexity
Supervised Classification
Unsupervised Learning
Supervised learning

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computational Theory and Mathematics
  • Computer Science Applications

Cite this

Gene expression data analysis with a dynamically extended self-organized map that exploits class information. / Mavroudi, Seferina; Papadimitriou, Stergios; Bezerianos, Anastasios.

In: Bioinformatics, Vol. 18, No. 11, 01.11.2002, p. 1446-1453.

Mavroudi, S, Papadimitriou, S & Bezerianos, A 2002, 'Gene expression data analysis with a dynamically extended self-organized map that exploits class information', Bioinformatics, vol. 18, no. 11, pp. 1446-1453.
Mavroudi, Seferina ; Papadimitriou, Stergios ; Bezerianos, Anastasios. / Gene expression data analysis with a dynamically extended self-organized map that exploits class information. In: Bioinformatics. 2002 ; Vol. 18, No. 11. pp. 1446-1453.
@article{641760cbd78648ce99e560ff9b840743,
title = "Gene expression data analysis with a dynamically extended self-organized map that exploits class information",
author = "Seferina Mavroudi and Stergios Papadimitriou and Anastasios Bezerianos",
year = "2002",
month = "11",
day = "1",
language = "English (US)",
volume = "18",
pages = "1446--1453",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "11",

}

TY - JOUR

T1 - Gene expression data analysis with a dynamically extended self-organized map that exploits class information

AU - Mavroudi, Seferina

AU - Papadimitriou, Stergios

AU - Bezerianos, Anastasios

PY - 2002/11/1

Y1 - 2002/11/1

UR - http://www.scopus.com/inward/record.url?scp=0036855422&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036855422&partnerID=8YFLogxK

M3 - Article

VL - 18

SP - 1446

EP - 1453

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 11

ER -