The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties.

Nicholas M. Luscombe; Jiang Qian; Zhaolei Zhang; Ted Johnson; Mark Gerstein

Research output: Contribution to journal › Article › peer-review

Abstract

BACKGROUND: The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through the analysis of such inventories, it has been shown that different genomes have very different usage of parts; for example, the common folds in the worm are very different from those in Escherichia coli. RESULTS: Despite these differences, we find that the genomic occurrence of generalized parts follows a well-known mathematical framework called the power law, with a few parts occurring many times and most occurring only a few times. This observation is true in a wide variety of genomic contexts. Earlier studies found power laws in a few specific cases, such as the occurrence of protein families. Here, we find many further cases of power-law behavior, for example in the occurrence of pseudogenes and in levels of gene expression. We show comprehensively that this behavior applies across many different genomes, for many different types of parts (DNA words, InterPro families, protein superfamilies and folds, pseudogene families and pseudomotifs), and for the many disparate attributes associated with these parts (their functions, interactions and expression levels). CONCLUSIONS: Power-law behavior provides a concise mathematical description of an important biological feature: the sheer dominance of a few members over the overall population. We present this behavior in a unified framework and propose that all these observations are connected to an underlying DNA duplication process as genomes evolved to their current state.

Original language	English (US)
Pages (from-to)	RESEARCH0040
Journal	Genome Biology
Volume	3
Issue number	8
State	Published - Jul 25 2002
Externally published	Yes

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Genetics
Cell Biology

Cite this

@article{f156d2d4a41e4b5a8ef87164f6017bce,

title = "The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties",

abstract = "BACKGROUND: The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through the analysis of such inventories, it has been shown that different genomes have very different usage of parts; for example, the common folds in the worm are very different from those in Escherichia coli. RESULTS: Despite these differences, we find that the genomic occurrence of generalized parts follows a well-known mathematical framework called the power law, with a few parts occurring many times and most occurring only a few times. This observation is true in a wide variety of genomic contexts. Earlier studies found power laws in a few specific cases, such as the occurrence of protein families. Here, we find many further cases of power-law behavior, for example in the occurrence of pseudogenes and in levels of gene expression. We show comprehensively that this behavior applies across many different genomes, for many different types of parts (DNA words, InterPro families, protein superfamilies and folds, pseudogene families and pseudomotifs), and for the many disparate attributes associated with these parts (their functions, interactions and expression levels). CONCLUSIONS: Power-law behavior provides a concise mathematical description of an important biological feature: the sheer dominance of a few members over the overall population. We present this behavior in a unified framework and propose that all these observations are connected to an underlying DNA duplication process as genomes evolved to their current state.",

author = "Luscombe, {Nicholas M.} and Jiang Qian and Zhaolei Zhang and Ted Johnson and Mark Gerstein",

year = "2002",

month = jul,

day = "25",

language = "English (US)",

volume = "3",

pages = "RESEARCH0040",

journal = "Genome Biology",

issn = "1474-7596",

publisher = "BioMed Central",

number = "8",

}

TY - JOUR

T1 - The dominance of the population by a selected few

T2 - power-law behaviour applies to a wide variety of genomic properties.

AU - Luscombe, Nicholas M.

AU - Qian, Jiang

AU - Zhang, Zhaolei

AU - Johnson, Ted

AU - Gerstein, Mark

PY - 2002/7/25

Y1 - 2002/7/25

AB - BACKGROUND: The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through the analysis of such inventories, it has been shown that different genomes have very different usage of parts; for example, the common folds in the worm are very different from those in Escherichia coli. RESULTS: Despite these differences, we find that the genomic occurrence of generalized parts follows a well-known mathematical framework called the power law, with a few parts occurring many times and most occurring only a few times. This observation is true in a wide variety of genomic contexts. Earlier studies found power laws in a few specific cases, such as the occurrence of protein families. Here, we find many further cases of power-law behavior, for example in the occurrence of pseudogenes and in levels of gene expression. We show comprehensively that this behavior applies across many different genomes, for many different types of parts (DNA words, InterPro families, protein superfamilies and folds, pseudogene families and pseudomotifs), and for the many disparate attributes associated with these parts (their functions, interactions and expression levels). CONCLUSIONS: Power-law behavior provides a concise mathematical description of an important biological feature: the sheer dominance of a few members over the overall population. We present this behavior in a unified framework and propose that all these observations are connected to an underlying DNA duplication process as genomes evolved to their current state.

UR - http://www.scopus.com/inward/record.url?scp=0242713700&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0242713700&partnerID=8YFLogxK

M3 - Article

C2 - 12186647

AN - SCOPUS:0242713700

SN - 1474-7596

VL - 3

SP - RESEARCH0040

JO - Genome Biology

JF - Genome Biology

IS - 8

ER -
