Insights into protein structure and function from disorder-complexity space

Edward A. Weathers, Michael Paulaitis, Thomas B Woolf, Jan H. Hoh

Research output: Contribution to journal › Article

Abstract

Intrinsically disordered proteins have a wide variety of important functional roles. However, the relationship between sequence and function in these proteins is significantly different from that for well-folded proteins. In a previous work, we showed that the propensity to be disordered can be recognized based on sequence composition alone. Here that analysis is furthered by examining the relationship of disorder propensity to sequence complexity, where the metrics for these two properties depend only on composition. The distributions of 40 amino acid peptides from both ordered and disordered proteins are graphed in this disorder-complexity space. An analysis of Swiss-Prot shows that most peptides have high complexity and relatively low disorder. However, there are also an appreciable number of low complexity-high disorder peptides in the database. In contrast, there are no low complexity-low disorder peptides. A similar analysis for peptides in the PDB reveals a much narrower distribution, with few peptides of low complexity and high disorder. In this case, the bounds of the disorder-complexity distribution are well defined and might be used to evaluate the likelihood that a peptide can be crystallized with current methods. The disorder-complexity distributions of individual proteins and sets of proteins grouped by function are also examined. Among individual proteins, there is an enormous variety of distributions that in some cases can be rationalized with regard to function. Groups of functionally related proteins are found to have distributions that are similar within each group but show notable differences between groups. Finally, a pattern matching algorithm is used to search for proteins with particular disorder-complexity distributions. The results suggest that this approach might be used to identify relationships between otherwise dissimilar proteins.
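As a rough illustration of a composition-only complexity measure over 40-residue peptides, the sketch below scores each window of a sequence by the Shannon entropy of its residue frequencies. The abstract does not say which complexity metric the authors used or how windows were drawn, so the entropy measure, the overlapping sliding window, and the names `shannon_entropy` and `window_complexities` are all assumptions made for illustration. The disorder-propensity axis is not sketched, since its definition is not given here.

```python
import math
from collections import Counter

WINDOW = 40  # peptide length stated in the abstract


def shannon_entropy(peptide: str) -> float:
    """Shannon entropy (bits) of the residue frequencies in a peptide.

    Assumed stand-in for "sequence complexity"; it depends only on
    composition, not on residue order.
    """
    counts = Counter(peptide)
    n = len(peptide)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def window_complexities(sequence: str, window: int = WINDOW) -> list[float]:
    """Entropy of every overlapping window (assumed step of one residue).

    Returns an empty list when the sequence is shorter than the window.
    """
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
```

Under this measure a homopolymer window scores 0 bits, and a window with all 20 standard residues equally represented scores log2(20), about 4.32 bits, which gives a sense of the range of the complexity axis.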

Original language: English (US)
Pages (from-to): 16-28
Number of pages: 13
Journal: Proteins
Volume: 66
Issue number: 1
DOIs: 10.1002/prot.21055
State: Published - Jan 1 2007

Fingerprint

Peptides
Proteins
Intrinsically Disordered Proteins
Pattern matching
Chemical analysis
Databases
Amino Acids

Keywords

  • Aggregation
  • Natively unfolded
  • Protein crystallization
  • Unstructured protein

ASJC Scopus subject areas

  • Genetics
  • Structural Biology
  • Biochemistry

Cite this

Insights into protein structure and function from disorder-complexity space. / Weathers, Edward A.; Paulaitis, Michael; Woolf, Thomas B; Hoh, Jan H.

In: Proteins, Vol. 66, No. 1, 01.01.2007, p. 16-28.


@article{6830ff73417b4b2b89248be473294ec3,
title = "Insights into protein structure and function from disorder-complexity space",
abstract = "Intrinsically disordered proteins have a wide variety of important functional roles. However, the relationship between sequence and function in these proteins is significantly different than that for well-folded proteins. In a previous work, we showed that the propensity to be disordered can be recognized based on sequence composition alone. Here that analysis is furthered by examining the relationship of disorder propensity to sequence complexity, where the metrics for these two properties depend only on composition. The distributions of 40 amino acid peptides from both ordered, and disordered proteins are graphed in this disorder-complexity space. An analysis of Swiss-Prot shows that most peptides have high complexity and relatively low disorder. However, there are also an appreciable number of low complexity-high disorder peptides in the database. In contrast, there are no low complexity-low disorder peptides. A similar analysis for peptides in the PDB reveals a much narrower distribution, with few peptides of low complexity and high disorder. In this case, the bounds of the disorder-complexity distribution are well defined and might be used to evaluate the likelihood that a peptide can be crystallized with current methods. The disorder-complexity distributions of individual proteins and sets of proteins grouped by function are also examined. Among individual proteins, there is an enormous variety of distributions that in some cases can be rationalized with regard to function. Groups of functionally related proteins are found to have distributions that are similar within each group but show notable differences between groups. Finally, a pattern matching algorithm is used to search for proteins with particular disorder-complexity distributions. The results suggest that this approach might be used to identify relationships between otherwise dissimilar proteins.",
keywords = "Aggregation, Natively unfolded, Protein crystallization, Unstructured protein",
author = "Weathers, {Edward A.} and Michael Paulaitis and Woolf, {Thomas B} and Hoh, {Jan H.}",
year = "2007",
month = "1",
day = "1",
doi = "10.1002/prot.21055",
language = "English (US)",
volume = "66",
pages = "16--28",
journal = "Proteins: Structure, Function and Genetics",
issn = "0887-3585",
publisher = "Wiley-Liss Inc.",
number = "1",
}

TY - JOUR

T1 - Insights into protein structure and function from disorder-complexity space

AU - Weathers, Edward A.

AU - Paulaitis, Michael

AU - Woolf, Thomas B

AU - Hoh, Jan H.

PY - 2007/1/1

Y1 - 2007/1/1

N2 - Intrinsically disordered proteins have a wide variety of important functional roles. However, the relationship between sequence and function in these proteins is significantly different than that for well-folded proteins. In a previous work, we showed that the propensity to be disordered can be recognized based on sequence composition alone. Here that analysis is furthered by examining the relationship of disorder propensity to sequence complexity, where the metrics for these two properties depend only on composition. The distributions of 40 amino acid peptides from both ordered, and disordered proteins are graphed in this disorder-complexity space. An analysis of Swiss-Prot shows that most peptides have high complexity and relatively low disorder. However, there are also an appreciable number of low complexity-high disorder peptides in the database. In contrast, there are no low complexity-low disorder peptides. A similar analysis for peptides in the PDB reveals a much narrower distribution, with few peptides of low complexity and high disorder. In this case, the bounds of the disorder-complexity distribution are well defined and might be used to evaluate the likelihood that a peptide can be crystallized with current methods. The disorder-complexity distributions of individual proteins and sets of proteins grouped by function are also examined. Among individual proteins, there is an enormous variety of distributions that in some cases can be rationalized with regard to function. Groups of functionally related proteins are found to have distributions that are similar within each group but show notable differences between groups. Finally, a pattern matching algorithm is used to search for proteins with particular disorder-complexity distributions. The results suggest that this approach might be used to identify relationships between otherwise dissimilar proteins.

KW - Aggregation

KW - Natively unfolded

KW - Protein crystallization

KW - Unstructured protein

UR - http://www.scopus.com/inward/record.url?scp=33845682458&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33845682458&partnerID=8YFLogxK

U2 - 10.1002/prot.21055

DO - 10.1002/prot.21055

M3 - Article

C2 - 17044059

AN - SCOPUS:33845682458

VL - 66

SP - 16

EP - 28

JO - Proteins: Structure, Function and Genetics

JF - Proteins: Structure, Function and Genetics

SN - 0887-3585

IS - 1

ER -