Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins

Kim T. Simons; Ingo Ruczinski; Charles Kooperberg; Brian A. Fox; Chris Bystroff; David Baker

doi:10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A

Research output: Contribution to journal › Article › peer-review

362 Scopus citations

Abstract

We describe the development of a scoring function based on the decomposition P(structure/sequence) ∝ P(sequence/structure) × P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term, universal sequence-independent features, such as the assembly of β-strands into β-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of ~30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence/structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences; while several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.
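
The decomposition quoted in the abstract is an instance of Bayes' rule. A brief sketch of that step follows, written with the conventional vertical bar where the abstract uses a slash. The score symbol S and the negative-log form in the last line are illustrative assumptions, not taken from this record.

```latex
% Requires amsmath. Bayes' rule for a candidate structure given a fixed sequence.
\begin{align}
P(\text{structure} \mid \text{sequence})
  &= \frac{P(\text{sequence} \mid \text{structure})\, P(\text{structure})}
          {P(\text{sequence})} \\
  &\propto P(\text{sequence} \mid \text{structure})\, P(\text{structure})
\end{align}
% Assumption (not stated in the abstract): an additive score obtained by
% taking the negative logarithm, so that lower S means more probable.
\begin{equation}
S(\text{structure}) = -\log P(\text{sequence} \mid \text{structure})
                      - \log P(\text{structure})
\end{equation}
```

Because every decoy in an ensemble is built for the same sequence, P(sequence) is a shared constant and can be dropped when ranking decoys against one another, which is why the abstract states a proportionality rather than an equality.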

Original language	English (US)
Pages (from-to)	82-95
Number of pages	14
Journal	Proteins: Structure, Function and Genetics
Volume	34
Issue number	1
DOIs	https://doi.org/10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A
State	Published - Jan 1 1999
Externally published	Yes

Keywords

Fold recognition
Knowledge-based scoring functions
Protein folding
Structure prediction

ASJC Scopus subject areas

Structural Biology
Biochemistry
Molecular Biology

Cite this

Simons, K. T., Ruczinski, I., Kooperberg, C., Fox, B. A., Bystroff, C., & Baker, D. (1999). Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins. Proteins: Structure, Function and Genetics, 34(1), 82-95. https://doi.org/10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A

Simons, KT, Ruczinski, I, Kooperberg, C, Fox, BA, Bystroff, C & Baker, D 1999, 'Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins', Proteins: Structure, Function and Genetics, vol. 34, no. 1, pp. 82-95. https://doi.org/10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A

@article{129def4b5d7540a68bbbbe7459d8f162,

title = "Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins",

abstract = "We describe the development of a scoring function based on the decomposition P(structure/sequence) ∝ P(sequence/structure) × P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term, universal sequence-independent features, such as the assembly of β-strands into β-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of ~30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence/structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences; while several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.",

keywords = "Fold recognition, Knowledge-based scoring functions, Protein folding, Structure prediction",

author = "Simons, {Kim T.} and Ingo Ruczinski and Charles Kooperberg and Fox, {Brian A.} and Chris Bystroff and David Baker",

year = "1999",

month = jan,

day = "1",

doi = "10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A",

language = "English (US)",

volume = "34",

pages = "82--95",

journal = "Proteins: Structure, Function and Genetics",

issn = "0887-3585",

publisher = "Wiley-Liss Inc.",

number = "1",

}

TY - JOUR

T1 - Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins

AU - Simons, Kim T.

AU - Ruczinski, Ingo

AU - Kooperberg, Charles

AU - Fox, Brian A.

AU - Bystroff, Chris

AU - Baker, David

PY - 1999/1/1

Y1 - 1999/1/1

N2 - We describe the development of a scoring function based on the decomposition P(structure/sequence) ∝ P(sequence/structure) × P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term, universal sequence-independent features, such as the assembly of β-strands into β-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of ~30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence/structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences; while several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.

AB - We describe the development of a scoring function based on the decomposition P(structure/sequence) ∝ P(sequence/structure) × P(structure), which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term captures sequence-dependent features of protein structures, such as the burial of hydrophobic residues in the core; the second term, universal sequence-independent features, such as the assembly of β-strands into β-sheets. The efficacies of a wide variety of sequence-dependent and sequence-independent features of protein structures for recognizing native-like structures were systematically evaluated using ensembles of ~30,000 compact conformations with fixed secondary structure for each of 17 small protein domains. The best results were obtained using a core scoring function with P(sequence/structure) parameterized similarly to our previous work (Simons et al., J Mol Biol 1997;268:209-225) and P(structure) focused on secondary structure packing preferences; while several additional features had some discriminatory power on their own, they did not provide any additional discriminatory power when combined with the core scoring function. Our results, on both the training set and the independent decoy set of Park and Levitt (J Mol Biol 1996;258:367-392), suggest that this scoring function should contribute to the prediction of tertiary structure from knowledge of sequence and secondary structure.

KW - Fold recognition

KW - Knowledge-based scoring functions

KW - Protein folding

KW - Structure prediction

UR - http://www.scopus.com/inward/record.url?scp=0032929780&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032929780&partnerID=8YFLogxK

U2 - 10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A

DO - 10.1002/(SICI)1097-0134(19990101)34:1<82::AID-PROT7>3.0.CO;2-A

M3 - Article

C2 - 10336385

AN - SCOPUS:0032929780

SN - 0887-3585

VL - 34

SP - 82

EP - 95

JO - Proteins: Structure, Function and Genetics

JF - Proteins: Structure, Function and Genetics

IS - 1

ER -
