Abstract
Intrinsically disordered proteins are an important class of proteins with unique functions and properties. Here, we have applied a support vector machine (SVM) trained on naturally occurring disordered and ordered proteins to examine the contribution of various parameters (vectors) to recognizing proteins that contain disordered regions. We find that an SVM that incorporates only amino acid composition has a recognition accuracy of 87±2%. This result suggests that composition alone is sufficient to accurately recognize disorder. Interestingly, SVMs using reduced sets of amino acids based on chemical similarity preserve high recognition accuracy. A set as small as four retains an accuracy of 84±2%; this suggests that general physicochemical properties rather than specific amino acids are important factors contributing to protein disorder.
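As a rough illustration of the kind of input the abstract describes, the sketch below computes an amino acid composition vector for a sequence and collapses it onto a reduced alphabet. It is a minimal sketch, not the authors' implementation: the function names and the four-class grouping in `REDUCED_GROUPS` are assumptions made for illustration, since the abstract does not state which reduced sets were used.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical four-class reduction by broad chemical similarity.
# This grouping is an assumption; the abstract does not list the sets used.
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQGP",
    "charged": "DEKR",
}


def composition(sequence):
    """Return the fraction of each standard amino acid (20 values)."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def reduced_composition(sequence, groups=REDUCED_GROUPS):
    """Return one summed fraction per group of the reduced alphabet."""
    full = dict(zip(AMINO_ACIDS, composition(sequence)))
    return [sum(full[aa] for aa in members) for members in groups.values()]
```

Vectors like these, computed for labeled ordered and disordered proteins, could then be passed to any SVM implementation as feature rows; the kernel and training settings would be further assumptions, as the abstract does not specify them.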
Original language | English (US) |
---|---|
Pages (from-to) | 348-352 |
Number of pages | 5 |
Journal | FEBS Letters |
Volume | 576 |
Issue number | 3 |
State | Published - Oct 22 2004 |
Keywords
- Amino acid composition
- Protein classification
- Sequence complexity
- Support vector machine
- Unstructured protein
ASJC Scopus subject areas
- Biophysics
- Structural Biology
- Biochemistry
- Molecular Biology
- Genetics
- Cell Biology