Reduced amino acid alphabet is sufficient to accurately recognize intrinsically disordered protein

Edward A. Weathers, Michael E. Paulaitis, Thomas B. Woolf, Jan H. Hoh

Research output: Contribution to journal › Article › peer-review

Abstract

Intrinsically disordered proteins are an important class of proteins with unique functions and properties. Here, we have applied a support vector machine (SVM) trained on naturally occurring disordered and ordered proteins to examine the contribution of various parameters (vectors) to recognizing proteins that contain disordered regions. We find that an SVM that incorporates only amino acid composition has a recognition accuracy of 87±2%. This result suggests that composition alone is sufficient to accurately recognize disorder. Interestingly, SVMs using reduced sets of amino acids based on chemical similarity preserve high recognition accuracy. A set as small as four retains an accuracy of 84±2%; this suggests that general physicochemical properties rather than specific amino acids are important factors contributing to protein disorder.
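
As an illustration of the kind of feature vectors the abstract describes, the following is a minimal sketch of computing a 20-letter amino acid composition and collapsing it into a four-group reduced alphabet. The grouping in REDUCED_GROUPS and the function names are hypothetical choices made for this example; the abstract does not specify which groupings the authors used.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical four-group reduction by rough chemical similarity.
# This is an illustrative assumption, not the grouping from the paper.
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQGP",
    "charged": "DEKR",
}


def composition(sequence):
    """Return the fraction of each standard amino acid, in AMINO_ACIDS order."""
    seq = sequence.upper()
    # Ignore any non-standard or ambiguous residue codes such as X or B.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def reduced_composition(sequence, groups=REDUCED_GROUPS):
    """Return one summed fraction per group, in the order of the groups dict."""
    full = dict(zip(AMINO_ACIDS, composition(sequence)))
    return [sum(full[aa] for aa in members) for members in groups.values()]
```

Vectors like these could then be passed to any SVM implementation (for example, scikit-learn's SVC) together with ordered/disordered labels; the choice of kernel, training data, and validation procedure are left open here.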

Original language: English (US)
Pages (from-to): 348-352
Number of pages: 5
Journal: FEBS Letters
Volume: 576
Issue number: 3
State: Published - Oct 22 2004

Keywords

  • Amino acid composition
  • Protein classification
  • Sequence complexity
  • Support vector machine
  • Unstructured protein

ASJC Scopus subject areas

  • Biophysics
  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Genetics
  • Cell Biology
