Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed

Stephanie Hicks; David A. Wheeler; Sharon E. Plon; Marek Kimmel

doi:10.1002/humu.21490

Research output: Contribution to journal › Article › peer-review

156 Scopus citations

Abstract

Multiple algorithms are used to predict the impact of missense mutations on protein structure and function using algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy with native alignment of SIFT, Align-GVGD, PolyPhen-2, and Xvar when generating functionality predictions of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated the impact of the alignment employed on predictions from these algorithms (except Xvar) when supplied the same four alignments including alignments automatically generated by (1) SIFT, (2) PolyPhen-2, (3) UniProt, and (4) a manually curated alignment tuned for Align-GVGD. Alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves employing the native alignment for each algorithm result in area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants neutral when provided alignments with a large number of sequences. Of note, algorithms make different predictions of variants even when provided the same alignment and do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and sequence alignment employed in missense prediction.

Original language	English (US)
Pages (from-to)	661-668
Number of pages	8
Journal	Human Mutation
Volume	32
Issue number	6
DOIs	https://doi.org/10.1002/humu.21490
State	Published - Jun 2011
Externally published	Yes

Keywords

Align-GVGD
BRCA1
MLH1
MSH2
Multiple sequence alignment
PolyPhen-2
SIFT
TP53
Xvar

ASJC Scopus subject areas

Genetics
Genetics (clinical)

Cite this

@article{5acc484fc5ab42e18e2f063fdcc99e8e,

title = "Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed",

abstract = "Multiple algorithms are used to predict the impact of missense mutations on protein structure and function using algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy with native alignment of SIFT, Align-GVGD, PolyPhen-2, and Xvar when generating functionality predictions of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated the impact of the alignment employed on predictions from these algorithms (except Xvar) when supplied the same four alignments including alignments automatically generated by (1) SIFT, (2) PolyPhen-2, (3) UniProt, and (4) a manually curated alignment tuned for Align-GVGD. Alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves employing the native alignment for each algorithm result in area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants neutral when provided alignments with a large number of sequences. Of note, algorithms make different predictions of variants even when provided the same alignment and do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and sequence alignment employed in missense prediction.",

keywords = "Align-GVGD, BRCA1, MLH1, MSH2, Multiple sequence alignment, PolyPhen-2, SIFT, TP53, Xvar",

author = "Stephanie Hicks and Wheeler, {David A.} and Plon, {Sharon E.} and Marek Kimmel",

year = "2011",

month = jun,

doi = "10.1002/humu.21490",

language = "English (US)",

volume = "32",

pages = "661--668",

journal = "Human Mutation",

issn = "1059-7794",

publisher = "Wiley-Liss Inc.",

number = "6",

}

TY - JOUR

T1 - Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed

AU - Hicks, Stephanie

AU - Wheeler, David A.

AU - Plon, Sharon E.

AU - Kimmel, Marek

PY - 2011/6

Y1 - 2011/6

AB - Multiple algorithms are used to predict the impact of missense mutations on protein structure and function using algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy with native alignment of SIFT, Align-GVGD, PolyPhen-2, and Xvar when generating functionality predictions of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated the impact of the alignment employed on predictions from these algorithms (except Xvar) when supplied the same four alignments including alignments automatically generated by (1) SIFT, (2) PolyPhen-2, (3) UniProt, and (4) a manually curated alignment tuned for Align-GVGD. Alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves employing the native alignment for each algorithm result in area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants neutral when provided alignments with a large number of sequences. Of note, algorithms make different predictions of variants even when provided the same alignment and do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and sequence alignment employed in missense prediction.

KW - Align-GVGD

KW - BRCA1

KW - MLH1

KW - MSH2

KW - Multiple sequence alignment

KW - PolyPhen-2

KW - SIFT

KW - TP53

KW - Xvar

UR - http://www.scopus.com/inward/record.url?scp=79957621519&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79957621519&partnerID=8YFLogxK

U2 - 10.1002/humu.21490

DO - 10.1002/humu.21490

M3 - Article

C2 - 21480434

AN - SCOPUS:79957621519

SN - 1059-7794

VL - 32

SP - 661

EP - 668

JO - Human Mutation

JF - Human Mutation

IS - 6

ER -
