TY - JOUR
T1 - High-throughput prediction of MHC Class I and Class II neoantigens with MHCnuggets
AU - Shao, X. M.
AU - Bhattacharya, R.
AU - Huang, J.
AU - Sivakumar, I. K. A.
AU - Tokheim, C.
AU - Zheng, L.
AU - Hirsch, D.
AU - Kaminow, B.
AU - Omdahl, A.
AU - Bonsack, M.
AU - Riemer, A. B.
AU - Velculescu, V. E.
AU - Anagnostou, V.
AU - Pagel, K. A.
AU - Karchin, R.
N1 - Publisher Copyright:
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
PY - 2019/8/31
Y1 - 2019/8/31
N2 - Computational prediction of binding between neoantigen peptides and major histocompatibility complex (MHC) proteins is an emerging biomarker for predicting patient response to cancer immunotherapy. Current neoantigen predictors focus on in silico estimation of MHC binding affinity and are limited by low positive predictive value for actual peptide presentation, inadequate support for rare MHC alleles and poor scalability to high-throughput data sets. To address these limitations, we developed MHCnuggets, a deep neural network method to predict peptide-MHC binding. MHCnuggets is the only method to handle binding prediction for common or rare alleles of MHC Class I or II, with a single neural network architecture. Using a long short-term memory network (LSTM), MHCnuggets accepts peptides of variable length and is capable of faster performance than other methods. When compared to methods that integrate binding affinity and HLAp data from mass spectrometry, MHCnuggets yields a fourfold increase in positive predictive value on independent MHC-bound peptide (HLAp) data. We applied MHCnuggets to 26 cancer types in TCGA, processing 26.3 million allele-peptide comparisons in under 2.3 hours, yielding 101,326 unique candidate immunogenic missense mutations (IMMs). Predicted-IMM hotspots occurred in 38 genes, including 24 driver genes. Predicted-IMM load was significantly associated with increased immune cell infiltration (p<2e-16) including CD8+ T cells. Notably, only 0.16% of predicted immunogenic missense mutations were observed in >2 patients, with 61.7% of these derived from driver mutations. Our results provide a new method for neoantigen prediction with high performance characteristics and demonstrate its utility in large data sets across human cancers.
UR - http://www.scopus.com/inward/record.url?scp=85095663266&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095663266&partnerID=8YFLogxK
U2 - 10.1101/752469
DO - 10.1101/752469
M3 - Article
AN - SCOPUS:85095663266
JO - bioRxiv
JF - bioRxiv
ER -