Systematic Computational Identification of Variants That Activate Exonic and Intronic Cryptic Splice Sites

Melissa Lee, Patrick Roos, Neeraj Sharma, Melis Atalar, Taylor A. Evans, Matthew J. Pellicore, Emily Davis, Anh Thu N. Lam, Susan E. Stanley, Sara E. Khalil, George M. Solomon, Doug Walker, Karen S. Raraigh, Briana Vecchio-Pagan, Mary Armanios, Garry R. Cutting

Research output: Contribution to journal › Article › peer-review

Abstract

We developed a variant-annotation method that combines sequence-based machine-learning classification with a context-dependent algorithm for selecting splice variants. Our approach is distinctive in that it compares the splice potential of a sequence bearing a variant with the splice potential of the reference sequence. After training, classification accurately identified 168 of 180 (93.3%) canonical splice sites of five genes. The combined method, CryptSplice, identified and correctly predicted the effect of 18 of 21 (86%) known splice-altering variants in CFTR, a well-studied gene whose loss-of-function variants cause cystic fibrosis (CF). Among 1,423 unannotated CFTR disease-associated variants, the method identified 32 potential exonic cryptic splice variants, two of which were experimentally evaluated and confirmed. After complete CFTR sequencing, the method found three cryptic intronic splice variants (one known and two experimentally verified) that completed the molecular diagnosis of CF in 6 of 14 individuals. CryptSplice interrogation of sequence data from six individuals with X-linked dyskeratosis congenita caused by an unknown disease-causing variant in DKC1 identified two splice-altering variants that were experimentally verified. To assess the extent to which disease-associated variants might activate cryptic splicing, we selected 458 pathogenic variants and 348 variants of uncertain significance (VUSs) classified as high confidence from ClinVar. Splice-site activation was predicted for 129 (28%) of the pathogenic variants and 75 (22%) of the VUSs. Our findings suggest that cryptic splice-site activation is more common than previously thought and should be routinely considered for all variants within the transcribed regions of genes.
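The central idea in the abstract, scoring a variant-bearing sequence against the reference sequence rather than scoring the variant sequence alone, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the consensus-matching `donor_score` function is a toy stand-in for a trained classifier, and the names `apply_snv` and `gained_sites` and the thresholds `min_score` and `min_gain` are hypothetical. None of it is taken from CryptSplice itself.

```python
# Illustrative only: a toy donor-site scorer, NOT the CryptSplice classifier.
DONOR_CONSENSUS = "CAGGTAAGT"  # 3 exonic + 6 intronic bases (assumed motif)


def donor_score(window: str) -> float:
    """Fraction of bases matching the consensus; 0.0 if the GT dinucleotide is absent."""
    if len(window) != len(DONOR_CONSENSUS) or window[3:5] != "GT":
        return 0.0
    matches = sum(a == b for a, b in zip(window, DONOR_CONSENSUS))
    return matches / len(DONOR_CONSENSUS)


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return seq with the single base at 0-based index pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def gained_sites(ref: str, pos: int, alt: str,
                 min_score: float = 0.7, min_gain: float = 0.2):
    """List (window_start, ref_score, variant_score) for windows overlapping
    the variant where the variant scores well AND clearly better than the reference."""
    var = apply_snv(ref, pos, alt)
    k = len(DONOR_CONSENSUS)
    hits = []
    # Only windows that contain the substituted base can change score.
    for start in range(max(0, pos - k + 1), min(pos, len(ref) - k) + 1):
        r = donor_score(ref[start:start + k])
        v = donor_score(var[start:start + k])
        if v >= min_score and v - r >= min_gain:
            hits.append((start, r, v))
    return hits


# A C>T substitution at index 7 creates a GT dinucleotide inside a donor-like motif.
print(gained_sites("AAACAGGCAAGTAAA", 7, "T"))  # [(3, 0.0, 1.0)]
```

Requiring both an absolute score and a gain over the reference is what separates a newly activated site from a site that already scores highly without the variant.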

Original language: English (US)
Pages (from-to): 751-765
Number of pages: 15
Journal: American Journal of Human Genetics
Volume: 100
Issue number: 5
State: Published - May 4, 2017

Keywords

  • cryptic splicing
  • cystic fibrosis
  • machine learning
  • minigene
  • pseudoexon
  • splice acceptor
  • splice donor
  • splice variant
  • splicing

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
