TY - JOUR
T1 - Validation of new bioinformatic tools to identify expanded repeats
T2 - A non-reference intronic pentamer expansion in RFC1 causes CANVAS
AU - Rafehi, Haloom
AU - Szmulewicz, David J.
AU - Bennett, Mark F.
AU - Sobreira, Nara L.M.
AU - Pope, Kate
AU - Smith, Katherine R.
AU - Gillies, Greta
AU - Diakumis, Peter
AU - Dolzhenko, Egor
AU - Eberle, Michael A.
AU - Barcina, María García
AU - Breen, David P.
AU - Chancellor, Andrew M.
AU - Cremer, Phillip D.
AU - Delatycki, Martin B.
AU - Fogel, Brent L.
AU - Hackett, Anna
AU - Halmagyi, G. Michael
AU - Kapetanovic, Solange
AU - Lang, Anthony
AU - Mossman, Stuart
AU - Mu, Weiyi
AU - Patrikios, Peter
AU - Perlman, Susan L.
AU - Rosemargy, Ian
AU - Storey, Elsdon
AU - Watson, Shaun R.D.
AU - Wilson, Michael A.
AU - Zee, David
AU - Valle, David
AU - Amor, David J.
AU - Bahlo, Melanie
AU - Lockhart, Paul J.
N1 - Publisher Copyright:
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2019/4/4
Y1 - 2019/4/4
N2 - Genomic technologies such as Next Generation Sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS families and identified a core ancestral haplotype, estimated to have arisen in Europe more than 25,000 years ago. WGS of the four RFC1-negative CANVAS families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.
KW - ataxia
KW - CANVAS
KW - repeat expansions
KW - short tandem repeats
KW - whole genome sequencing
UR - http://www.scopus.com/inward/record.url?scp=85095638835&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095638835&partnerID=8YFLogxK
U2 - 10.1101/597781
DO - 10.1101/597781
M3 - Article
AN - SCOPUS:85095638835
JO - bioRxiv
JF - bioRxiv
ER -