Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek; Konrad J. Karczewski; Eric V. Minikel; Kaitlin E. Samocha; Eric Banks; Timothy Fennell; Anne H. O'Donnell-Luria; James S. Ware; Andrew J. Hill; Beryl B. Cummings; Taru Tukiainen; Daniel P. Birnbaum; Jack A. Kosmicki; Laramie E. Duncan; Karol Estrada; Fengmei Zhao; James Zou; Emma Pierce-Hoffman; Joanne Berghout; David N. Cooper; Nicole Deflaux; Mark DePristo; Ron Do; Jason Flannick; Menachem Fromer; Laura Gauthier; Jackie Goldstein; Namrata Gupta; Daniel Howrigan; Adam Kiezun; Mitja I. Kurki; Ami Levy Moonshine; Pradeep Natarajan; Lorena Orozco; Gina M. Peloso; Ryan Poplin; Manuel A. Rivas; Valentin Ruano-Rubio; Samuel A. Rose; Douglas M. Ruderfer; Khalid Shakir; Peter D. Stenson; Christine Stevens; Brett P. Thomas; Grace Tiao; Maria T. Tusie-Luna; Ben Weisburd; Hong Hee Won; Dongmei Yu; David M. Altshuler; Diego Ardissino; Michael Boehnke; John Danesh; Stacey Donnelly; Roberto Elosua; Jose C. Florez; Stacey B. Gabriel; Gad Getz; Stephen J. Glatt; Christina M. Hultman; Sekar Kathiresan; Markku Laakso; Steven McCarroll; Mark I. McCarthy; Dermot McGovern; Ruth McPherson; Benjamin M. Neale; Aarno Palotie; Shaun M. Purcell; Danish Saleheen; Jeremiah M. Scharf; Pamela Sklar; Patrick F. Sullivan; Jaakko Tuomilehto; Ming T. Tsuang; Hugh C. Watkins; James G. Wilson; Mark J. Daly; Daniel G. MacArthur; H. E. Abboud; G. Abecasis; C. A. Aguilar-Salinas; O. Arellano-Campos; G. Atzmon; I. Aukrust; C. L. Barr; G. I. Bell; S. Bergen; L. Bjørkhaug; J. Blangero; D. W. Bowden; C. L. Budman; N. P. Burtt; F. Centeno-Cruz; J. C. Chambers; K. Chambert; R. Clarke; R. Collins; G. Coppola; E. J. Córdova; M. L. Cortes; N. J. Cox; R. Duggirala; M. Farrall; J. C. Fernandez-Lopez; P. Fontanillas; T. M. Frayling; N. B. Freimer; C. Fuchsberger; H. García-Ortiz; A. Goel; M. J. Gómez-Vázquez; M. E. González-Villalpando; C. González-Villalpando; M. A. Grados; L. Groop; C. A. Haiman; C. L. Hanis; A. T. Hattersley; B. E. Henderson; J. C. Hopewell; A. Huerta-Chagoya; S. Islas-Andrade; S. B. Jacobs; S. Jalilzadeh; C. P. Jenkinson; J. Moran; S. Jiménez-Morale; A. Kähler; R. A. King; G. Kirov; J. S. Kooner; T. Kyriakou; J. Y. Lee; D. M. Lehman; G. Lyon; W. MacMahon; P. K. Magnusson; A. Mahajan; J. Marrugat; A. Martínez-Hernández; C. A. Mathews; G. McVean; J. B. Meigs; T. Meitinger; E. Mendoza-Caamal; J. M. Mercader; K. L. Mohlke; H. Moreno-Macías; A. P. Morris; L. A. Najmi; P. R. Njølstad; M. C. O'Donovan; M. L. Ordóñez-Sánchez; M. J. Owen; T. Park; D. L. Pauls; D. Posthuma; C. Revilla-Monsalve; L. Riba; S. Ripke; R. Rodríguez-Guillén; M. Rodríguez-Torres; P. Sandor; M. Seielstad; R. Sladek; X. Soberón; T. D. Spector; S. E. Tai; T. M. Teslovich; G. Walford; L. R. Wilkens; A. L. Williams

doi:10.1038/nature19057

School of Medicine

Research output: Contribution to journal › Article › peer-review

5439 Scopus citations

Abstract

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

Original language	English (US)
Pages (from-to)	285-291
Number of pages	7
Journal	Nature
Volume	536
Issue number	7616
DOIs	https://doi.org/10.1038/nature19057
State	Published - Aug 17 2016

ASJC Scopus subject areas

General

Access to Document

10.1038/nature19057

Cite this

Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., O'Donnell-Luria, A. H., Ware, J. S., Hill, A. J., Cummings, B. B., Tukiainen, T., Birnbaum, D. P., Kosmicki, J. A., Duncan, L. E., Estrada, K., Zhao, F., Zou, J., Pierce-Hoffman, E., Berghout, J., ... Williams, A. L. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536(7616), 285-291. https://doi.org/10.1038/nature19057

Lek, M, Karczewski, KJ, Minikel, EV, Samocha, KE, Banks, E, Fennell, T, O'Donnell-Luria, AH, Ware, JS, Hill, AJ, Cummings, BB, Tukiainen, T, Birnbaum, DP, Kosmicki, JA, Duncan, LE, Estrada, K, Zhao, F, Zou, J, Pierce-Hoffman, E, Berghout, J, Cooper, DN, Deflaux, N, DePristo, M, Do, R, Flannick, J, Fromer, M, Gauthier, L, Goldstein, J, Gupta, N, Howrigan, D, Kiezun, A, Kurki, MI, Moonshine, AL, Natarajan, P, Orozco, L, Peloso, GM, Poplin, R, Rivas, MA, Ruano-Rubio, V, Rose, SA, Ruderfer, DM, Shakir, K, Stenson, PD, Stevens, C, Thomas, BP, Tiao, G, Tusie-Luna, MT, Weisburd, B, Won, HH, Yu, D, Altshuler, DM, Ardissino, D, Boehnke, M, Danesh, J, Donnelly, S, Elosua, R, Florez, JC, Gabriel, SB, Getz, G, Glatt, SJ, Hultman, CM, Kathiresan, S, Laakso, M, McCarroll, S, McCarthy, MI, McGovern, D, McPherson, R, Neale, BM, Palotie, A, Purcell, SM, Saleheen, D, Scharf, JM, Sklar, P, Sullivan, PF, Tuomilehto, J, Tsuang, MT, Watkins, HC, Wilson, JG, Daly, MJ, MacArthur, DG, Abboud, HE, Abecasis, G, Aguilar-Salinas, CA, Arellano-Campos, O, Atzmon, G, Aukrust, I, Barr, CL, Bell, GI, Bergen, S, Bjørkhaug, L, Blangero, J, Bowden, DW, Budman, CL, Burtt, NP, Centeno-Cruz, F, Chambers, JC, Chambert, K, Clarke, R, Collins, R, Coppola, G, Córdova, EJ, Cortes, ML, Cox, NJ, Duggirala, R, Farrall, M, Fernandez-Lopez, JC, Fontanillas, P, Frayling, TM, Freimer, NB, Fuchsberger, C, García-Ortiz, H, Goel, A, Gómez-Vázquez, MJ, González-Villalpando, ME, González-Villalpando, C, Grados, MA, Groop, L, Haiman, CA, Hanis, CL, Hattersley, AT, Henderson, BE, Hopewell, JC, Huerta-Chagoya, A, Islas-Andrade, S, Jacobs, SB, Jalilzadeh, S, Jenkinson, CP, Moran, J, Jiménez-Morale, S, Kähler, A, King, RA, Kirov, G, Kooner, JS, Kyriakou, T, Lee, JY, Lehman, DM, Lyon, G, MacMahon, W, Magnusson, PK, Mahajan, A, Marrugat, J, Martínez-Hernández, A, Mathews, CA, McVean, G, Meigs, JB, Meitinger, T, Mendoza-Caamal, E, Mercader, JM, Mohlke, KL, Moreno-Macías, H, Morris, AP, Najmi, LA, Njølstad, PR, O'Donovan, MC, Ordóñez-Sánchez, ML, Owen, MJ, Park, T, Pauls, DL, Posthuma, D, Revilla-Monsalve, C, Riba, L, Ripke, S, Rodríguez-Guillén, R, Rodríguez-Torres, M, Sandor, P, Seielstad, M, Sladek, R, Soberón, X, Spector, TD, Tai, SE, Teslovich, TM, Walford, G, Wilkens, LR & Williams, AL 2016, 'Analysis of protein-coding genetic variation in 60,706 humans', Nature, vol. 536, no. 7616, pp. 285-291. https://doi.org/10.1038/nature19057

@article{85edd7831b054ab397cf46ec550b3d17,

title = "Analysis of protein-coding genetic variation in 60,706 humans",

abstract = "Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.",

author = "Monkol Lek and Karczewski, {Konrad J.} and Minikel, {Eric V.} and Samocha, {Kaitlin E.} and Eric Banks and Timothy Fennell and O'Donnell-Luria, {Anne H.} and Ware, {James S.} and Hill, {Andrew J.} and Cummings, {Beryl B.} and Taru Tukiainen and Birnbaum, {Daniel P.} and Kosmicki, {Jack A.} and Duncan, {Laramie E.} and Karol Estrada and Fengmei Zhao and James Zou and Emma Pierce-Hoffman and Joanne Berghout and Cooper, {David N.} and Nicole Deflaux and Mark DePristo and Ron Do and Jason Flannick and Menachem Fromer and Laura Gauthier and Jackie Goldstein and Namrata Gupta and Daniel Howrigan and Adam Kiezun and Kurki, {Mitja I.} and Moonshine, {Ami Levy} and Pradeep Natarajan and Lorena Orozco and Peloso, {Gina M.} and Ryan Poplin and Rivas, {Manuel A.} and Valentin Ruano-Rubio and Rose, {Samuel A.} and Ruderfer, {Douglas M.} and Khalid Shakir and Stenson, {Peter D.} and Christine Stevens and Thomas, {Brett P.} and Grace Tiao and Tusie-Luna, {Maria T.} and Ben Weisburd and Won, {Hong Hee} and Dongmei Yu and Altshuler, {David M.} and Diego Ardissino and Michael Boehnke and John Danesh and Stacey Donnelly and Roberto Elosua and Florez, {Jose C.} and Gabriel, {Stacey B.} and Gad Getz and Glatt, {Stephen J.} and Hultman, {Christina M.} and Sekar Kathiresan and Markku Laakso and Steven McCarroll and McCarthy, {Mark I.} and Dermot McGovern and Ruth McPherson and Neale, {Benjamin M.} and Aarno Palotie and Purcell, {Shaun M.} and Danish Saleheen and Scharf, {Jeremiah M.} and Pamela Sklar and Sullivan, {Patrick F.} and Jaakko Tuomilehto and Tsuang, {Ming T.} and Watkins, {Hugh C.} and Wilson, {James G.} and Daly, {Mark J.} and MacArthur, {Daniel G.} and Abboud, {H. E.} and G. Abecasis and Aguilar-Salinas, {C. A.} and O. Arellano-Campos and G. Atzmon and I. Aukrust and Barr, {C. L.} and Bell, {G. I.} and S. Bergen and L. Bj{\o}rkhaug and J. Blangero and Bowden, {D. W.} and Budman, {C. L.} and Burtt, {N. P.} and F. Centeno-Cruz and Chambers, {J. C.} and K. 
Chambert and R. Clarke and R. Collins and G. Coppola and C{\'o}rdova, {E. J.} and Cortes, {M. L.} and Cox, {N. J.} and R. Duggirala and M. Farrall and Fernandez-Lopez, {J. C.} and P. Fontanillas and Frayling, {T. M.} and Freimer, {N. B.} and C. Fuchsberger and H. Garc{\'i}a-Ortiz and A. Goel and G{\'o}mez-V{\'a}zquez, {M. J.} and Gonz{\'a}lez-Villalpando, {M. E.} and C. Gonz{\'a}lez-Villalpando and Grados, {M. A.} and L. Groop and Haiman, {C. A.} and Hanis, {C. L.} and Hattersley, {A. T.} and Henderson, {B. E.} and Hopewell, {J. C.} and A. Huerta-Chagoya and S. Islas-Andrade and Jacobs, {S. B.} and S. Jalilzadeh and Jenkinson, {C. P.} and J. Moran and S. Jim{\'e}nez-Morale and A. K{\"a}hler and King, {R. A.} and G. Kirov and Kooner, {J. S.} and T. Kyriakou and Lee, {J. Y.} and Lehman, {D. M.} and G. Lyon and W. MacMahon and Magnusson, {P. K.} and A. Mahajan and J. Marrugat and A. Mart{\'i}nez-Hern{\'a}ndez and Mathews, {C. A.} and G. McVean and Meigs, {J. B.} and T. Meitinger and E. Mendoza-Caamal and Mercader, {J. M.} and Mohlke, {K. L.} and H. Moreno-Mac{\'i}as and Morris, {A. P.} and Najmi, {L. A.} and Nj{\o}lstad, {P. R.} and O'Donovan, {M. C.} and Ord{\'o}{\~n}ez-S{\'a}nchez, {M. L.} and Owen, {M. J.} and T. Park and Pauls, {D. L.} and D. Posthuma and C. Revilla-Monsalve and L. Riba and S. Ripke and R. Rodr{\'i}guez-Guill{\'e}n and M. Rodr{\'i}guez-Torres and P. Sandor and M. Seielstad and R. Sladek and X. Sober{\'o}n and Spector, {T. D.} and Tai, {S. E.} and Teslovich, {T. M.} and G. Walford and Wilkens, {L. R.} and Williams, {A. L.}",

year = "2016",

month = aug,

day = "17",

doi = "10.1038/nature19057",

language = "English (US)",

volume = "536",

pages = "285--291",

journal = "Nature",

issn = "0028-0836",

publisher = "Nature Publishing Group",

number = "7616",

}

TY - JOUR

T1 - Analysis of protein-coding genetic variation in 60,706 humans

AU - Lek, Monkol

AU - Karczewski, Konrad J.

AU - Minikel, Eric V.

AU - Samocha, Kaitlin E.

AU - Banks, Eric

AU - Fennell, Timothy

AU - O'Donnell-Luria, Anne H.

AU - Ware, James S.

AU - Hill, Andrew J.

AU - Cummings, Beryl B.

AU - Tukiainen, Taru

AU - Birnbaum, Daniel P.

AU - Kosmicki, Jack A.

AU - Duncan, Laramie E.

AU - Estrada, Karol

AU - Zhao, Fengmei

AU - Zou, James

AU - Pierce-Hoffman, Emma

AU - Berghout, Joanne

AU - Cooper, David N.

AU - Deflaux, Nicole

AU - DePristo, Mark

AU - Do, Ron

AU - Flannick, Jason

AU - Fromer, Menachem

AU - Gauthier, Laura

AU - Goldstein, Jackie

AU - Gupta, Namrata

AU - Howrigan, Daniel

AU - Kiezun, Adam

AU - Kurki, Mitja I.

AU - Moonshine, Ami Levy

AU - Natarajan, Pradeep

AU - Orozco, Lorena

AU - Peloso, Gina M.

AU - Poplin, Ryan

AU - Rivas, Manuel A.

AU - Ruano-Rubio, Valentin

AU - Rose, Samuel A.

AU - Ruderfer, Douglas M.

AU - Shakir, Khalid

AU - Stenson, Peter D.

AU - Stevens, Christine

AU - Thomas, Brett P.

AU - Tiao, Grace

AU - Tusie-Luna, Maria T.

AU - Weisburd, Ben

AU - Won, Hong Hee

AU - Yu, Dongmei

AU - Altshuler, David M.

AU - Ardissino, Diego

AU - Boehnke, Michael

AU - Danesh, John

AU - Donnelly, Stacey

AU - Elosua, Roberto

AU - Florez, Jose C.

AU - Gabriel, Stacey B.

AU - Getz, Gad

AU - Glatt, Stephen J.

AU - Hultman, Christina M.

AU - Kathiresan, Sekar

AU - Laakso, Markku

AU - McCarroll, Steven

AU - McCarthy, Mark I.

AU - McGovern, Dermot

AU - McPherson, Ruth

AU - Neale, Benjamin M.

AU - Palotie, Aarno

AU - Purcell, Shaun M.

AU - Saleheen, Danish

AU - Scharf, Jeremiah M.

AU - Sklar, Pamela

AU - Sullivan, Patrick F.

AU - Tuomilehto, Jaakko

AU - Tsuang, Ming T.

AU - Watkins, Hugh C.

AU - Wilson, James G.

AU - Daly, Mark J.

AU - MacArthur, Daniel G.

AU - Abboud, H. E.

AU - Abecasis, G.

AU - Aguilar-Salinas, C. A.

AU - Arellano-Campos, O.

AU - Atzmon, G.

AU - Aukrust, I.

AU - Barr, C. L.

AU - Bell, G. I.

AU - Bergen, S.

AU - Bjørkhaug, L.

AU - Blangero, J.

AU - Bowden, D. W.

AU - Budman, C. L.

AU - Burtt, N. P.

AU - Centeno-Cruz, F.

AU - Chambers, J. C.

AU - Chambert, K.

AU - Clarke, R.

AU - Collins, R.

AU - Coppola, G.

AU - Córdova, E. J.

AU - Cortes, M. L.

AU - Cox, N. J.

AU - Duggirala, R.

AU - Farrall, M.

AU - Fernandez-Lopez, J. C.

AU - Fontanillas, P.

AU - Frayling, T. M.

AU - Freimer, N. B.

AU - Fuchsberger, C.

AU - García-Ortiz, H.

AU - Goel, A.

AU - Gómez-Vázquez, M. J.

AU - González-Villalpando, M. E.

AU - González-Villalpando, C.

AU - Grados, M. A.

AU - Groop, L.

AU - Haiman, C. A.

AU - Hanis, C. L.

AU - Hattersley, A. T.

AU - Henderson, B. E.

AU - Hopewell, J. C.

AU - Huerta-Chagoya, A.

AU - Islas-Andrade, S.

AU - Jacobs, S. B.

AU - Jalilzadeh, S.

AU - Jenkinson, C. P.

AU - Moran, J.

AU - Jiménez-Morale, S.

AU - Kähler, A.

AU - King, R. A.

AU - Kirov, G.

AU - Kooner, J. S.

AU - Kyriakou, T.

AU - Lee, J. Y.

AU - Lehman, D. M.

AU - Lyon, G.

AU - MacMahon, W.

AU - Magnusson, P. K.

AU - Mahajan, A.

AU - Marrugat, J.

AU - Martínez-Hernández, A.

AU - Mathews, C. A.

AU - McVean, G.

AU - Meigs, J. B.

AU - Meitinger, T.

AU - Mendoza-Caamal, E.

AU - Mercader, J. M.

AU - Mohlke, K. L.

AU - Moreno-Macías, H.

AU - Morris, A. P.

AU - Najmi, L. A.

AU - Njølstad, P. R.

AU - O'Donovan, M. C.

AU - Ordóñez-Sánchez, M. L.

AU - Owen, M. J.

AU - Park, T.

AU - Pauls, D. L.

AU - Posthuma, D.

AU - Revilla-Monsalve, C.

AU - Riba, L.

AU - Ripke, S.

AU - Rodríguez-Guillén, R.

AU - Rodríguez-Torres, M.

AU - Sandor, P.

AU - Seielstad, M.

AU - Sladek, R.

AU - Soberón, X.

AU - Spector, T. D.

AU - Tai, S. E.

AU - Teslovich, T. M.

AU - Walford, G.

AU - Wilkens, L. R.

AU - Williams, A. L.

PY - 2016/8/17

Y1 - 2016/8/17

N2 - Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

AB - Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

UR - http://www.scopus.com/inward/record.url?scp=84982253941&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84982253941&partnerID=8YFLogxK

U2 - 10.1038/nature19057

DO - 10.1038/nature19057

M3 - Article

C2 - 27535533

AN - SCOPUS:84982253941

SN - 0028-0836

VL - 536

SP - 285

EP - 291

JO - Nature

JF - Nature

IS - 7616

ER -