Filtering genetic variants and placing informative priors based on putative biological function

Stefanie Friedrichs, Dörthe Malzahn, Elizabeth Pugh, Marcio Almeida, Xiao Qing Liu, Julia N. Bailey

Research output: Contribution to journal (Article)

Abstract

High-density genetic marker data, especially sequence data, impose an immense multiple-testing burden. This burden can be reduced by filtering genetic variants, exploiting or accounting for correlations between variants, jointly testing variants, and incorporating informative priors. Priors can be based on biological knowledge or predicted variant function, or can be used to integrate gene expression or other omics data. Based on Genetic Analysis Workshop (GAW) 19 data, this article discusses the diversity and usefulness of functional variant scores provided, for example, by PolyPhen2, SIFT, or RegulomeDB annotations. Incorporating functional scores into variant filters or weights and adjusting the significance level for correlations between variants yielded significant associations with blood pressure traits in a large family study of Mexican Americans (GAW19 data set). Marker rs218966 in gene PHF14 and rs9836027 in MAP4 were significantly associated with hypertension; additionally, rare variants in SNUPN were significantly associated with systolic blood pressure. Variant weights strongly influenced the power of kernel methods and burden tests. Apart from variant weights in test statistics, prior weights may also be used when combining test statistics or to informatively weight p values while controlling the false discovery rate (FDR). Indeed, power improved when gene expression data were used for FDR-controlled informative weighting of gene-level association test p values. Finally, approaches exploiting variant correlations included identity-by-descent mapping and joint testing of rare and common variants, for which the optimal strategy was observed to depend on linkage disequilibrium structure.
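
The abstract mentions informatively weighting p values while controlling the FDR but does not say which procedure was used. As a rough illustration only, the sketch below assumes a generic weighted Benjamini-Hochberg procedure: each p value is divided by a prior weight (weights rescaled to average 1), and the usual step-up rule is applied to the weighted p values. The function name `weighted_bh`, the example p values, and the example weights are hypothetical and are not taken from the article.

```python
import numpy as np


def weighted_bh(pvalues, weights, alpha=0.05):
    """Return a boolean rejection flag per hypothesis (weighted BH sketch).

    Assumption: prior weights are strictly positive; they are rescaled
    here so that their mean is 1, as weighted FDR procedures require.
    """
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.ndim != 1 or p.size == 0 or p.shape != w.shape:
        raise ValueError("pvalues and weights must be non-empty 1-D arrays of equal length")
    if np.any(w <= 0):
        raise ValueError("weights must be strictly positive")

    w = w / w.mean()               # rescale weights to average 1
    q = np.minimum(p / w, 1.0)     # weighted p values, capped at 1

    m = q.size
    order = np.argsort(q)          # indices sorting weighted p values ascending
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = q[order] <= thresholds

    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()   # largest rank meeting its threshold
        reject[order[: k + 1]] = True     # step-up: reject everything up to that rank
    return reject


# Hypothetical gene-level p values and prior weights (e.g. from expression data).
p = [0.004, 0.03, 0.2, 0.6]
print(weighted_bh(p, [1, 1, 1, 1]))      # equal weights reduce to ordinary BH
print(weighted_bh(p, [1, 3, 0.5, 0.5]))  # up-weighting the second gene
```

With equal weights this reduces to the ordinary Benjamini-Hochberg procedure. Up-weighting a hypothesis makes it easier to reject at the cost of the down-weighted ones, which is why the quality of the prior information matters for power.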

Original language: English (US)
Article number: 8
Journal: BMC Genetics
Volume: 17
Issue number: 2
DOI: https://doi.org/10.1186/s12863-015-0313-x
State: Published - Feb 3 2016

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Friedrichs, S., Malzahn, D., Pugh, E., Almeida, M., Liu, X. Q., & Bailey, J. N. (2016). Filtering genetic variants and placing informative priors based on putative biological function. BMC Genetics, 17(2), [8]. https://doi.org/10.1186/s12863-015-0313-x

@article{7c4b9e248c8745b795c4335c0b7c5305,
title = "Filtering genetic variants and placing informative priors based on putative biological function",
author = "Stefanie Friedrichs and D{\"o}rthe Malzahn and Elizabeth Pugh and Marcio Almeida and Liu, {Xiao Qing} and Bailey, {Julia N.}",
year = "2016",
month = "2",
day = "3",
doi = "10.1186/s12863-015-0313-x",
language = "English (US)",
volume = "17",
journal = "BMC Genetics",
issn = "1471-2156",
publisher = "BioMed Central",
number = "2",

}

TY - JOUR

T1 - Filtering genetic variants and placing informative priors based on putative biological function

AU - Friedrichs, Stefanie

AU - Malzahn, Dörthe

AU - Pugh, Elizabeth

AU - Almeida, Marcio

AU - Liu, Xiao Qing

AU - Bailey, Julia N.

PY - 2016/2/3

Y1 - 2016/2/3

UR - http://www.scopus.com/inward/record.url?scp=84956651957&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84956651957&partnerID=8YFLogxK

U2 - 10.1186/s12863-015-0313-x

DO - 10.1186/s12863-015-0313-x

M3 - Article

C2 - 26866982

AN - SCOPUS:84956651957

VL - 17

JO - BMC Genetics

JF - BMC Genetics

SN - 1471-2156

IS - 2

M1 - 8

ER -