SVSI: Fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits

Wenjian Bi, Guolian Kang, Yanlong Zhao, Yuehua Cui, Song Yan, Yun Li, Cheng Cheng, Stanley B. Pounds, Michael J Borowitz, Mary V. Relling, Jun J. Yang, Zhifa Liu, Ching Hon Pui, Stephen P. Hunger, Christine M. Hartford, Wing Leung, Ji Feng Zhang

Research output: Contribution to journal › Article

Abstract

In genetic association studies of an ordered categorical phenotype, it is common either to regroup the phenotype's categories into two and then apply logistic regression (LG), or to apply ordered logistic (oLG) or ordered probit (oPRB) regression, which account for the ordinal nature of the phenotype. However, these methods may lose statistical power or fail to control the type I error rate, because of their model assumptions and/or unstable parameter estimation algorithms, when the genetic variant is rare or the sample size is limited. To solve this problem, we propose a set-valued (SV) system model to identify genetic variants associated with an ordinal categorical phenotype. We couple this model with an SV system identification algorithm to identify all the key system parameters. Simulations and two real data analyses show that SV and LG accurately controlled the type I error rate even at a significance level of 10⁻⁶, whereas oLG and oPRB did not in some cases. LG had significantly less power than the other three methods because it disregards the ordinal nature of the phenotype, and SV had similar or greater power than oLG and oPRB. We argue that SV should be employed in genetic association studies of ordered categorical phenotypes.
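The contrast between dichotomizing the phenotype and modelling it as ordinal can be illustrated with a small sketch. It assumes a generic latent-variable threshold model with standard normal noise (the ordered probit formulation), not the authors' SV identification algorithm; the cut points, the effect size `beta`, and the function names are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def category_probs(linear_predictor, thresholds):
    """P(category k) under a latent threshold model with N(0, 1) noise.

    `thresholds` must be increasing; m thresholds give m + 1 ordered categories.
    """
    std_normal = NormalDist()
    # Cumulative probabilities P(latent <= c_k), padded with 0 and 1.
    cum = [0.0] + [std_normal.cdf(c - linear_predictor) for c in thresholds] + [1.0]
    return [cum[k + 1] - cum[k] for k in range(len(cum) - 1)]


def dichotomize(probs, cut):
    """Collapse ordered categories into two groups: below `cut` and at/above it."""
    return [sum(probs[:cut]), sum(probs[cut:])]


thresholds = [-0.5, 0.5, 1.5]  # hypothetical cut points -> 4 ordered categories
beta = 0.4                     # hypothetical per-allele effect on the latent scale

non_carrier = category_probs(0.0, thresholds)
carrier = category_probs(beta * 1, thresholds)  # one copy of the rare allele

for label, probs in (("non-carrier", non_carrier), ("carrier", carrier)):
    print(label,
          [round(p, 3) for p in probs],
          "-> dichotomized:",
          [round(p, 3) for p in dichotomize(probs, 2)])
```

Under these assumptions the variant shifts probability mass across all four categories, while the dichotomized version keeps only the shift across a single cut point. That is one way to see why collapsing categories, as in the LG approach, can discard information about the ordering.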

Original language: English (US)
Pages (from-to): 294-309
Number of pages: 16
Journal: Annals of Human Genetics
Volume: 79
Issue number: 4
DOI: https://doi.org/10.1111/ahg.12117
State: Published - Jul 1 2015

Fingerprint

Phenotype
Logistic Models
Genetic Association Studies
Sample Size

Keywords

  • Genetic association study
  • Multiple thresholds
  • Ordered logistic model
  • Rare variants
  • Set-valued system identification

ASJC Scopus subject areas

  • Genetics (clinical)
  • Genetics

Cite this

SVSI: Fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits. / Bi, Wenjian; Kang, Guolian; Zhao, Yanlong; Cui, Yuehua; Yan, Song; Li, Yun; Cheng, Cheng; Pounds, Stanley B.; Borowitz, Michael J; Relling, Mary V.; Yang, Jun J.; Liu, Zhifa; Pui, Ching Hon; Hunger, Stephen P.; Hartford, Christine M.; Leung, Wing; Zhang, Ji Feng.

In: Annals of Human Genetics, Vol. 79, No. 4, 01.07.2015, p. 294-309.

Bi, W, Kang, G, Zhao, Y, Cui, Y, Yan, S, Li, Y, Cheng, C, Pounds, SB, Borowitz, MJ, Relling, MV, Yang, JJ, Liu, Z, Pui, CH, Hunger, SP, Hartford, CM, Leung, W & Zhang, JF 2015, 'SVSI: Fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits', Annals of Human Genetics, vol. 79, no. 4, pp. 294-309. https://doi.org/10.1111/ahg.12117
@article{243985b89f2b462cbae67b98d0cdab5b,
title = "SVSI: Fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits",
abstract = "In genetic association studies of an ordered categorical phenotype, it is common either to regroup the phenotype's categories into two and then apply logistic regression (LG), or to apply ordered logistic (oLG) or ordered probit (oPRB) regression, which account for the ordinal nature of the phenotype. However, these methods may lose statistical power or fail to control the type I error rate, because of their model assumptions and/or unstable parameter estimation algorithms, when the genetic variant is rare or the sample size is limited. To solve this problem, we propose a set-valued (SV) system model to identify genetic variants associated with an ordinal categorical phenotype. We couple this model with an SV system identification algorithm to identify all the key system parameters. Simulations and two real data analyses show that SV and LG accurately controlled the type I error rate even at a significance level of $10^{-6}$, whereas oLG and oPRB did not in some cases. LG had significantly less power than the other three methods because it disregards the ordinal nature of the phenotype, and SV had similar or greater power than oLG and oPRB. We argue that SV should be employed in genetic association studies of ordered categorical phenotypes.",
keywords = "Genetic association study, Multiple thresholds, Ordered logistic model, Rare variants, Set-valued system identification",
author = "Wenjian Bi and Guolian Kang and Yanlong Zhao and Yuehua Cui and Song Yan and Yun Li and Cheng Cheng and Pounds, {Stanley B.} and Borowitz, {Michael J} and Relling, {Mary V.} and Yang, {Jun J.} and Zhifa Liu and Pui, {Ching Hon} and Hunger, {Stephen P.} and Hartford, {Christine M.} and Wing Leung and Zhang, {Ji Feng}",
year = "2015",
month = "7",
day = "1",
doi = "10.1111/ahg.12117",
language = "English (US)",
volume = "79",
pages = "294--309",
journal = "Annals of Human Genetics",
issn = "0003-4800",
publisher = "Wiley-Blackwell",
number = "4",

}

TY  - JOUR
T1  - SVSI
T2  - Fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits
AU  - Bi, Wenjian
AU  - Kang, Guolian
AU  - Zhao, Yanlong
AU  - Cui, Yuehua
AU  - Yan, Song
AU  - Li, Yun
AU  - Cheng, Cheng
AU  - Pounds, Stanley B.
AU  - Borowitz, Michael J
AU  - Relling, Mary V.
AU  - Yang, Jun J.
AU  - Liu, Zhifa
AU  - Pui, Ching Hon
AU  - Hunger, Stephen P.
AU  - Hartford, Christine M.
AU  - Leung, Wing
AU  - Zhang, Ji Feng
PY  - 2015/7/1
Y1  - 2015/7/1
AB  - In genetic association studies of an ordered categorical phenotype, it is common either to regroup the phenotype's categories into two and then apply logistic regression (LG), or to apply ordered logistic (oLG) or ordered probit (oPRB) regression, which account for the ordinal nature of the phenotype. However, these methods may lose statistical power or fail to control the type I error rate, because of their model assumptions and/or unstable parameter estimation algorithms, when the genetic variant is rare or the sample size is limited. To solve this problem, we propose a set-valued (SV) system model to identify genetic variants associated with an ordinal categorical phenotype. We couple this model with an SV system identification algorithm to identify all the key system parameters. Simulations and two real data analyses show that SV and LG accurately controlled the type I error rate even at a significance level of 10⁻⁶, whereas oLG and oPRB did not in some cases. LG had significantly less power than the other three methods because it disregards the ordinal nature of the phenotype, and SV had similar or greater power than oLG and oPRB. We argue that SV should be employed in genetic association studies of ordered categorical phenotypes.
KW  - Genetic association study
KW  - Multiple thresholds
KW  - Ordered logistic model
KW  - Rare variants
KW  - Set-valued system identification
UR  - http://www.scopus.com/inward/record.url?scp=84931574679&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84931574679&partnerID=8YFLogxK
U2  - 10.1111/ahg.12117
DO  - 10.1111/ahg.12117
M3  - Article
C2  - 25959545
AN  - SCOPUS:84931574679
VL  - 79
SP  - 294
EP  - 309
JO  - Annals of Human Genetics
JF  - Annals of Human Genetics
SN  - 0003-4800
IS  - 4
ER  - 