Multilevel regularized regression for simultaneous taxa selection and network construction with metagenomic count data

Zhenqiu Liu, Fengzhu Sun, Jonathan Braun, Dermot P.B. McGovern, Steven Piantadosi

    Research output: Contribution to journal, Article, peer-review

    10 Scopus citations

    Abstract

    Motivation: Identifying disease-associated taxa and constructing networks of bacterial interactions are two important tasks that are usually studied separately. In reality, the differentiation of disease-associated taxa and the correlation among taxa may affect each other: one genus can appear differentiated because it is highly correlated with another highly differentiated genus. In addition, network structures may vary under different clinical conditions. Permutation tests are commonly used to detect differences between networks in distinct phenotypes, but they are time-consuming.

    Results: In this manuscript, we propose a multilevel regularized regression method to simultaneously identify taxa and construct networks. We also extend the framework to allow construction of a common network and a differentiated network together. An efficient algorithm with a dual formulation is developed to deal with the large-scale n ≪ m problem, with a large number of taxa (m) and a small number of samples (n). The proposed method is regularized with a general Lp (p ∈ [0,2]) penalty and models the effects of taxa abundance differentiation and correlation jointly. We demonstrate that it can identify both true and biologically significant genera and network structures.

    Original language: English (US)
    Pages (from-to): 1067-1074
    Number of pages: 8
    Journal: Bioinformatics
    Volume: 31
    Issue number: 7
    State: Published - Apr 1 2015

    ASJC Scopus subject areas

    • Statistics and Probability
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Computational Theory and Mathematics
    • Computational Mathematics
