Gradients do grow on trees: A linear-time O(N)-dimensional gradient for statistical phylogenetics

Xiang Ji, Zhenyu Zhang, Andrew Holbrook, Akihiko Nishimura, Guy Baele, Andrew Rambaut, Philippe Lemey, Marc A. Suchard

Research output: Contribution to journal › Article › peer-review

Abstract

Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N^2) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend toward even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make these analyses tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus, and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework.
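The linear-time idea in the abstract can be illustrated with a small two-pass sketch. It is not the authors' implementation. It assumes a rooted binary tree, a single alignment site, and the Jukes-Cantor (JC69) substitution model with uniform root frequencies, and every function and variable name is hypothetical. A post-order pass stores the likelihood of the data below each node, then a pre-order pass combines this with the data outside each subtree, so each branch is visited a constant number of times and all branch-length derivatives of the log-likelihood come out of one sweep.

```python
import math

STATES = range(4)  # nucleotides A, C, G, T encoded as 0..3

def jc_matrix(t):
    """JC69 transition probabilities P(t) for branch length t (illustrative model choice)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if a == b else diff for b in STATES] for a in STATES]

def jc_matrix_derivative(t):
    """Element-wise derivative dP(t)/dt of the JC69 transition matrix."""
    e = math.exp(-4.0 * t / 3.0)
    return [[-e if a == b else e / 3.0 for b in STATES] for a in STATES]

def matvec(m, v):
    return [sum(m[a][b] * v[b] for b in STATES) for a in STATES]

def log_likelihood_and_gradient(children, lengths, tips, root):
    """Return (log-likelihood, {node: d logL / d branch length}) for one site.

    children: internal node -> (left, right); lengths: non-root node -> branch
    length; tips: tip node -> observed state. Assumes a binary tree.
    """
    post = {}  # post[n][s] = P(data below n | state s at n)
    msg = {}   # msg[n][s]  = P(data below n | state s at the parent of n)

    def down(n):  # post-order pass (the standard pruning recursion)
        if n in tips:
            post[n] = [1.0 if s == tips[n] else 0.0 for s in STATES]
        else:
            left, right = children[n]
            down(left)
            down(right)
            post[n] = [msg[left][s] * msg[right][s] for s in STATES]
        if n != root:
            msg[n] = matvec(jc_matrix(lengths[n]), post[n])

    down(root)
    likelihood = sum(0.25 * post[root][s] for s in STATES)
    grad = {}

    def up(n, pre):  # pre-order pass; pre[s] = P(data outside n's subtree, state s at n)
        if n in tips:
            return
        left, right = children[n]
        for child, sibling in ((left, right), (right, left)):
            # Everything except the child's subtree, indexed by the parent's state.
            above = [pre[s] * msg[sibling][s] for s in STATES]
            dmsg = matvec(jc_matrix_derivative(lengths[child]), post[child])
            grad[child] = sum(above[s] * dmsg[s] for s in STATES) / likelihood
            p = jc_matrix(lengths[child])
            child_pre = [sum(above[s] * p[s][s2] for s in STATES) for s2 in STATES]
            up(child, child_pre)

    up(root, [0.25] * 4)
    return math.log(likelihood), grad

# Usage on a hypothetical three-tip tree ((a, b) x, c) root.
children = {"root": ("x", "c"), "x": ("a", "b")}
lengths = {"a": 0.1, "b": 0.2, "x": 0.05, "c": 0.3}
tips = {"a": 0, "b": 1, "c": 0}
log_lik, gradient = log_likelihood_and_gradient(children, lengths, tips, "root")
```

Without the pre-order pass, each derivative would need its own pruning pass over the tree, which is where the O(N^2) cost quoted in the abstract comes from. The recursive traversal is used here for brevity; a tree with thousands of tips would need an iterative traversal.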

Original language: English (US)
Pages (from-to): 3047-3060
Number of pages: 14
Journal: Molecular Biology and Evolution
Volume: 37
Issue number: 10
State: Published - Oct 1 2020

Keywords

  • Bayesian inference
  • Linear-time gradient algorithm
  • Maximum likelihood
  • Random-effects molecular clock model

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics
