Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

Abhirup Datta; Sudipto Banerjee; Andrew O. Finley; Alan E. Gelfand

doi:10.1080/01621459.2015.1044091

Research output: Contribution to journal › Article › peer-review

151 Scopus citations

Abstract

Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.
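The sparse precision matrix mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes an exponential covariance function, the given ordering of the locations, and hypothetical names (`exp_cov`, `nngp_precision`, `m`) chosen for this example only. Each location is conditioned on at most `m` nearest neighbors among the locations that precede it, and the resulting weights and conditional variances are assembled into a precision matrix. The brute-force neighbor search and dense storage are used only for readability.

```python
import numpy as np


def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) (illustrative choice)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)


def nngp_precision(coords, m, cov=exp_cov):
    """Nearest-neighbor precision matrix Q = (I - A)^T F^{-1} (I - A).

    Row i of A holds the kriging weights of location i on its (at most m)
    nearest neighbors among locations 0..i-1; f[i] is the matching
    conditional variance. A is strictly lower triangular.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))  # dense only for readability; sparse in practice
    f = np.empty(n)
    for i in range(n):
        c_ii = cov(coords[i:i + 1], coords[i:i + 1])[0, 0]
        if i == 0:
            f[i] = c_ii  # first location has no predecessors
            continue
        # Brute-force search over predecessors (assumption: a spatial index
        # would replace this in a large-scale implementation).
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nb = np.argsort(dist)[:m]
        C_nn = cov(coords[nb], coords[nb])          # at most m x m
        c_ni = cov(coords[nb], coords[i:i + 1])[:, 0]
        b = np.linalg.solve(C_nn, c_ni)             # small solve per location
        A[i, nb] = b
        f[i] = c_ii - c_ni @ b
    I_minus_A = np.eye(n) - A
    Q = I_minus_A.T @ (I_minus_A / f[:, None])
    return Q, A, f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((200, 2))
    Q, A, f = nngp_precision(pts, m=5)
    print("nonzero entries in Q:", np.count_nonzero(Q), "of", Q.size)
```

In this sketch each location requires one linear solve of size at most `m`, so for fixed `m` the algebraic work grows linearly with the number of locations. When `m` is set to the number of locations minus one, the construction reproduces the inverse of the full covariance matrix.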

Original language	English (US)
Pages (from-to)	800-812
Number of pages	13
Journal	Journal of the American Statistical Association
Volume	111
Issue number	514
DOIs	https://doi.org/10.1080/01621459.2015.1044091
State	Published - Apr 2 2016
Externally published	Yes

Keywords

Bayesian modeling
Gaussian process
Hierarchical models
Markov chain Monte Carlo
Nearest neighbors
Predictive process
Reduced-rank models
Sparse precision matrices
Spatial cross-covariance functions

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Access to Document

10.1080/01621459.2015.1044091

Cite this

@article{4dd7091fd9f7427aac49f63c718db797,

title = "Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets",

abstract = "Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.",

keywords = "Bayesian modeling, Gaussian process, Hierarchical models, Markov chain Monte Carlo, Nearest neighbors, Predictive process, Reduced-rank models, Sparse precision matrices, Spatial cross-covariance functions",

author = "Datta, Abhirup and Banerjee, Sudipto and Finley, {Andrew O.} and Gelfand, {Alan E.}",

note = "Publisher Copyright: {\textcopyright} 2016, {\textcopyright} American Statistical Association.",

year = "2016",

month = apr,

day = "2",

doi = "10.1080/01621459.2015.1044091",

language = "English (US)",

volume = "111",

pages = "800--812",

journal = "Journal of the American Statistical Association",

issn = "0162-1459",

publisher = "Taylor and Francis Ltd.",

number = "514",

}

TY - JOUR

T1 - Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

AU - Datta, Abhirup

AU - Banerjee, Sudipto

AU - Finley, Andrew O.

AU - Gelfand, Alan E.

PY - 2016/4/2

Y1 - 2016/4/2

N2 - Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.

AB - Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.

KW - Bayesian modeling

KW - Gaussian process

KW - Hierarchical models

KW - Markov chain Monte Carlo

KW - Nearest neighbors

KW - Predictive process

KW - Reduced-rank models

KW - Sparse precision matrices

KW - Spatial cross-covariance functions

UR - http://www.scopus.com/inward/record.url?scp=84983283605&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84983283605&partnerID=8YFLogxK

U2 - 10.1080/01621459.2015.1044091

DO - 10.1080/01621459.2015.1044091

M3 - Article

AN - SCOPUS:84983283605

SN - 0162-1459

VL - 111

SP - 800

EP - 812

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

IS - 514

ER -
