On nearest-neighbor Gaussian process models for massive spatial data

Abhirup Datta, Sudipto Banerjee, Andrew O. Finley, Alan E. Gelfand

Research output: Contribution to journal (Review article)

Abstract

Gaussian Process (GP) models provide a very flexible nonparametric approach to modeling location-and-time indexed datasets. However, the storage and computational requirements of GP models are prohibitive for large spatial datasets. Nearest Neighbor Gaussian Processes (NNGP; Datta A, Banerjee S, Finley AO, Gelfand AE. Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. J Am Stat Assoc 2016) provide a scalable alternative by using local information from a few nearest neighbors. Scalability is achieved by using the neighbor sets in a conditional specification of the model. We show how this is equivalent to sparse modeling of Cholesky factors of large covariance matrices. We also discuss a general approach to constructing scalable Gaussian Processes using sparse local kriging. We present a multivariate data analysis which demonstrates how the nearest neighbor approach yields inference indistinguishable from the full-rank GP while being several times faster. Finally, we propose a variant of the NNGP model for automating the selection of the neighbor set size. WIREs Comput Stat 2016, 8:162–171. doi: 10.1002/wics.1383.
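
The conditional specification mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an exponential covariance function, a fixed ordering of the locations, and dense NumPy arrays for readability; the names `exp_cov`, `nngp_factors`, and `nngp_precision` are hypothetical. Each location is regressed on at most m nearest neighbors among the earlier locations, which gives a strictly lower-triangular weight matrix A and conditional variances f, and hence a precision matrix (I - A)^T F^{-1} (I - A) with a sparse Cholesky-type factor.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of points (an assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def nngp_factors(coords, m, cov=exp_cov):
    """Return (A, f) such that w_i | w_N(i) ~ N(A[i] @ w, f[i]).

    N(i) holds at most m nearest neighbors of location i among the
    locations that precede it in the given ordering.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))   # dense for clarity; a real code would store it sparsely
    f = np.empty(n)
    f[0] = cov(coords[:1], coords[:1])[0, 0]   # first location has no neighbors
    for i in range(1, n):
        prev = np.arange(i)
        dist = np.linalg.norm(coords[prev] - coords[i], axis=1)
        nb = prev[np.argsort(dist)[:m]]        # neighbor set N(i)
        C_nn = cov(coords[nb], coords[nb])     # only small m x m blocks are formed
        c_ni = cov(coords[nb], coords[i:i + 1])[:, 0]
        b = np.linalg.solve(C_nn, c_ni)        # local kriging weights
        A[i, nb] = b
        f[i] = cov(coords[i:i + 1], coords[i:i + 1])[0, 0] - c_ni @ b
    return A, f

def nngp_precision(A, f):
    """Precision matrix (I - A)^T F^{-1} (I - A) implied by the factors."""
    L = np.eye(len(f)) - A
    return L.T @ (L / f[:, None])

rng = np.random.default_rng(0)
coords = rng.uniform(size=(30, 2))
A, f = nngp_factors(coords, m=5)
print("max nonzeros per row of A:", np.count_nonzero(A, axis=1).max())
```

With m equal to n - 1 every location conditions on all earlier ones, and the construction reproduces the full GP precision matrix; smaller m trades that exactness for sparsity, since each row of A then has at most m nonzero entries.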

Original language: English (US)
Pages (from-to): 162-171
Number of pages: 10
Journal: Wiley Interdisciplinary Reviews: Computational Statistics
Volume: 8
Issue number: 5
DOI: 10.1002/wics.1383
State: Published - Sep 1 2016
Externally published: Yes

Keywords

  • Bayesian methods and theory
  • computational Bayesian methods
  • data structures
  • image and spatial data

ASJC Scopus subject areas

  • Statistics and Probability
