Spatial factor models for high-dimensional and large spatial data

An application in forest variable mapping

Daniel Taylor-Rodriguez, Andrew O. Finley, Abhirup Datta, Chad Babcock, Hans Erik Andersen, Bruce D. Cook, Douglas C. Morton, Sudipto Banerjee

Research output: Contribution to journal › Article

Abstract

Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional (∼10²) spatially dependent LiDAR outcomes over a large number of locations (∼10⁵ − 10⁶). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.

Original language: English (US)
Pages (from-to): 1155-1180
Number of pages: 26
Journal: Statistica Sinica
Volume: 29
Issue number: 3
DOIs: https://doi.org/10.5705/ss.202018.0005
State: Published - Jan 1 2019

Fingerprint

Factor Models
Spatial Model
Large Data
Spatial Data
High-dimensional
Gaussian Process
Nearest Neighbor
Coverage
Spatial Structure
Gaussian Model
Region of Interest
Process Model
Simulation Experiment
Interior
High Resolution
Uncertainty
Predict
Dependent
Modeling
Demonstrate

Keywords

  • Forest outcomes
  • LiDAR data
  • Nearest neighbor Gaussian processes
  • Spatial prediction

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Taylor-Rodriguez, D., Finley, A. O., Datta, A., Babcock, C., Andersen, H. E., Cook, B. D., ... Banerjee, S. (2019). Spatial factor models for high-dimensional and large spatial data: An application in forest variable mapping. Statistica Sinica, 29(3), 1155-1180. https://doi.org/10.5705/ss.202018.0005

Spatial factor models for high-dimensional and large spatial data: An application in forest variable mapping. / Taylor-Rodriguez, Daniel; Finley, Andrew O.; Datta, Abhirup; Babcock, Chad; Andersen, Hans Erik; Cook, Bruce D.; Morton, Douglas C.; Banerjee, Sudipto.

In: Statistica Sinica, Vol. 29, No. 3, 01.01.2019, p. 1155-1180.

Taylor-Rodriguez, D, Finley, AO, Datta, A, Babcock, C, Andersen, HE, Cook, BD, Morton, DC & Banerjee, S 2019, 'Spatial factor models for high-dimensional and large spatial data: An application in forest variable mapping', Statistica Sinica, vol. 29, no. 3, pp. 1155-1180. https://doi.org/10.5705/ss.202018.0005
Taylor-Rodriguez, Daniel; Finley, Andrew O.; Datta, Abhirup; Babcock, Chad; Andersen, Hans Erik; Cook, Bruce D.; Morton, Douglas C.; Banerjee, Sudipto. / Spatial factor models for high-dimensional and large spatial data: An application in forest variable mapping. In: Statistica Sinica. 2019; Vol. 29, No. 3. pp. 1155-1180.
@article{1c825a3bf9cf40dd94dce34f62f4e0e2,
title = "Spatial factor models for high-dimensional and large spatial data: An application in forest variable mapping",
abstract = "Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional ($\sim 10^{2}$) spatially dependent LiDAR outcomes over a large number of locations ($\sim 10^{5} - 10^{6}$). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.",
keywords = "Forest outcomes, LiDAR data, Nearest neighbor Gaussian processes, Spatial prediction",
author = "Daniel Taylor-Rodriguez and Finley, {Andrew O.} and Abhirup Datta and Chad Babcock and Andersen, {Hans Erik} and Cook, {Bruce D.} and Morton, {Douglas C.} and Sudipto Banerjee",
year = "2019",
month = "1",
day = "1",
doi = "10.5705/ss.202018.0005",
language = "English (US)",
volume = "29",
pages = "1155--1180",
journal = "Statistica Sinica",
issn = "1017-0405",
publisher = "Institute of Statistical Science",
number = "3",

}

TY - JOUR

T1 - Spatial factor models for high-dimensional and large spatial data

T2 - An application in forest variable mapping

AU - Taylor-Rodriguez, Daniel

AU - Finley, Andrew O.

AU - Datta, Abhirup

AU - Babcock, Chad

AU - Andersen, Hans Erik

AU - Cook, Bruce D.

AU - Morton, Douglas C.

AU - Banerjee, Sudipto

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional (∼10²) spatially dependent LiDAR outcomes over a large number of locations (∼10⁵ − 10⁶). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.

AB - Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional (∼10²) spatially dependent LiDAR outcomes over a large number of locations (∼10⁵ − 10⁶). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.

KW - Forest outcomes

KW - LiDAR data

KW - Nearest neighbor Gaussian processes

KW - Spatial prediction

UR - http://www.scopus.com/inward/record.url?scp=85072089536&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072089536&partnerID=8YFLogxK

U2 - 10.5705/ss.202018.0005

DO - 10.5705/ss.202018.0005

M3 - Article

VL - 29

SP - 1155

EP - 1180

JO - Statistica Sinica

JF - Statistica Sinica

SN - 1017-0405

IS - 3

ER -