A frequency domain selection criterion for regression with autocorrelated errors

Clifford M. Hurvich; Scott L. Zeger

doi:10.1080/01621459.1990.10474931

Bloomberg School of Public Health

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

We consider the regression model y_t = η_t + ε_t, t = 0, 1, …, n − 1, where y_t are scalar observations, η_t is the unknown regression function, and ε_t are unobservable errors generated by a zero-mean weakly stationary process, independent of η_t and with completely unknown autocorrelation structure. We propose a data-driven method for selecting a parametric or nonparametric estimator of η_t. The method is based on cross-validation in the frequency domain and requires no assumptions about the form of the estimator or the error correlations. It does, however, require the discrete Fourier transform (DFT) of the signal η_t to be a smooth complex function of frequency, as is the case, for example, with transient signals or growth and decay curves. After giving some general motivations for the method, we focus on the special case of linear estimators of a nonparametric regression function, including both kernel and spline estimators. For these estimators, we develop efficient methods of evaluating the frequency domain cross-validation (FDCV) function. The standard time domain cross-validation (TDCV) method, which leaves out data points one at a time, is sensible only when the errors are independent. Autocorrelation among the errors can cause severe biases in the TDCV function, leading to poor selections. FDCV leaves out discrete Fourier transform values one at a time. These values are approximately independent regardless of the error correlation structure, and hence FDCV remains valid even for correlated errors, as long as the DFT of η_t at the omitted frequency can be predicted from those remaining. Asymptotic properties of FDCV are given for a class of transient signals. Then the usefulness of FDCV for transient and other signals is demonstrated in a Monte Carlo study comparing the performances of TDCV and FDCV for selecting a kernel estimator of a nonparametric regression function. The use of FDCV is illustrated with data on international airline travel.
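The abstract's central claim, that DFT values of the errors are approximately independent even when the errors themselves are autocorrelated, can be checked numerically. The following is a minimal sketch and not the authors' FDCV procedure. It assumes AR(1) errors with coefficient 0.8 as one example of a weakly stationary process, and the function names, sample size, and chosen frequency index are all hypothetical. It compares the correlation between neighbouring time points with the correlation between the real parts of neighbouring DFT ordinates, estimated over repeated simulations.

```python
import cmath
import math
import random


def ar1_errors(n, phi, rng, burn_in=200):
    """Simulate n values of a zero-mean AR(1) process e_t = phi * e_{t-1} + w_t."""
    e = 0.0
    out = []
    for t in range(burn_in + n):
        e = phi * e + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the burn-in so the series is close to stationary
            out.append(e)
    return out


def dft_at(x, k):
    """DFT ordinate of x at Fourier frequency 2*pi*k/n, scaled by 1/sqrt(n)."""
    n = len(x)
    total = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return total / math.sqrt(n)


def correlation(a, b):
    """Sample correlation between two real sequences of equal length."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def compare_dependence(n=128, phi=0.8, reps=500, k=10, seed=0):
    """Return (time-domain, frequency-domain) correlations between neighbours.

    Time domain: correlation of the errors at two adjacent time points.
    Frequency domain: correlation of the real parts of the DFT at the
    adjacent Fourier frequencies k and k + 1.
    """
    rng = random.Random(seed)
    t0, t1, f0, f1 = [], [], [], []
    for _ in range(reps):
        eps = ar1_errors(n, phi, rng)
        t0.append(eps[n // 2])
        t1.append(eps[n // 2 + 1])
        f0.append(dft_at(eps, k).real)
        f1.append(dft_at(eps, k + 1).real)
    return correlation(t0, t1), correlation(f0, f1)


if __name__ == "__main__":
    time_corr, freq_corr = compare_dependence()
    print(f"adjacent time points:    {time_corr:.3f}")
    print(f"adjacent DFT ordinates:  {freq_corr:.3f}")
```

Under these assumptions the time-domain correlation should come out close to the AR coefficient, while the frequency-domain correlation should be near zero. This is the property that makes leaving out one DFT value at a time reasonable where leaving out one data point at a time is not.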

Original language	English (US)
Pages (from-to)	705-714
Number of pages	10
Journal	Journal of the American Statistical Association
Volume	85
Issue number	411
DOIs	https://doi.org/10.1080/01621459.1990.10474931
State	Published - Sep 1990

Keywords

Cross-validation
Model selection
Nonparametric regression

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{fd518d13c14e42c8805ebb4760d3861b,

title = "A frequency domain selection criterion for regression with autocorrelated errors",

abstract = "We consider the regression model $y_t = \eta_t + \varepsilon_t$, $t = 0, 1, \ldots, n - 1$, where $y_t$ are scalar observations, $\eta_t$ is the unknown regression function, and $\varepsilon_t$ are unobservable errors generated by a zero-mean weakly stationary process, independent of $\eta_t$ and with completely unknown autocorrelation structure. We propose a data-driven method for selecting a parametric or nonparametric estimator of $\eta_t$. The method is based on cross-validation in the frequency domain and requires no assumptions about the form of the estimator or the error correlations. It does, however, require the discrete Fourier transform (DFT) of the signal $\eta_t$ to be a smooth complex function of frequency, as is the case, for example, with transient signals or growth and decay curves. After giving some general motivations for the method, we focus on the special case of linear estimators of a nonparametric regression function, including both kernel and spline estimators. For these estimators, we develop efficient methods of evaluating the frequency domain cross-validation (FDCV) function. The standard time domain cross-validation (TDCV) method, which leaves out data points one at a time, is sensible only when the errors are independent. Autocorrelation among the errors can cause severe biases in the TDCV function, leading to poor selections. FDCV leaves out discrete Fourier transform values one at a time. These values are approximately independent regardless of the error correlation structure, and hence FDCV remains valid even for correlated errors, as long as the DFT of $\eta_t$ at the omitted frequency can be predicted from those remaining. Asymptotic properties of FDCV are given for a class of transient signals. Then the usefulness of FDCV for transient and other signals is demonstrated in a Monte Carlo study comparing the performances of TDCV and FDCV for selecting a kernel estimator of a nonparametric regression function. The use of FDCV is illustrated with data on international airline travel.",

keywords = "Cross-validation, Model selection, Nonparametric regression",

author = "Hurvich, {Clifford M.} and Zeger, {Scott L.}",

year = "1990",

month = sep,

doi = "10.1080/01621459.1990.10474931",

language = "English (US)",

volume = "85",

pages = "705--714",

journal = "Journal of the American Statistical Association",

issn = "0162-1459",

publisher = "Taylor and Francis Ltd.",

number = "411",

}

TY - JOUR

T1 - A frequency domain selection criterion for regression with autocorrelated errors

AU - Hurvich, Clifford M.

AU - Zeger, Scott L.

PY - 1990/9

Y1 - 1990/9

AB - We consider the regression model y_t = η_t + ε_t, t = 0, 1, …, n − 1, where y_t are scalar observations, η_t is the unknown regression function, and ε_t are unobservable errors generated by a zero-mean weakly stationary process, independent of η_t and with completely unknown autocorrelation structure. We propose a data-driven method for selecting a parametric or nonparametric estimator of η_t. The method is based on cross-validation in the frequency domain and requires no assumptions about the form of the estimator or the error correlations. It does, however, require the discrete Fourier transform (DFT) of the signal η_t to be a smooth complex function of frequency, as is the case, for example, with transient signals or growth and decay curves. After giving some general motivations for the method, we focus on the special case of linear estimators of a nonparametric regression function, including both kernel and spline estimators. For these estimators, we develop efficient methods of evaluating the frequency domain cross-validation (FDCV) function. The standard time domain cross-validation (TDCV) method, which leaves out data points one at a time, is sensible only when the errors are independent. Autocorrelation among the errors can cause severe biases in the TDCV function, leading to poor selections. FDCV leaves out discrete Fourier transform values one at a time. These values are approximately independent regardless of the error correlation structure, and hence FDCV remains valid even for correlated errors, as long as the DFT of η_t at the omitted frequency can be predicted from those remaining. Asymptotic properties of FDCV are given for a class of transient signals. Then the usefulness of FDCV for transient and other signals is demonstrated in a Monte Carlo study comparing the performances of TDCV and FDCV for selecting a kernel estimator of a nonparametric regression function. The use of FDCV is illustrated with data on international airline travel.

KW - Cross-validation

KW - Model selection

KW - Nonparametric regression

UR - http://www.scopus.com/inward/record.url?scp=0010097359&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0010097359&partnerID=8YFLogxK

U2 - 10.1080/01621459.1990.10474931

DO - 10.1080/01621459.1990.10474931

M3 - Article

AN - SCOPUS:0010097359

SN - 0162-1459

VL - 85

SP - 705

EP - 714

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

IS - 411

ER -
