We consider the regression model y_t = η_t + ε_t, t = 0, 1, …, n − 1, where y_t are scalar observations, η_t is the unknown regression function, and ε_t are unobservable errors generated by a zero-mean weakly stationary process, independent of η_t and with completely unknown autocorrelation structure. We propose a data-driven method for selecting a parametric or nonparametric estimator of η_t. The method is based on cross-validation in the frequency domain and requires no assumptions about the form of the estimator or the error correlations. It does, however, require the discrete Fourier transform (DFT) of the signal η_t to be a smooth complex function of frequency, as is the case, for example, with transient signals or growth and decay curves. After giving some general motivations for the method, we focus on the special case of linear estimators of a nonparametric regression function, including both kernel and spline estimators. For these estimators, we develop efficient methods of evaluating the frequency domain cross-validation (FDCV) function. The standard time domain cross-validation (TDCV) method, which leaves out data points one at a time, is sensible only when the errors are independent. Autocorrelation among the errors can cause severe biases in the TDCV function, leading to poor selections. FDCV leaves out discrete Fourier transform values one at a time. These values are approximately independent regardless of the error correlation structure, and hence FDCV remains valid even for correlated errors, as long as the DFT of η_t at the omitted frequency can be predicted from those remaining. Asymptotic properties of FDCV are given for a class of transient signals. Then the usefulness of FDCV for transient and other signals is demonstrated in a Monte Carlo study comparing the performances of TDCV and FDCV for selecting a kernel estimator of a nonparametric regression function. The use of FDCV is illustrated with data on international airline travel.
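The claim that DFT values of the errors are approximately independent, while the errors themselves are not, can be checked numerically. The sketch below is an illustration only and is not the FDCV procedure itself: it assumes AR(1) errors with a hypothetical coefficient phi = 0.8 and series length n = 128, and the function names simulate_ar1 and adjacent_correlations are invented for this example. It estimates, across simulated replications, the correlation between two adjacent time points and the magnitude of the complex correlation between two adjacent DFT ordinates.

```python
import numpy as np


def simulate_ar1(reps, n, phi, rng, burn=200):
    """Simulate `reps` independent AR(1) series of length n (illustrative helper)."""
    e = rng.standard_normal((reps, n + burn))
    x = np.zeros_like(e)
    x[:, 0] = e[:, 0]
    for t in range(1, n + burn):
        # AR(1) recursion, applied to all replications at once
        x[:, t] = phi * x[:, t - 1] + e[:, t]
    # Drop the burn-in so the retained segment is close to stationary
    return x[:, burn:]


def adjacent_correlations(n=128, phi=0.8, reps=1000, seed=0):
    """Return (time-domain, frequency-domain) correlations between neighbours.

    All parameter values are assumptions chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    eps = simulate_ar1(reps, n, phi, rng)

    # Time domain: correlation between eps_t and eps_{t+1} across replications
    t = n // 2
    time_corr = np.corrcoef(eps[:, t], eps[:, t + 1])[0, 1]

    # Frequency domain: normalized DFT of each replication
    dft = np.fft.fft(eps, axis=1) / np.sqrt(n)
    k = n // 4
    a, b = dft[:, k], dft[:, k + 1]
    # Magnitude of the complex correlation between adjacent DFT ordinates
    cross = np.abs(np.mean(a * np.conj(b)))
    scale = np.sqrt(np.mean(np.abs(a) ** 2) * np.mean(np.abs(b) ** 2))
    freq_corr = cross / scale
    return time_corr, freq_corr


if __name__ == "__main__":
    time_corr, freq_corr = adjacent_correlations()
    print(f"adjacent time points:    {time_corr:.3f}")
    print(f"adjacent DFT ordinates:  {freq_corr:.3f}")
```

Under these assumptions the time-domain correlation should be close to phi, while the frequency-domain value should be near zero. This is the property that makes leaving out one DFT value at a time reasonable when leaving out one data point at a time is not.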
Keywords
- Model selection
- Nonparametric regression
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty