More efficient approximation of smoothing splines via space-filling basis selection<br>We consider the problem of approximating smoothing spline estimators in a nonparametric regression model. When applied to a sample of size n, the smoothing spline estimator can be expressed as a linear combination of n basis functions, requiring O(n^3) computational time when the number d of predictors is two or more. Such a sizeable computational cost hinders the broad applicability of smoothing splines. In practice, the full-sample smoothing spline estimator can be approximated by an estimator based on q randomly selected basis functions, resulting in a computational cost of O(nq^2). It is known that these two estimators converge at the same rate when q is of order O{n^{2/(pr+1)}}, where p ∈ [1, 2] depends on the true function and r > 1 depends on the type of spline. Such a q is called the essential number of basis functions. In this article, we develop a more efficient basis selection method. By selecting basis functions corresponding to approximately equally spaced observations, the proposed method chooses a set of basis functions with great diversity. The asymptotic analysis shows that the proposed smoothing spline estimator can decrease q to around O{n^{1/(pr+1)}} when d ≤ pr + 1. Applications to synthetic and real-world datasets show that the proposed method leads to a smaller prediction error than other basis selection methods. ...
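The idea of picking basis functions at approximately equally spaced observations can be illustrated with a minimal one-dimensional sketch. It is not the article's procedure: the function names, the Gaussian kernel (a stand-in for a spline reproducing kernel), the bandwidth, and the penalty `lam` are all assumptions chosen for illustration, and the multivariate case (d of two or more) is not covered. The sketch selects the observation nearest to each of q equally spaced targets and then solves a q-by-q penalized least-squares system, whose dominant cost is forming the q-by-q cross-product from an n-by-q matrix, i.e. O(nq^2).

```python
import numpy as np


def select_space_filling(x, q):
    """Return indices of observations nearest to q equally spaced targets.

    Illustrative 1-D stand-in for space-filling basis selection.
    """
    lo, hi = x.min(), x.max()
    # Midpoints of q equal-width cells covering the observed range.
    targets = lo + (np.arange(q) + 0.5) * (hi - lo) / q
    # For each target, the index of the closest observation.
    nearest = np.abs(x[:, None] - targets[None, :]).argmin(axis=0)
    return np.unique(nearest)


def gaussian_kernel(a, b, bandwidth=0.1):
    """Gaussian kernel matrix; an assumed substitute for a spline kernel."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * bandwidth ** 2))


def fit_reduced_basis(x, y, idx, lam=1e-3):
    """Penalized least squares using only the selected basis functions."""
    n = len(x)
    R = gaussian_kernel(x, x[idx])       # n x q design matrix
    Q = gaussian_kernel(x[idx], x[idx])  # q x q penalty matrix
    # Forming R.T @ R costs O(n q^2); the solve itself is O(q^3).
    A = R.T @ R + n * lam * Q + 1e-8 * np.eye(len(idx))  # small jitter for stability
    return np.linalg.solve(A, R.T @ y)


def predict(x_new, x, idx, coef):
    """Evaluate the fitted function at new points."""
    return gaussian_kernel(x_new, x[idx]) @ coef


# Usage on simulated data (hypothetical example, not from the article).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

idx = select_space_filling(x, q=20)
coef = fit_reduced_basis(x, y, idx)
y_hat = predict(np.linspace(0.05, 0.95, 50), x, idx, coef)
```

Replacing `select_space_filling` with `rng.choice(x.size, size=20, replace=False)` would give the uniform random selection that the abstract uses as its point of comparison, which makes it easy to contrast the two on the same simulated data.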