Liao, W., Maggioni, M., Vigogna, S. (2022). Multiscale regression on unknown manifolds. Mathematics in Engineering, 4(4), 1-25 [10.3934/MINE.2022028].
Multiscale regression on unknown manifolds
Vigogna S.
2022-01-01
Abstract
We consider the regression problem of estimating functions on R^D but supported on a d-dimensional manifold M ⊂ R^D with d ≪ D. Drawing ideas from multi-resolution analysis and nonlinear approximation, we construct low-dimensional coordinates on M at multiple scales, and perform multiscale regression by local polynomial fitting. We propose a data-driven wavelet thresholding scheme that automatically adapts to the unknown regularity of the function, allowing for efficient estimation of functions exhibiting nonuniform regularity at different locations and scales. We analyze the generalization error of our method by proving finite-sample bounds that hold with high probability on rich classes of priors. Our estimator attains optimal learning rates (up to logarithmic factors) as if the function were defined on a known Euclidean domain of dimension d, instead of an unknown manifold embedded in R^D. The implemented algorithm has quasilinear complexity in the sample size, with constants linear in D and exponential in d. Our work therefore establishes a new framework for regression on low-dimensional sets embedded in high dimensions, with fast implementation and strong theoretical guarantees.
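To make the idea of "low-dimensional coordinates plus local polynomial fitting" concrete, here is a minimal single-scale sketch in Python. It is not the authors' algorithm: it uses one fixed partition instead of a multiscale tree, degree-1 polynomials only, and no wavelet thresholding. All names (`fit_local_linear`, `predict`, `demo_rmse`), the choice of a circle embedded in R^20, the test function, and the noise level are assumptions made purely for illustration.

```python
import numpy as np

def nearest_center(X, centers):
    # Index of the closest center for every row of X (Voronoi cell assignment).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(sq_dist, axis=1)

def fit_local_linear(X, y, centers, d):
    # For each cell: local PCA gives d coordinates, then a linear least-squares fit.
    labels = nearest_center(X, centers)
    models = []
    for k in range(len(centers)):
        Xk, yk = X[labels == k], y[labels == k]
        mean = Xk.mean(axis=0)
        # Rows of Vt are principal directions; keep the top d as local coordinates.
        _, _, Vt = np.linalg.svd(Xk - mean, full_matrices=False)
        basis = Vt[:d].T                      # shape (D, d)
        Z = (Xk - mean) @ basis               # local low-dimensional coordinates
        A = np.hstack([np.ones((len(Z), 1)), Z])
        coef, *_ = np.linalg.lstsq(A, yk, rcond=None)
        models.append((mean, basis, coef))
    return models

def predict(X, centers, models):
    # Evaluate the local linear model of the cell each point falls into.
    labels = nearest_center(X, centers)
    out = np.empty(len(X))
    for i, k in enumerate(labels):
        mean, basis, coef = models[k]
        z = (X[i] - mean) @ basis
        out[i] = coef[0] + z @ coef[1:]
    return out

def circle_in_RD(t, D):
    # Hypothetical test manifold: a unit circle (d = 1) embedded in R^D.
    X = np.zeros((len(t), D))
    X[:, 0], X[:, 1] = np.cos(t), np.sin(t)
    return X

def demo_rmse(seed=0, n=2000, D=20, n_cells=16, noise=0.1):
    rng = np.random.default_rng(seed)
    f = lambda t: np.sin(3 * t)               # assumed regression function
    t_train = rng.uniform(0, 2 * np.pi, n)
    t_test = rng.uniform(0, 2 * np.pi, 500)
    X_train, X_test = circle_in_RD(t_train, D), circle_in_RD(t_test, D)
    y_train = f(t_train) + noise * rng.standard_normal(n)
    # Fixed, equally spaced cell centers on the circle (a single scale).
    centers = circle_in_RD(np.linspace(0, 2 * np.pi, n_cells, endpoint=False), D)
    models = fit_local_linear(X_train, y_train, centers, d=1)
    err = predict(X_test, centers, models) - f(t_test)
    return float(np.sqrt(np.mean(err ** 2)))

print(demo_rmse())
```

In this sketch the fit in each cell only involves d + 1 coefficients, which illustrates why the intrinsic dimension d, and not the ambient dimension D, governs the statistical difficulty. A multiscale version would repeat the fit on nested partitions and keep only the significant differences between scales.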