Improved analysis of supervised learning in the RKHS with random features: Beyond least squares.

Journal: Neural networks : the official journal of the International Neural Network Society

Abstract

We consider kernel-based supervised learning with random Fourier features, focusing on its statistical error bounds and generalization properties under general loss functions. Beyond the least squares loss, existing results provide only a worst-case analysis with rate O(n^{-1/2}) and a number of features at least comparable to n, and a refined analysis that achieves an almost O(1/n) rate when the kernel's eigenvalues decay exponentially and the number of features is again at least comparable to n. For the least squares loss, the results are much richer: the optimal rates can be achieved under source and capacity assumptions with a number of features smaller than n. In this paper, for both losses with Lipschitz derivative and Lipschitz losses, we establish faster rates with a number of features much smaller than n, matching the rates and feature counts known for the least squares loss. More specifically, in the attainable case (the true function lies in the RKHS), we obtain the rate O(n^{-2ξ/(2ξ+γ)}), the same as the standard method without approximation, using o(n) features, where ξ characterizes the smoothness of the true function and γ characterizes the decay rate of the eigenvalues of the integral operator. Our results thus answer an important open question regarding random features.
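For intuition, below is a minimal sketch of the setting the abstract describes: a random Fourier feature map approximating a Gaussian kernel, combined with regularized empirical risk minimization under a loss with Lipschitz derivative (the Huber loss here). The bandwidth sigma, the Huber parameter delta, the gradient-descent solver, and the feature-count exponent 0.6 (chosen only so that m = o(n)) are illustrative assumptions, not the paper's prescriptions.

```python
import numpy as np

def rff_features(X, W, b):
    """Rahimi-Recht random Fourier feature map: z(x) = sqrt(2/m) cos(Wx + b),
    so that z(x)^T z(y) approximates the Gaussian kernel k(x, y)."""
    m = W.shape[0]
    return np.sqrt(2.0 / m) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
n, d = 2000, 5
m = int(n ** 0.6)                # m = o(n) features; the exponent is illustrative
sigma = 1.0                      # assumed Gaussian kernel bandwidth

# Synthetic data (illustrative only).
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# Sample frequencies from the kernel's spectral measure, plus random phases.
W = rng.standard_normal((m, d)) / sigma
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Z = rff_features(X, W, b)        # n x m design matrix in feature space

# Regularized ERM with the Huber loss (a loss with Lipschitz derivative),
# solved by plain gradient descent over theta in R^m.
lam, delta, lr = 1e-3, 1.0, 0.5
theta = np.zeros(m)
for _ in range(500):
    r = Z @ theta - y
    dloss = np.where(np.abs(r) <= delta, r, delta * np.sign(r))  # Huber derivative
    theta -= lr * (Z.T @ dloss / n + lam * theta)

r = Z @ theta - y
risk = np.where(np.abs(r) <= delta, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))
print(f"m = {m} features, train Huber risk = {risk.mean():.4f}")
```

With m on the order of n^{0.6} rather than n, each gradient step costs O(nm) instead of the O(n^2) needed to work with the full kernel matrix, which is the computational appeal of random features that the paper's o(n) feature counts preserve.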

Authors

  • Jiamin Liu
  • Lei Wang
    Department of Nursing, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China.
  • Heng Lian
    City University of Hong Kong Shenzhen Research Institute, Shenzhen, China; Department of Mathematics, City University of Hong Kong, Hong Kong, China. Electronic address: henglian@cityu.edu.hk.