Machine learning aims to design automatic methods that can make intelligent predictions or useful decisions based on historical data. This paradigm of learning from data plays an increasingly important role in science and engineering, with highly successful applications in areas such as natural language processing, speech recognition, computer vision, and biomedical research. At the same time, there has been increased emphasis on understanding the mathematical underpinnings of learning algorithms. Theoretical developments in this direction have had a significant influence on statistics, numerical optimization, and other areas of applied mathematics.

Learning theory refers to the research area that investigates theoretical aspects of learning algorithms. It mainly examines questions such as “what guarantees can we prove about the performance of learning algorithms?”, “what can we say about the difficulty of learning problems?”, and “how does the generalization performance of learning algorithms depend on the number of parameters?”

In addressing these and related questions, the field of learning theory has drawn ideas and tools from mathematics and has proven extremely useful not only in providing a theoretical understanding of learning algorithms but also in designing new practical algorithms.

The purpose of this special issue is to summarize recent progress in learning theory and, at the same time, to provide a forum for applied mathematicians, statisticians, and machine learning practitioners to exchange research experience and new ideas in this field.

The eleven papers in this special issue cover various aspects of learning theory.

In particular, the work by D.-H. Xiang studied the problem of quantile regression in the framework of empirical risk minimization (ERM) and provided an error analysis by means of a variance-expectation bound. The study by J. Cai investigated a coefficient-based least squares regression problem with indefinite kernels and nonidentical unbounded sampling processes, extending existing results in this setting. W. Gao and T. Xu studied the generalization of the k-partite ranking algorithm used in ontology computation and derived generalization bounds using the concept of algorithmic stability. This study is related to the work by H. Chen and J. Wu, who studied the ranking problem with L1 regularization associated with a convex loss.

P. Ye and Y. Han analyzed the error of truncating the multidimensional Shannon sampling series via localized sampling and obtained uniform bounds on the aliasing and truncation errors for functions from anisotropic Besov classes. The work by Q. Wu et al. studied a nonlinear dimension reduction method called kernel sliced inverse regression and proposed two types of regularization to address its computational stability and generalization performance; an interpretation of the proposed algorithms and an analysis of their consistency were also established. D.-X. Zhou investigated the density problem and the approximation error of reproducing kernel Hilbert spaces from the viewpoint of learning theory. The paper by R. Li and Y. Liu considered the density estimation problem and provided an optimal upper bound on the risk of a linear wavelet estimator.

Multitask learning solves a problem jointly with other related problems, using a shared representation that often benefits the main task. Y.-L. Xu et al. explored a least squares regularized regression algorithm for multitask learning and obtained an upper bound on the sample error of related tasks. The work by B. Sheng and P. Ye investigated the convergence behavior of regularized regression based on reproducing kernel Banach spaces and obtained learning rates in terms of covering numbers and K-functionals. G. You studied the Hermite–Fejér interpolation operator, which has potential applications in the analysis of learning algorithms, and obtained convergence rates for approximating continuous functions.
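
Several of the contributions above can be viewed against the common background of regularized empirical risk minimization in a reproducing kernel Hilbert space. As a schematic point of reference (a generic formulation, not the precise setting of any single paper in this issue), given a sample $\mathbf{z} = \{(x_i, y_i)\}_{i=1}^{m}$, a loss function $V$, and a reproducing kernel Hilbert space $\mathcal{H}_K$ with norm $\|\cdot\|_K$, such schemes produce the estimator
\[
  f_{\mathbf{z},\lambda} = \arg\min_{f \in \mathcal{H}_K} \left\{ \frac{1}{m} \sum_{i=1}^{m} V\bigl(y_i, f(x_i)\bigr) + \lambda \|f\|_K^2 \right\}, \qquad \lambda > 0,
\]
where $\lambda$ is a regularization parameter. The least squares choice $V(y, t) = (y - t)^2$ yields regularized least squares regression, while the pinball loss $V_\tau(y, t) = \max\{\tau (y - t), (\tau - 1)(y - t)\}$ yields quantile regression at quantile level $\tau \in (0, 1)$.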

The papers in this issue address recent trends and developments in learning theory, from which one can clearly see close interactions with many fields of mathematics such as approximation theory, statistics, probability theory, functional analysis, and harmonic analysis. We expect that this special issue will help researchers explore newly arising learning problems and algorithms.

Acknowledgments

The guest editors wish to express their sincere gratitude to the authors and reviewers who contributed greatly to the success of this special issue.

Ding-Xuan Zhou
Qiang Wu
Yiming Ying