Variational problems in machine learning and their solution with finite elements
DOI:
https://doi.org/10.21914/anziamj.v48i0.90
Abstract
Many machine learning problems deal with the estimation of conditional probabilities $p(y \mid x)$ from data $(x_1,y_1),\ldots,(x_n,y_n)$. This includes classification, regression and density estimation. Given a prior for $p(y \mid x)$, the maximum a posteriori method estimates $p(y \mid x)$ as the most likely conditional probability given the data. This principle can be formulated rigorously using the Cameron-Martin theory of stochastic processes and allows a variational characterisation of the estimator. The resulting nonlinear Galerkin equations are solved numerically. Convexity and total positivity lead to existence, uniqueness and error bounds. For machine learning problems with large numbers of features, we suggest using sparse grid approximations.
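As a hedged sketch of the variational problem the abstract refers to (the symbols $u$, $H$, $J$ and the exponential ansatz are illustrative assumptions, not notation taken from the paper): writing $p(y \mid x) \propto \exp(u(x,y))$ for a latent function $u$ with a Gaussian process prior whose Cameron-Martin space is $H$, the maximum a posteriori estimator minimises a penalised negative log-likelihood of the form
$$J(u) \;=\; \tfrac{1}{2}\,\lVert u \rVert_H^2 \;-\; \sum_{i=1}^n u(x_i, y_i) \;+\; \sum_{i=1}^n \log \int e^{u(x_i, y)} \, dy ,$$
where the first term comes from the prior and the remaining terms from the data. The nonlinear Galerkin equations then arise from requiring $\langle J'(u_h), v_h \rangle = 0$ for all $v_h$ in a finite element space $V_h$; convexity of $J$ is what underlies existence and uniqueness of the minimiser.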
Published
2007-08-16
Issue
Section
Proceedings Computational Techniques and Applications Conference