Variational problems in machine learning and their solution with finite elements
Abstract

Many machine learning problems deal with the estimation of conditional probabilities $p(y \mid x)$ from data $(x_1,y_1),\ldots,(x_n,y_n)$. This includes classification, regression and density estimation. Given a prior for $p(y \mid x)$, the maximum a posteriori method estimates $p(y \mid x)$ as the most likely probability given the data. This principle can be formulated rigorously using the Cameron-Martin theory of stochastic processes and allows a variational characterisation of the estimator. The resulting nonlinear Galerkin equations are solved numerically. Convexity and total positivity lead to existence, uniqueness and error bounds. For machine learning problems dealing with large numbers of features we suggest using sparse grid approximations.
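The variational approach summarised above can be illustrated on the simplest case, one-dimensional density estimation: write $p(y) \propto e^{u(y)}$, place a smoothness prior on $u$, and minimise the resulting convex MAP objective over a piecewise-linear finite element space. The sketch below is illustrative only and not the authors' implementation; the grid size, penalty weight `lam`, step size and synthetic Beta-distributed data are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.beta(2.0, 5.0, size=400)      # synthetic data on [0, 1]

m = 41                                      # number of finite element nodes
nodes = np.linspace(0.0, 1.0, m)
h = nodes[1] - nodes[0]

# Stiffness matrix R of the smoothness prior (lam/2) * int u'(y)^2 dy,
# assembled for piecewise-linear "hat" basis functions on a uniform grid.
R = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
     - np.diag(np.ones(m - 1), -1)) / h
R[0, 0] = R[-1, -1] = 1.0 / h

# Interpolation matrix B: row i holds the hat-function weights of sample i,
# so (B @ u)[i] = u(y_i) for the finite element function with nodal values u.
idx = np.clip(np.searchsorted(nodes, samples) - 1, 0, m - 2)
t = (samples - nodes[idx]) / h
B = np.zeros((samples.size, m))
B[np.arange(samples.size), idx] = 1.0 - t
B[np.arange(samples.size), idx + 1] = t

w = np.full(m, h)                           # trapezoid quadrature weights
w[0] = w[-1] = h / 2.0

# Convex MAP objective on the nodal values u:
#   J(u) = (lam/2) u'Ru - mean(B u) + log sum_j w_j exp(u_j)
lam = 1e-3
u = np.zeros(m)
for _ in range(2000):                       # plain gradient descent
    ez = w * np.exp(u)
    grad = lam * (R @ u) - B.mean(axis=0) + ez / ez.sum()
    u -= 0.5 * grad

density = np.exp(u) / np.sum(w * np.exp(u))  # normalised estimate of p(y)
```

Because the negative log-posterior is convex in the nodal values, the Galerkin (finite element) discretisation yields a unique minimiser, mirroring the existence and uniqueness results claimed in the abstract; the sparse grid variant replaces the uniform grid with a sparse tensor-product basis when $y$ or $x$ is high dimensional.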
Proceedings of the Computational Techniques and Applications Conference