Variational problems in machine learning and their solution with finite elements

Markus Hegland, Michael Griebel


Many machine learning problems deal with the estimation
of conditional probabilities $p(y \mid x)$ from data
$(x_1,y_1),\ldots,(x_n,y_n)$. This includes classification,
regression and density estimation. Given a prior for
$p(y \mid x)$, the maximum a posteriori method estimates
$p(y \mid x)$ as the most likely probability given the
data. This principle can be formulated rigorously using the
Cameron-Martin theory of stochastic processes and allows
a variational characterisation of the estimator.
The resulting nonlinear
Galerkin equations are solved numerically. Convexity and
total positivity lead to existence, uniqueness and error
bounds. For machine learning problems dealing with large
numbers of features we suggest using sparse grid approximations.
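
As an illustration only, the variational characterisation might take the
following form. This sketch rests on assumptions not stated above: the
conditional density is parametrised by a function $u$ as
$p(y \mid x) = \exp(u(x,y)) / \int \exp(u(x,z))\,dz$, the prior on $u$ is
Gaussian with Cameron-Martin space $H$ and norm $\|\cdot\|_H$, and point
evaluation is continuous on $H$. The symbols $u$, $H$, $J$, $V_h$, $u_h$ and
$v$ are introduced here for illustration. Under these assumptions the
estimator would minimise the convex functional

$$
J(u) = \frac{1}{2}\|u\|_H^2
  - \sum_{i=1}^n \left( u(x_i,y_i) - \log \int \exp(u(x_i,z))\,dz \right),
$$

and the nonlinear Galerkin equations on a finite element space
$V_h \subset H$ would read: find $u_h \in V_h$ such that, for all $v \in V_h$,

$$
(u_h, v)_H
  - \sum_{i=1}^n \left( v(x_i,y_i)
  - \frac{\int v(x_i,z)\,\exp(u_h(x_i,z))\,dz}{\int \exp(u_h(x_i,z))\,dz} \right) = 0 .
$$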


ANZIAM Journal, ISSN 1446-8735, copyright Australian Mathematical Society.