Description

Machine learning is a fast-developing field. This is certainly due to its broad applications, ranging from self-driving cars and image processing and recognition to data science and beyond. Because of this fast progress, the field often lacks solid theoretical foundations, and so more and more challenging mathematical problems keep arising. The purpose of this project is to learn about some of the main mathematical tools that help to rigorously investigate some of these questions and to understand the theoretical foundations of some machine learning problems. In a nutshell, such problems often boil down to the study of a non-convex optimisation problem in very high dimensions. In this project, a particular emphasis will be placed on the theory of nonlinear partial differential equations, non-convex optimisation and the modern theory of optimal mass transportation. These theories provide very powerful tools that help to overcome the curse of dimensionality, often by studying the solutions to some nonlinear PDEs.

The optimal transport problem was first introduced by Gaspard Monge in 1781, slightly before the French Revolution, probably motivated by military applications. He asked how to transport a given pile of sand into a given hole in the cheapest possible way. This seemingly naive optimisation problem turned out to be quite challenging from the mathematical viewpoint, and its resolution had to wait until 1942, when Leonid Kantorovich proposed a solution to a relaxed version of the original problem. Kantorovich later received the Nobel prize in Economics, partially because of this work. The problem had its true renaissance in the late 1980s through the works of Yann Brenier, who realised that it underlies many physical phenomena arising in fluid mechanics, meteorology and elsewhere. In the past 30 years the theory has gained a lot of attention and has become an important branch of pure and applied mathematics. Among others, two recent Fields medalists, Cédric Villani in 2010 and Alessio Figalli in 2018, were recognised partly for their work in optimal transport. More recently, the field has found important applications in machine learning, data science and elsewhere.

The first objective of this project is to understand the mathematical basics of the above-mentioned theories. As a second objective, we will read various recent papers on the applications of these theories to machine learning. Every participant will write a synthesis of 1-2 such papers of their choice. Participants with strong computing skills may choose to implement some of the studied algorithms (a small illustrative sketch is included after the reading list below).
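For concreteness, the two formulations mentioned above can be written down in a few lines, in the standard notation used for instance in the main texts listed below: given probability measures mu (the pile of sand) and nu (the hole) and a cost function c(x, y), Monge asks for an optimal transport map, while Kantorovich relaxes the problem to transport plans (couplings). A brief sketch in LaTeX notation:

  % Monge's problem: find a map T pushing mu forward to nu at minimal total cost
  \inf_{T \,:\, T_{\#}\mu = \nu} \int c\big(x, T(x)\big)\, \mathrm{d}\mu(x)

  % Kantorovich's relaxation: optimise over couplings gamma of mu and nu instead of maps
  \inf_{\gamma \in \Pi(\mu,\nu)} \int c(x,y)\, \mathrm{d}\gamma(x,y),
  \qquad \Pi(\mu,\nu) = \{ \gamma : \gamma \text{ has marginals } \mu \text{ and } \nu \}.

These formulations, and the precise sense in which the second relaxes the first, will be studied in detail during the project.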
Prerequisites

Analysis III/IV and/or PDE III/IV would be essential. If not taken previously, it is recommended to take at least one of these modules in parallel with the project.

Main texts

- F. Santambrogio, Optimal transport for applied mathematicians. Calculus of variations, PDEs, and modeling. Progress in Nonlinear Differential Equations and their Applications, 87. Birkhauser/Springer, Cham, 2015. (A preliminary version of the manuscript is available on the author's webpage.)
- L. Ambrosio, N. Gigli, A user's guide to optimal transport, lecture notes, available online.

A selection of recent research papers

- S. Mei, A. Montanari, P.-M. Nguyen, A mean field view of the landscape of two-layer neural networks, PNAS, (2018).
- X. Fernández-Real, A. Figalli, The continuous formulation of shallow neural networks as Wasserstein-type gradient flows, preprint, (2020).
- A. Javanmard, M. Mondelli, A. Montanari, Machine learning from a continuous viewpoint, arXiv preprint https://arxiv.org/abs/1912.12777, (2019).
- P.-M. Nguyen, H.-T. Pham, A rigorous framework for the mean field limit of multilayer neural networks, arXiv preprint https://arxiv.org/abs/2001.11443, (2020).
- L. Chizat, F. Bach, On the global convergence of gradient descent for overparameterized models using optimal transport, Advances in Neural Information Processing Systems (NeurIPS), (2018).
- W. E, J. Han, Q. Li, A mean-field optimal control formulation of deep learning, Res. Math. Sci. 6, (2019), 10.
- W. E, C. Ma, L. Wu, Machine learning from a continuous viewpoint, arXiv preprint https://arxiv.org/abs/1912.12777, (2019).
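To give a flavour of what an implementation could look like, here is a minimal sketch (purely illustrative and not part of the official material; the point positions, weights and quadratic cost are arbitrary toy choices) that solves a small discrete Kantorovich problem as a linear program with scipy:

import numpy as np
from scipy.optimize import linprog

def discrete_ot(a, b, C):
    """Solve min <C, P> over couplings P with marginals a and b, as a linear program."""
    n, m = C.shape
    # Row-sum constraints: sum_j P[i, j] = a[i]
    A_rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column-sum constraints: sum_i P[i, j] = b[j]
    A_cols = np.kron(np.ones((1, n)), np.eye(m))
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([a, b])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.x.reshape(n, m), res.fun

# Toy example: two uniform measures on three points each, quadratic cost |x - y|^2.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5, 2.5])
a = np.ones(3) / 3
b = np.ones(3) / 3
C = (x[:, None] - y[None, :]) ** 2
plan, cost = discrete_ot(a, b, C)
print("optimal plan:\n", plan)
print("optimal cost:", cost)

For larger problems one would typically turn to more scalable approaches such as entropic regularisation, but the linear-programming formulation above is the most direct translation of Kantorovich's relaxed problem.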
email: Alpár R. Mészáros