Project IV (MATH4072) 2018-19


Nonlinear approaches to dimension reduction

Dr J. Einbeck and Dr L. Aslett

Description

High-dimensional data are ubiquitous nowadays, and the problem of finding structures or lower-dimensional approximations of the original data has become an indispensable activity in many branches of industry, business and science. Data analysts who carry out such tasks can make use of a well-equipped toolbox of methods, ranging from relatively simple `linear approximation' techniques such as Principal Component Analysis (PCA) to elaborate concepts in machine learning such as Artificial Neural Networks. In the middle ground between these two extremes, and actually intersecting with both of them, one finds a group of nonlinear dimension reduction techniques, which try to overcome the obvious limitation of PCA (which can only identify `best linear approximations') by adequately localizing or transforming the usual PCA machinery. In a nutshell, transformation-based approaches apply nonlinear transformations to the data before they enter the covariance matrix, while localized approaches employ localized covariance matrices but then need additional algorithmic steps to connect the individual localized PCA segments. For instance, the figure below shows a 2D `principal manifold' obtained from a 3D data set taken from a marine science application.
[Figure: 2D principal manifold]
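The transformation-based idea described above can be sketched in a few lines: transform the data nonlinearly (here implicitly, via an RBF kernel) before the eigendecomposition that plain PCA would apply to the covariance matrix. This is a minimal illustrative sketch of kernel PCA on a toy data set, not a method prescribed by the project; the kernel choice and the `gamma` parameter are assumptions for the example.

```python
# Kernel PCA sketch: a nonlinear transformation of the data enters the
# (kernel) matrix before the eigendecomposition, in contrast to plain PCA,
# which eigendecomposes the ordinary covariance matrix.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: points on a noisy circle -- no good 1D *linear* approximation.
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((200, 2))

def kernel_pca(X, n_components=2, gamma=2.0):
    """Kernel PCA with an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    # Pairwise squared distances, then the kernel (Gram) matrix.
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    # Centre the kernel matrix in feature space.
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Eigendecomposition; the scores come from the leading eigenvectors.
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

Z = kernel_pca(X)
print(Z.shape)  # one row of nonlinear 'scores' per observation
```

Replacing the RBF kernel by the linear kernel k(x, y) = x'y would recover (up to scaling) ordinary PCA scores, which makes the role of the nonlinear transformation explicit.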

This image was constructed using a localized approach to PCA. Localized approaches are somewhat more flexible than transformation-based approaches, but the latter lend themselves more readily to mathematical analysis and are better suited for higher dimensions. In this project, you will investigate some nonlinear dimension reduction techniques according to your interests, understand their properties, and apply them to suitable applications which may, for instance, come from physics, ecology, or engineering.
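The localized idea can likewise be sketched: partition the data (here with a simple k-means step, an assumption for the example) and run PCA separately on each cluster's covariance matrix, giving piecewise-linear local approximations. Connecting these segments into a principal curve or manifold requires the additional algorithmic steps mentioned above, which this sketch deliberately omits.

```python
# Localized PCA sketch: k-means to localize, then the first principal
# direction of each cluster's local covariance matrix.
import numpy as np

rng = np.random.default_rng(1)

# Toy data along a sine curve in 2D.
t = rng.uniform(0, 2 * np.pi, 300)
X = np.c_[t, np.sin(t)] + 0.1 * rng.standard_normal((300, 2))

def kmeans(X, k, iters=50):
    """Plain Lloyd's algorithm; keeps a centre unchanged if its cluster empties."""
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        centres = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centres[j] for j in range(k)])
    return centres, labels

k = 6
centres, labels = kmeans(X, k)

# Local PCA: leading eigenvector of each non-empty cluster's covariance.
directions = []
for j in range(k):
    if not np.any(labels == j):
        continue
    cov = np.cov(X[labels == j], rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    directions.append(vecs[:, -1])  # eigenvector of the largest eigenvalue
directions = np.array(directions)
print(directions.shape)  # one local principal direction per cluster
```

Each local direction is the best linear approximation within its cluster; the flexibility of the overall fit comes from how finely the data are localized.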

Co/Prerequisites

  • Statistical Methods III
  • Topics in Statistics III/IV is useful but not necessary

Resources

  1. Gorban, A., Kégl, B., Wunsch, D. and Zinovyev, A. (2008). Principal Manifolds for Data Visualization and Dimension Reduction, Springer.
  2. Bishop, C. (2006). Pattern Recognition and Machine Learning, Springer.
  3. Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning, Springer.
  4. Kung, S.Y. (2014). Kernel Methods and Machine Learning, Cambridge University Press.

email: jochen.einbeck "at" durham.ac.uk