Project IV: Face Recognition


Topic description

While the human brain is able to detect and even recognise faces within fractions of a second, teaching a computer to do the same thing is an altogether different matter.

A wide variety of techniques and mathematical ideas for face recognition has been developed over the years. Some of these have meanwhile found application in consumer electronics (see for instance the face detectors found in many digital cameras). However, many problems remain, and a clear "best" method has not yet surfaced.

In this project you will study one or more of the mathematical techniques underlying face recognition algorithms, and get hands-on experience with them.


[pdf] Project IV: Face Recognition
Startup notes for 2011/2012

Meeting schedule

All meetings take place in my office CM311 unless otherwise indicated.
October 14, 12:00-13:00: Introduction

Computing tools

Python Scientific Tutorial.


Papers in boldface are recommended reading at this point. Those in boldface red are the basic ones to read for PCA approaches to the problem. However, all of them are worth reading, both to get an idea of the different techniques out there, and to learn in more detail about a particular approach.

Survey and general papers

Face Recognition: Features vs. Templates. Early paper that compares approaches that use a number of 'features' inferred from the image (typically intended to be properties of the face in the image and therefore invariant to certain changes in the environment) and those that just use the image values themselves.

Why is Real-World Visual Object Recognition Hard? A general paper discussing the inadequacy of much experimental evaluation in computer vision.

Face recognition: a literature survey. The title says it all. 67 pages from 1993, but very useful.

Face recognition across pose: a review. Reviews those methods that explicitly try to deal with the nuisance variable 'pose', i.e. the position and orientation of the face with respect to the camera.

PCA, Eigenfaces

These methods mostly work by characterizing the parts of image space that correspond to faces via low-dimensional (affine) subspaces of image space.

Eigenfaces for recognition (shorter conference version). Classic paper, one of the first to treat the problem at the image level. Tries to describe 'face' images by an adapted and reduced set of values, projections onto principal components.
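
The projection idea can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the procedure from the paper: the function names (`fit_eigenfaces`, `project`, `nearest_identity`) are made up for this example, the "faces" are synthetic random vectors standing in for flattened images, and matching is done by plain nearest neighbour in the reduced coordinates.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """images: (n_samples, n_pixels) array of flattened face images."""
    mean_face = images.mean(axis=0)
    centered = images - mean_face
    # Rows of vt are the principal directions ("eigenfaces"),
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(images, mean_face, eigenfaces):
    """Coordinates of each image in the eigenface basis."""
    return (images - mean_face) @ eigenfaces.T

def nearest_identity(query, gallery_coords, gallery_labels, mean_face, eigenfaces):
    """Label of the gallery image whose projection is closest to the query's."""
    q = project(query[None, :], mean_face, eigenfaces)[0]
    distances = np.linalg.norm(gallery_coords - q, axis=1)
    return gallery_labels[int(np.argmin(distances))]

# Synthetic stand-in for a face database (an assumption of this sketch):
# 3 "people", 5 noisy 64-pixel images each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
labels = np.repeat(np.arange(3), 5)
gallery = prototypes[labels] + 0.05 * rng.normal(size=(15, 64))

mean_face, eigenfaces = fit_eigenfaces(gallery, n_components=2)
coords = project(gallery, mean_face, eigenfaces)

# A new noisy image of person 1, matched in the 2-dimensional eigenface space.
query = prototypes[1] + 0.05 * rng.normal(size=64)
predicted = nearest_identity(query, coords, labels, mean_face, eigenfaces)
print(predicted)
```

With real data, each row of `images` would be a grey-scale face image flattened to a vector, and the number of components would be chosen by looking at how quickly the singular values decay.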

Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. Aims at discrimination rather than characterizing faces as a whole. Uses Fisher's linear discriminant (a classical statistical projection tool, which looks at within-class and between-class variance rather than overall variance alone), as well as results on the effects of illumination changes.
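
To see why discrimination and overall variance can point in different directions, here is a small two-class sketch of Fisher's linear discriminant. It is a toy illustration under stated assumptions, not the Fisherfaces method itself: the data are synthetic 2d points, the function name `fisher_direction` is invented here, and only two classes are handled (the multi-class case leads to a generalized eigenvalue problem).

```python
import numpy as np

def fisher_direction(x0, x1):
    """Two-class Fisher discriminant: w proportional to Sw^{-1} (m1 - m0)."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter: spread of each class around its own mean.
    sw = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    w = np.linalg.solve(sw, m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
# Both classes vary a lot along axis 0 (think: illumination),
# but are only separated along axis 1 (think: identity).
x0 = rng.normal(size=(200, 2)) * [5.0, 0.3] + [0.0, -1.0]
x1 = rng.normal(size=(200, 2)) * [5.0, 0.3] + [0.0, 1.0]

w = fisher_direction(x0, x1)

# For comparison: the first principal component of the pooled data.
pooled = np.vstack([x0, x1])
_, _, vt = np.linalg.svd(pooled - pooled.mean(axis=0), full_matrices=False)
pc1 = vt[0]

print("Fisher direction:", w)
print("First principal component:", pc1)
```

In this toy set-up the first principal component follows the high-variance axis 0, which carries no class information, while the Fisher direction lines up with axis 1, where the classes actually differ.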

Eigenspace-Based Face Recognition: A Comparative Study of Different Approaches. It's in the title. Contains a nice summary of the way these methods work.

Beyond Eigenfaces: Probabilistic Matching for Face Recognition. Describes itself as 'Bayesian', and it is in a way, but it is in this section because it also bears a close relation to eigenface methods through its use of Gaussian distributions (which implicitly lie behind any use of PCA).

Face Recognition Using Optimal Linear Components of Range Images. This could equally go under the 3d data heading, since it uses range images. However, it treats them as 2d images and performs a dimension reduction procedure related to PCA, but somewhat more sophisticated, which is why it is here. It uses simulated annealing to find the 'optimal' (in a sense defined in the paper) subspace.

Linear Regression for Face Recognition. This paper applies linear regression techniques to try to identify the linear subspace containing the face images of a given person.
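
One way to read this idea is as follows: stack the training images of each person as the columns of a matrix, fit the query image by least squares against each person's matrix in turn, and pick the person with the smallest residual. The sketch below follows that reading; it is not taken from the paper, the name `classify_by_regression` is hypothetical, and the data are synthetic vectors drawn from random 3-dimensional subspaces.

```python
import numpy as np

def classify_by_regression(query, class_images):
    """class_images maps label -> (n_pixels, n_train) matrix of training images.

    Returns the label whose column space reconstructs the query best,
    together with all residual norms.
    """
    residuals = {}
    for label, basis in class_images.items():
        coeffs, *_ = np.linalg.lstsq(basis, query, rcond=None)
        residuals[label] = float(np.linalg.norm(query - basis @ coeffs))
    return min(residuals, key=residuals.get), residuals

rng = np.random.default_rng(0)
# Assumption of this sketch: each person's images lie in a 3d subspace
# of a 50-pixel image space.
subspaces = {name: rng.normal(size=(50, 3)) for name in "abc"}
train = {name: s @ rng.normal(size=(3, 4)) for name, s in subspaces.items()}

# A new, slightly noisy image of person "b".
query = subspaces["b"] @ rng.normal(size=3) + 0.01 * rng.normal(size=50)

predicted, residuals = classify_by_regression(query, train)
print(predicted, residuals)
```

The residual for the correct person is of the order of the noise, while the residuals for the others stay large, because a random vector from one subspace is poorly approximated by an unrelated low-dimensional subspace.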


Boosting

In the words of Wikipedia: "Boosting is a machine learning meta-algorithm for performing supervised learning. Boosting is based on the question posed by Kearns: can a set of weak learners create a single strong learner?"

Robust Real-Time Face Detection. A (justifiably?) very famous paper. Introduced a number of innovations, including the use of the boosting algorithm AdaBoost, and an unrelenting focus on speed.

A decision-theoretic generalization of online learning and an application to boosting. The original AdaBoost paper. Not specific to faces.
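
To make the "weak learners into a strong learner" idea concrete, here is a minimal sketch of discrete AdaBoost with threshold classifiers ("decision stumps") as weak learners. It is a toy version under stated assumptions: the helper names (`best_stump`, `stump_predict`, `adaboost`, `predict`) are invented for this example, the stump search is brute force, and the data are a 1d problem in which the positive class is an interval, so no single threshold can separate the classes.

```python
import numpy as np

def stump_predict(x, feature, threshold, sign):
    """Predict `sign` where x[:, feature] <= threshold, and -sign elsewhere."""
    return np.where(x[:, feature] <= threshold, sign, -sign)

def best_stump(x, y, weights):
    """Brute-force search for the stump with the smallest weighted error."""
    best = None
    for feature in range(x.shape[1]):
        for threshold in np.unique(x[:, feature]):
            for sign in (1, -1):
                pred = stump_predict(x, feature, threshold, sign)
                err = weights[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost(x, y, n_rounds):
    """y must take values in {-1, +1}. Returns a list of weighted stumps."""
    weights = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        err, feature, threshold, sign = best_stump(x, y, weights)
        err = np.clip(err, 1e-12, 1 - 1e-12)   # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # vote of this weak learner
        pred = stump_predict(x, feature, threshold, sign)
        # Increase the weight of misclassified points, decrease the others.
        weights = weights * np.exp(-alpha * y * pred)
        weights = weights / weights.sum()
        ensemble.append((alpha, feature, threshold, sign))
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, f, t, s) for a, f, t, s in ensemble)
    return np.sign(score)

# Toy data: 20 points on [0, 1], positive inside the interval (0.3, 0.7).
x = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.where((x[:, 0] > 0.3) & (x[:, 0] < 0.7), 1, -1)

single_accuracy = np.mean(predict(adaboost(x, y, n_rounds=1), x) == y)
boosted_accuracy = np.mean(predict(adaboost(x, y, n_rounds=3), x) == y)
print(single_accuracy, boosted_accuracy)
```

On this toy problem a single stump necessarily misclassifies part of the data, whereas the weighted vote of three stumps classifies the training set correctly.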

Boosting algorithms as gradient descent (longer version). Introduces AnyBoost, and shows just how simple boosting algorithms really are behind the curtain. Neat paper.

Neural networks

Neural networks are a famous way of representing and learning functions.

Neural Network-Based Face Detection. Straightforward paper that compares various ways to implement neural network methods.

3d data and geometry

Faces can be treated as surfaces in 3d. Even with optical image data (a 'photo'), one can base a classifier on an underlying 3d model, as this may be a convenient way of describing the relevant parts of 'image space'. With data that gives explicit 3d information (for example, range data or a stereo pair of images), a 3d model becomes nearly essential. The papers in this section use this kind of data, and analyse it with tools from differential geometry and statistical shape analysis.

Automatic 3D Face Recognition Using Shapes of Facial Curves. Represents faces by collections of curves in 3d, and then compares them using geodesics on 'shape space', a space of curves modulo similarities and diffeomorphisms.

Three-Dimensional Face Recognition. An approach to expression-invariant face recognition that models changes of expression by isometries of the facial surface. These can be removed by a sophisticated 'normalization', creating a canonical form for each face. Uses disparity maps obtained from stereoscopy.

Expression-invariant representations of faces. Similar to the last one, except it deals with open mouths, i.e. topology changes.


Bayesian methods

In principle, saying something is Bayesian is really just saying it is rational, so that all techniques can be expressed in a Bayesian language. The adjective is sometimes used, however, to describe approaches that focus on modelling the data, as opposed to trying to construct a classifier directly. Decision theory is then used to generate a classifier from the probabilistic models. Some of the papers below fall into this category, whereas others focus on the more traditional approach of constructing a classifier, but informed by a Bayesian point of view (to a greater or lesser extent).

A unified Bayesian framework for face recognition. This could perhaps also be classified under PCA, since it uses PCA as an integral part of the method. It constructs models for the data in the PCA subspace.

Bayesian shape model for facial feature extraction and recognition. Constructs a 'feature'-based model for the face, involving eyes, nose, mouth, etc., and learns prior models for the likely configurations.

An Image-based Bayesian Framework for Face Detection. This paper constructs probabilistic models for face images expressed in terms of their wavelet coefficients. Dependencies between coefficients are modelled using a hidden Markov model.

Bayesian Face Recognition with Deformable Image Models. Treats a face image (or any image) as a surface in 'x-y-I' space, i.e. the graph of the image function, and then builds models of deformations of this surface corresponding to intra- and inter-personal variations.

Active Testing for Face Detection and Localization. This paper treats a different question to the rest. Given any face detector/recognizer, how does one apply it efficiently? Most methods use exhaustive search: they look everywhere in the search space (which may include scale, rotation, and other parameters in addition to position). This paper uses a principled Bayesian technique called 'active testing' to narrow down the search in a way reminiscent of the parlour game 'Twenty Questions'. The method is based on probability theory and entropy, and is related to experimental design.

Assorted recent papers

These recent papers do not fit any of the above categories easily, or we cannot work out in which category they really belong, perhaps because they belong in several. This is normal: the categories above are artificial anyway.

Features versus Context: An Approach for Precise and Detailed Detection and Delineation of Faces and Facial Features. This paper uses a 'feature-based' approach, but sometimes addresses those features in an 'image-based' way. The problem is to detect, delineate, and identify parts of faces, in images and video.

Useful links

CVOnline, for various notes on computer vision.