Project IV: Face Recognition
Topic description
While the human brain is able to detect and even recognise faces
within fractions of a second, teaching a computer to do the same
thing is an altogether different matter.
A wide variety of techniques and mathematical ideas for face
recognition has been developed over the years. Some of these have
since found application in consumer electronics (for instance,
the face detectors found in many digital cameras). However,
many problems remain, and a clear "best" method has not yet
surfaced.
In this project you will study one or more of the mathematical
techniques underlying face recognition algorithms, and
get hands-on experience with them.
Notes
Project IV: Face Recognition
Startup notes for 2011/2012
Meeting schedule
All meetings take place in my office
(CM311) unless otherwise indicated.
Computing tools
Python
Scientific Tutorial.
Bibliography
Papers in boldface are recommended reading
at this point. Those in boldface red
are the basic ones to read for PCA approaches to the problem.
However, all of them are worth reading, both to get an idea of the
different techniques out there, and to learn in more detail about a
particular approach.
Survey and general papers
Face
Recognition: Features vs. Templates.
Early paper that compares approaches that use a number of 'features'
inferred from the image (typically intended to be properties of the
face in the image and therefore invariant to certain changes in the
environment) and those that just use the image values
themselves.
Why
is Real-World Visual Object Recognition Hard?
A general paper discussing the inadequacy of much experimental
evaluation in computer vision.
Face
recognition: a literature survey.
The title says it all. 67 pages from 2003, but very useful.
Face
recognition across pose: a review. Reviews those methods
that explicitly try to deal with the nuisance variable 'pose',
i.e. the position and orientation of the face with respect
to the camera.
PCA, Eigenfaces
These methods mostly work by characterizing the
parts of image space that correspond to faces via low-dimensional
(affine) subspaces of image space.
Eigenfaces for recognition
(shorter
conference version). Classic paper, one of the first to treat the
problem at the image level. Tries to describe 'face' images by an
adapted and reduced set of values, projections onto principal
components.
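As a rough illustration of what 'projections onto principal components' means in practice, here is a minimal NumPy sketch. It is plain PCA via the SVD, not the procedure from the paper. The function names (fit_eigenfaces, project, reconstruct) and the synthetic data are made up for illustration, and it assumes each image has already been flattened into a row vector.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """images: (n_samples, n_pixels) array, one flattened image per row."""
    mean = images.mean(axis=0)
    centred = images - mean
    # Rows of vt are orthonormal principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:n_components]

def project(images, mean, components):
    """Coordinates of each image in the affine subspace mean + span(components)."""
    return (images - mean) @ components.T

def reconstruct(coeffs, mean, components):
    """Map subspace coordinates back to image space."""
    return mean + coeffs @ components

# Synthetic stand-in for face data (an assumption, not real images):
# 20 'images' of 64 pixels that lie exactly in a 3d affine subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 64))
weights = rng.normal(size=(20, 3))
images = 5.0 + weights @ basis

mean, components = fit_eigenfaces(images, n_components=3)
coeffs = project(images, mean, components)
```

Each image is now described by 3 numbers instead of 64. Because the toy data really does lie in a 3d affine subspace, reconstruct(coeffs, mean, components) recovers it up to rounding; with real faces the reconstruction is only approximate.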
Eigenfaces vs. Fisherfaces: Recognition using
class specific linear projection. Aims at
discrimination rather than characterizing faces as a whole. Uses
Fisher's linear discriminant (a classical statistical projection
tool, which looks at within and between class variance as well as
overall variance), as well as results on the effects of illumination
changes.
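To see why discrimination and characterization can pull in different directions, here is a small two-class sketch of Fisher's linear discriminant next to the first principal direction. The function names and the toy 2d data are assumptions for illustration only; the paper works with many classes and with images, where the within-class scatter is typically singular because there are fewer images than pixels.

```python
import numpy as np

def fisher_direction(x0, x1):
    """Two-class Fisher discriminant: w proportional to Sw^{-1} (m1 - m0)."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter: spread of each class about its own mean.
    sw = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    w = np.linalg.solve(sw, m1 - m0)
    return w / np.linalg.norm(w)

def first_principal_direction(x):
    """Direction of largest overall variance (ignores class labels)."""
    centred = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[0]

# Toy data: large 'nuisance' variance along axis 0,
# while the classes differ only along axis 1.
rng = np.random.default_rng(0)
scale = np.array([10.0, 1.0])
x0 = rng.normal(size=(200, 2)) * scale
x1 = rng.normal(size=(200, 2)) * scale + np.array([0.0, 4.0])

w = fisher_direction(x0, x1)
pc = first_principal_direction(np.vstack([x0, x1]))
```

Here pc points almost along axis 0 (where the variance is) while w points almost along axis 1 (where the classes differ).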
Eigenspace-Based
Face Recognition: A Comparative Study of Different Approaches.
It's in the title. Contains a nice summary of the way these methods
work.
Beyond
Eigenfaces: Probabilistic Matching for Face Recognition.
Describes itself as 'Bayesian', and it is in a way, but it is in this
section because it also bears a close relation to eigenface methods
through its use of Gaussian distributions (which implicitly lie
behind any use of PCA).
Face
Recognition Using Optimal Linear Components of Range Images. This
could equally go under the 3d data heading, since it uses range
images. However, it treats them as 2d images and performs a dimension
reduction procedure related to PCA, but somewhat more sophisticated,
which is why it is here. It uses simulated annealing to find the
'optimal' (in a sense defined in the paper) subspace.
Linear
Regression for Face Recognition. This paper applies linear
regression techniques to try to identify the linear subspace
containing the face images of a given person.
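One natural reading of this idea, sketched below, is to regress a probe image on each person's training images and pick the person whose subspace leaves the smallest residual. This is a sketch under that assumption, not necessarily the paper's exact algorithm; the names residual_to_subspace, classify and gallery are hypothetical, and the data is random rather than real faces.

```python
import numpy as np

def residual_to_subspace(train, probe):
    """train: (n_pixels, n_images), one training image per column.
    Returns the distance from probe to the span of those columns."""
    coeffs, *_ = np.linalg.lstsq(train, probe, rcond=None)
    return np.linalg.norm(probe - train @ coeffs)

def classify(gallery, probe):
    """gallery: dict mapping a person's label to their training matrix."""
    residuals = {label: residual_to_subspace(train, probe)
                 for label, train in gallery.items()}
    return min(residuals, key=residuals.get)

# Toy gallery: two 'people', four 50-pixel images each (random stand-ins).
rng = np.random.default_rng(0)
gallery = {"a": rng.normal(size=(50, 4)), "b": rng.normal(size=(50, 4))}

# A probe that is a noisy linear combination of person b's images.
probe = gallery["b"] @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.01 * rng.normal(size=50)
label = classify(gallery, probe)
```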
Boosting
In the words of Wikipedia: "Boosting is a
machine learning meta-algorithm for performing supervised learning.
Boosting is based on the question posed by Kearns: can a set of weak
learners create a single strong learner?"
Robust
Real-Time Face Detection. A
(justifiably?) very famous paper. Introduced a number of innovations,
including the use of the boosting algorithm AdaBoost, and an
unrelenting focus on speed.
A decision-theoretic
generalization of online learning and an application to
boosting. The original AdaBoost paper. Not specific to
faces.
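To make the 'weak learners into a strong learner' question concrete, here is a minimal sketch of discrete AdaBoost with threshold 'stumps' on 1d toy data. The helper names and the data are made up for illustration, and this is a textbook version rather than code from the paper. No single stump classifies the data correctly, but a weighted vote of three does.

```python
import math

def stump_predict(stump, x):
    """A stump says `polarity` below the threshold and `-polarity` above."""
    threshold, polarity = stump
    return polarity if x < threshold else -polarity

def best_stump(xs, ys, weights):
    """Weak learner: the stump with the smallest weighted error."""
    # Thresholds at x - 0.5 assume integer-valued toy data.
    candidates = [x - 0.5 for x in sorted(set(xs))] + [max(xs) + 0.5]
    best, best_err = None, float("inf")
    for threshold in candidates:
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the weighted vote of the stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```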
Boosting algorithms as
gradient descent
(longer
version). Introduces AnyBoost, and shows just how simple
boosting algorithms really are behind the curtain. Neat
paper.
Neural networks
Neural networks are a famous way of representing
and learning functions.
Neural
Network-Based Face Detection.
Straightforward paper that compares various ways to implement neural
network methods.
3d data and geometry
Faces can be treated as surfaces in 3d. Even with
optical image data (a 'photo'), one can base a classifier on an
underlying 3d model, as this may be a convenient way of describing the
relevant parts of 'image space'. With data that gives explicit 3d
information (for example, range data or a stereo pair of images), a
3d model becomes nearly essential. The papers in this section use
this kind of data, and analyse it with tools from differential
geometry and statistical shape analysis.
Automatic
3D Face Recognition Using Shapes of Facial Curves. Represents
faces by collections of curves in 3d, and then compares them using
geodesics on 'shape space', a space of curves modulo similarities and
diffeomorphisms.
Three-Dimensional
Face Recognition. An approach to expression-invariant face
recognition that models changes of expression by isometries of the
facial surface. These can be removed by a sophisticated
'normalization', creating a canonical form for each face. Uses
disparity maps obtained from stereoscopy.
Expression-invariant
representations of faces. Similar to the last one, except it
deals with open mouths, i.e. topology changes.
Bayesian
In principle, saying something is Bayesian is
really just saying it is rational, so that all techniques can be
expressed in a Bayesian language. The adjective is sometimes used,
however, to describe approaches that focus on modelling the data, as
opposed to trying to construct a classifier directly. Decision theory
is then used to generate a classifier from the probabilistic models.
Some of the papers below fall into this category, whereas others
focus on the more traditional approach of constructing a classifier,
but informed by a Bayesian point of view (to a greater or lesser
extent).
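The 'model the data, then let decision theory produce the classifier' route can be sketched in a few lines. Below, each class gets a 1d Gaussian model fitted to some hypothetical feature values, Bayes' rule gives the posterior, and the decision is the class with the largest posterior (which minimises the expected number of errors under 0-1 loss). The feature values, the priors and all function names are assumptions for illustration.

```python
import math

def fit_gaussian(samples):
    """Maximum-likelihood mean and variance of a 1d sample."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior(x, models, priors):
    """Bayes' rule: P(class | x) is proportional to P(x | class) P(class)."""
    joint = {c: priors[c] * gaussian_pdf(x, *models[c]) for c in models}
    evidence = sum(joint.values())
    return {c: p / evidence for c, p in joint.items()}

def decide(x, models, priors):
    """Decision rule for 0-1 loss: pick the most probable class."""
    post = posterior(x, models, priors)
    return max(post, key=post.get)

# Hypothetical 1d feature values for each class.
training = {
    "face": [2.1, 1.9, 2.4, 1.6],
    "nonface": [-0.2, 0.3, 0.1, -0.4],
}
models = {c: fit_gaussian(samples) for c, samples in training.items()}
priors = {"face": 0.1, "nonface": 0.9}  # assumed: most windows are not faces

decision_high = decide(2.0, models, priors)
decision_low = decide(0.0, models, priors)
```

Note that the classifier is never fitted directly: change the loss or the priors and the decision rule changes, while the models stay the same.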
A
unified Bayesian framework for face recognition. This could
perhaps also be classified under PCA, since it uses PCA as an integral
part of the method. It constructs models for the data in the PCA
subspace.
Bayesian
shape model for facial feature extraction and recognition. Constructs
a 'feature'-based model for the face, involving eyes, nose, mouth,
etc., and learns prior models for the likely configurations.
An
Image-based Bayesian Framework for Face Detection. This paper
constructs probabilistic models for face images expressed in terms of
their wavelet coefficients. Dependencies between coefficients are
modelled using a hidden Markov model.
Bayesian
Face Recognition with Deformable Image Models. Treats a face
image (or any image) as a surface in 'x-y-I' space,
i.e. the graph of the image function, and then builds
models of deformations of this surface corresponding to intra-
and inter-personal variations.
Active
Testing for Face Detection and Localization. This paper treats a
different question to the rest. Given any face detector/recognizer,
how does one apply it efficiently? Most methods use exhaustive
search: they look everywhere in the search space (which may include
scale, rotation, and other parameters in addition to position). This
paper uses a principled Bayesian technique called 'active testing' to
narrow down the search in a way reminiscent of the parlour game
'Twenty Questions'. The method is based on probability theory and
entropy, and is related to experimental design.
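The 'Twenty Questions' flavour can be illustrated with a toy calculation: given a prior over where the face is, choose the yes/no question whose answer is expected to leave the least uncertainty (entropy). This is a noiseless simplification for illustration only, not the algorithm from the paper; the prior, the candidate questions and the function names are all made up.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def expected_posterior_entropy(prior, subset):
    """Expected entropy after asking 'is the face in `subset`?',
    assuming the answer is always correct."""
    inside = [prior[i] for i in range(len(prior)) if i in subset]
    outside = [prior[i] for i in range(len(prior)) if i not in subset]
    p_yes, p_no = sum(inside), sum(outside)
    h = 0.0
    if p_yes > 0:
        h += p_yes * entropy([p / p_yes for p in inside])
    if p_no > 0:
        h += p_no * entropy([p / p_no for p in outside])
    return h

def best_question(prior, questions):
    """The question that minimises expected remaining uncertainty."""
    return min(questions, key=lambda q: expected_posterior_entropy(prior, q))

# Prior over four candidate locations, and three single-location questions.
prior = [0.5, 0.25, 0.125, 0.125]
questions = [{0}, {1}, {2}]

prior_entropy = entropy(prior)
choice = best_question(prior, questions)
remaining = expected_posterior_entropy(prior, choice)
```

With this prior, asking about location 0 is best: it splits the probability mass in half, so the answer is worth a full bit.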
Assorted recent papers
These recent papers do not fit any of the above categories easily, or we
cannot work out in which category they really belong, perhaps
because they belong in several. This is normal: the categories
above are artificial anyway.
Features
versus Context: An Approach for Precise and Detailed Detection and
Delineation of Faces and Facial Features. This paper uses a
'feature-based' approach, but sometimes addresses those features in
an 'image-based' way. The problem is to detect, delineate, and
identify parts of faces, in images and video.
Useful links
CVOnline,
for various notes on computer vision.