Project III (MATH3382) 2019-20


Clustering methods for Image Segmentation

Dr J. Einbeck

Description

Finding clusters in data sets is an old statistical problem. One is given a multivariate data cloud, and one is interested in identifying subgroups, regions, or clusters such that data points within a cluster are more similar to each other than to points in other clusters. Sometimes this cluster structure is directly visible from the data, but often this will not be the case. In either case, one needs appropriate statistical techniques to identify these clusters. A cluster is typically defined by (i) a cluster centre and (ii) an allocation of neighbouring data points to this cluster centre. The number of clusters may be known or unknown, and inferring the number of clusters from the data poses additional challenges. Clustering is an unsupervised statistical learning technique, which is not to be confused with classification, where the groups are already known and the question is only how to allocate new data points to them.
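To make the "centre plus allocation" definition concrete, the following is a minimal sketch of k-means (Lloyd's algorithm), which is one common clustering technique and not necessarily the one used in this project. It assumes that the number of clusters k is known and that similarity is measured by squared Euclidean distance; the function names and the toy data are illustrative only.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def allocate(points, centres):
    """Step (ii): for each point, the index of its nearest cluster centre."""
    return [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points]


def update_centres(points, labels, centres):
    """Step (i): move each centre to the mean of the points allocated to it."""
    new_centres = []
    for j, old in enumerate(centres):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centres.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_centres.append(old)  # an empty cluster keeps its old centre
    return new_centres


def kmeans(points, k, n_iter=100, seed=0):
    """Alternate allocation and centre updates until the labels stop changing."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: start from k random data points
    labels = allocate(points, centres)
    for _ in range(n_iter):
        centres = update_centres(points, labels, centres)
        new_labels = allocate(points, centres)
        if new_labels == labels:
            break
        labels = new_labels
    return centres, labels


# Toy data (made up for illustration): two well-separated groups in 2D.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centres, labels = kmeans(points, k=2)
print(centres, labels)
```

Note that k has to be supplied here; choosing it from the data is the additional challenge mentioned above and is not addressed by this sketch.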

One very important application of clustering techniques is image segmentation, which is in turn relevant for problems such as number plate recognition or autonomous driving. In image segmentation, clustering is applied to the 'feature space' of an image, consisting, for instance, of the 3D RGB colour values or the greyscale values, plus (possibly) the spatial coordinates of the pixels. For instance, the sequence of images to the right shows, from top to bottom, the Cliffs of Moher, a snapshot during the clustering process, the finished clustering, and the resulting segmented image, where each original pixel is given the colour shade of the cluster to which it has been allocated. This is just the most elementary way of approaching the problem; many more sophisticated variants exist. Within the scope of the project, you may experiment with a variety of techniques, from simple manual implementations of the type above to automated machine learning tools.
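The feature-space construction and the final recolouring step described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the Cliffs of Moher images: the image is represented as nested lists of (r, g, b) tuples, the colour centres are hard-coded hypothetical values standing in for the output of some clustering method, and the names `pixel_features` and `segment` are made up for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def pixel_features(image, use_xy=False):
    """Flatten an image into feature vectors, one per pixel.

    Each vector holds the RGB values, optionally followed by the
    pixel's (x, y) position if spatial coordinates are to be used.
    """
    features = []
    for y, row in enumerate(image):
        for x, rgb in enumerate(row):
            features.append(tuple(rgb) + ((x, y) if use_xy else ()))
    return features


def segment(image, colour_centres):
    """Give every pixel the colour of its nearest cluster centre."""
    return [[min(colour_centres, key=lambda c: sq_dist(rgb, c)) for rgb in row]
            for row in image]


# A tiny made-up 2 x 3 "image": bluish sky pixels and greenish grass pixels.
image = [
    [(200, 220, 255), (190, 215, 250), (60, 120, 40)],
    [(195, 210, 245), (70, 110, 50), (65, 125, 45)],
]

# Hypothetical colour centres, as a clustering of the RGB values might return.
centres = [(195, 215, 250), (65, 118, 45)]

features = pixel_features(image, use_xy=True)
segmented = segment(image, centres)
print(segmented)
```

In practice the feature vectors returned by `pixel_features` would be passed to a clustering routine to obtain the centres; if spatial coordinates are included, they usually need rescaling so that they do not dominate or vanish against the colour values.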

The goal of the project is hence twofold: firstly, to gain a solid understanding of one or more clustering techniques, and secondly, to acquire the ability to apply these techniques to the segmentation of images (or to related problems such as texture classification). You can, of course, use images of your choice for this purpose (bearing copyright restrictions in mind).

Prerequisites

  • Statistical Concepts II

Resources

email: jochen.einbeck "at" durham.ac.uk