SVMs are well suited to image recognition because images live in a high dimensional space: one dimension per pixel. We will attempt to classify a dataset of faces. Face recognition is a non-linear problem. For example, we can tell that these two characters are related because the same actor played both of them; yet the only things they share are the face, and perhaps the overly muscular body. The characters themselves have very little in common.
We import the usual things and the SVM classifier, along with the loader for the Olivetti faces. The Olivetti faces dataset is a set of $400$ images of faces, $10$ images per person. It is a very clean dataset: the faces are well centered, and the support of each class (the number of samples per person) is the same across all classes. And since we are working with images, we also import PCA for noise reduction, and model selection tools.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-talk')
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.datasets import fetch_olivetti_faces

ofaces = fetch_olivetti_faces()
print(ofaces.DESCR)
.. _olivetti_faces_dataset:

The Olivetti faces dataset
--------------------------

`This dataset contains a set of face images`_ taken between April 1992 and
April 1994 at AT&T Laboratories Cambridge. The
:func:`sklearn.datasets.fetch_olivetti_faces` function is the data
fetching / caching function that downloads the data
archive from AT&T.

.. _This dataset contains a set of face images: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

As described on the original website:

    There are ten different images of each of 40 distinct subjects. For some
    subjects, the images were taken at different times, varying the lighting,
    facial expressions (open / closed eyes, smiling / not smiling) and facial
    details (glasses / no glasses). All the images were taken against a dark
    homogeneous background with the subjects in an upright, frontal position
    (with tolerance for some side movement).

**Data Set Characteristics:**

    =================   =====================
    Classes                                40
    Samples total                         400
    Dimensionality                       4096
    Features            real, between 0 and 1
    =================   =====================

The image is quantized to 256 grey levels and stored as unsigned 8-bit
integers; the loader will convert these to floating point values on the
interval [0, 1], which are easier to work with for many algorithms.

The "target" for this database is an integer from 0 to 39 indicating the
identity of the person pictured; however, with only 10 examples per class,
this relatively small dataset is more interesting from an unsupervised or
semi-supervised perspective.

The original dataset consisted of 92 x 112, while the version available here
consists of 64x64 images.

When using these images, please give credit to AT&T Laboratories Cambridge.
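The $4096$ features per sample are simply the $64 \times 64$ pixel grid flattened row by row, which is why each pixel becomes one dimension for the SVM. A quick sanity check of that correspondence, using a synthetic array standing in for a row of `ofaces.data` (the real dataset requires a download):

```python
import numpy as np

# One fake "face": feature k holds the value k, so we can trace pixels back.
flat = np.arange(4096, dtype=np.float32)

# Recover the 64x64 image grid from the flat feature vector (row-major).
img = flat.reshape(64, 64)

print(img.shape)   # (64, 64)
print(img[1, 0])   # row 1, col 0 is feature index 64 -> 64.0
```

The same reshape relates `ofaces.data[i]` to `ofaces.images[i]`.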
We should also look at the images to see what we are working with.
fig, axes = plt.subplots(10, 10, figsize=(16, 16))
fig.subplots_adjust(hspace=0, wspace=0)
for i, ax in enumerate(axes.flat):
    ax.imshow(ofaces.images[i*2], cmap='gray')
    ax.axis('off')