10.07 Image Features

Feature extraction is likely the most important, time-consuming and nerve-racking activity in a machine learning pipeline. Given enough data (millions of samples), feature extraction can be performed automatically, but that is not viable in most cases. Specialized feature extraction techniques exist for different types of data, and images probably have the widest range of them.



Working with images, we have so far used raw pixel values as input to our models. That is one way of doing things, but it is not the most effective way in practice. Years of computer vision research have produced feature extraction techniques that easily outperform any form of PCA or manifold learning.

In the Python world, scikit-image is the framework for image manipulation using NumPy arrays.
It is also integrated with matplotlib, so we import both of these for now.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

scikit-image, imported as skimage, ships with several test images, all of which are plain NumPy arrays.

In [2]:
from skimage import data

camera = data.camera()
camera.shape
(512, 512)

A two-dimensional NumPy array is treated as a gray-scale image.

In [3]:
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(camera, cmap='gray');
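Because an image is just an array, ordinary NumPy operations apply to it directly. The following is a minimal sketch using a small synthetic gradient as a stand-in for a real gray-scale image (the array `gray` and its 8x8 size are assumptions made for illustration, not one of the skimage test images).

```python
import numpy as np

# Hypothetical stand-in for a gray-scale image: an 8x8 horizontal gradient
# with uint8 intensities running from 0 (black) to 255 (white).
gray = np.tile(np.linspace(0, 255, 8).astype(np.uint8), (8, 1))

crop = gray[2:6, 1:5]     # slicing crops: rows 2..5, columns 1..4
inverted = 255 - gray     # arithmetic gives the photographic negative
flat = gray.ravel()       # raw pixel values as one flat feature vector

print(gray.shape, gray.dtype, crop.shape, flat.shape)
```

The flattened vector is the "pixel values as input" representation used so far; the same slicing and arithmetic should work unchanged on `camera`.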

A color image is a three-dimensional array in which the last (rightmost) dimension holds the color channels (most often, but not always, RGB). skimage defines conventions for representing images as arrays:

Image type                    Coordinates
2D gray-scale                 (row, column)
2D color image                (row, column, channel)
3D gray-scale (e.g. video)    (frame, row, column)
3D color (e.g. video)         (frame, row, column, channel)

In [4]:
fig, ax = plt.subplots(figsize=(10, 10))
coffee = data.coffee()
ax.imshow(coffee)  # a color image needs no colormap
coffee.shape
(400, 600, 3)
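The (row, column, channel) convention determines how we index into a color image. Below is a small sketch on a synthetic array, assuming RGB channel order; the array `rgb` and its tiny 4x6 size are made up for illustration, and the same indexing should apply to `coffee`.

```python
import numpy as np

# Synthetic stand-in for a color image: 4 rows, 6 columns, 3 channels (RGB assumed).
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[..., 0] = 255          # set the red channel to full intensity everywhere
rgb[1, 2] = (0, 255, 0)    # the pixel at row 1, column 2 becomes pure green

rows, cols, channels = rgb.shape
red = rgb[:, :, 0]         # a single channel is a 2D gray-scale image (row, column)
pixel = rgb[1, 2]          # a single pixel is a vector of channel values

# Stacking images along a new leading axis gives (frame, row, column, channel).
video = np.stack([rgb, rgb])

print(rgb.shape, red.shape, pixel, video.shape)
```

Note that the row index comes first, so `rgb[1, 2]` addresses the second row from the top and the third column from the left, which is the reverse of the (x, y) order used in plotting.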