Image Classification

Fashion MNIST

A set of 28x28 grayscale images of fashion items

  • Training: 60,000
  • Test: 10,000
  • Classes: 10

Classification: predicting a categorical variable (buckets, classes, categories)

Get the Input and Output data

In [1]:
import numpy as np
import pandas as pd
import keras
import matplotlib.pyplot as plt
%matplotlib inline
import vis
Using TensorFlow backend.
In [2]:
from keras.datasets import fashion_mnist
In [4]:
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train.shape, y_train.shape, x_test.shape, y_test.shape
Out[4]:
((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))
In [5]:
labels = vis.fashion_mnist_label()
labels
Out[5]:
{0: 'T-shirt/top',
 1: 'Trouser',
 2: 'Pullover',
 3: 'Dress',
 4: 'Coat',
 5: 'Sandal',
 6: 'Shirt',
 7: 'Sneaker',
 8: 'Bag',
 9: 'Ankle boot'}
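The `vis` module is a local helper whose source is not shown here. A minimal sketch of what `fashion_mnist_label` might look like, assuming it does nothing more than return the dictionary printed above (the real helper may differ):

```python
def fashion_mnist_label():
    """Hypothetical stand-in for vis.fashion_mnist_label (assumed behaviour)."""
    names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
             'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
    # Map each integer class id (0..9) to its human-readable name.
    return dict(enumerate(names))

labels = fashion_mnist_label()
print(labels[9])
```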

See an Image

In [7]:
x_train[0].shape
Out[7]:
(28, 28)
In [8]:
vis.imshow(x_train[0])
In [14]:
np.unique(y_train)
Out[14]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8)
In [16]:
vis.imshow_unique(x_train, y_train, labels)
In [18]:
vis.imshow_sprite(x_train[:500])

Multi-Layer Perceptron for Images

Step 1: Prepare Input and Output

In [19]:
x_train.shape
Out[19]:
(60000, 28, 28)

Input

  • Normalize the pixel values: 0 to 255 => 0 to 1
In [20]:
x_train = x_train/255
x_test = x_test/255
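To see what the division does, here is a small sketch on a made-up `uint8` array (a stand-in, not the real dataset). Dividing by 255 converts the integers to floats and maps the range 0 to 255 onto 0 to 1:

```python
import numpy as np

# Hypothetical stand-in for one tiny 8-bit grayscale image batch.
fake_images = np.array([[[0, 128], [255, 64]]], dtype=np.uint8)

# True division promotes uint8 to float64 and rescales to [0, 1].
scaled = fake_images / 255

print(fake_images.dtype, "->", scaled.dtype)
print(scaled.min(), scaled.max())
```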

Output

  • One-Hot Encoding / to_categorical encoding
In [21]:
from keras.utils import to_categorical
In [24]:
y_train_class = to_categorical(y_train, num_classes=10)
y_test_class = to_categorical(y_test, num_classes=10)
In [27]:
y_train_class.shape, y_test_class.shape
Out[27]:
((60000, 10), (10000, 10))
In [28]:
y_train_class[:10]
Out[28]:
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]], dtype=float32)
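One-hot encoding can be sketched in plain NumPy. The `one_hot` function below is an illustrative name, not the Keras implementation; the labels are the first ten training labels, read off the rows printed above:

```python
import numpy as np

def one_hot(class_ids, num_classes):
    """Return a (n_samples, num_classes) float32 matrix with a single 1 per row."""
    class_ids = np.asarray(class_ids, dtype=int)
    encoded = np.zeros((class_ids.size, num_classes), dtype=np.float32)
    # Set column `class_ids[i]` of row i to 1.
    encoded[np.arange(class_ids.size), class_ids] = 1.0
    return encoded

first_labels = [9, 0, 0, 3, 0, 2, 7, 2, 5, 5]
encoded = one_hot(first_labels, num_classes=10)

# argmax along the class axis undoes the encoding.
recovered = encoded.argmax(axis=1)
print(recovered)
```

The `argmax` round trip is also how a softmax output row is turned back into a single predicted class.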

Step 2: Craft the Model (Feature Transformer + Classifier)

In [31]:
from keras.models import Sequential
from keras.layers import Dense, Flatten
In [39]:
model = Sequential()
model.add(Flatten(input_shape=(28,28)))
model.add(Dense(units = 40, activation="relu"))
model.add(Dense(units=10, activation="softmax"))
In [40]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_2 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 40)                31400     
_________________________________________________________________
dense_4 (Dense)              (None, 10)                410       
=================================================================
Total params: 31,810
Trainable params: 31,810
Non-trainable params: 0
_________________________________________________________________
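The parameter counts in the summary can be checked by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit, and Flatten has no parameters. A short sketch of that arithmetic (`dense_params` is an illustrative helper name):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flatten_size = 28 * 28                            # 784 inputs after Flatten
hidden_params = dense_params(flatten_size, 40)    # first Dense layer
output_params = dense_params(40, 10)              # softmax layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```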

Step 3: Compile the Model with a Loss and an Optimizer, then Fit It

In [41]:
model.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
In [42]:
%%time
output = model.fit(x_train, y_train_class, epochs=10, verbose=1, 
                   validation_split=0.2)
Train on 48000 samples, validate on 12000 samples
Epoch 1/10
48000/48000 [==============================] - 2s 46us/step - loss: 0.8071 - acc: 0.7422 - val_loss: 0.5904 - val_acc: 0.7981
Epoch 2/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.5444 - acc: 0.8159 - val_loss: 0.5115 - val_acc: 0.8244
Epoch 3/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4948 - acc: 0.8295 - val_loss: 0.4941 - val_acc: 0.8203
Epoch 4/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4694 - acc: 0.8378 - val_loss: 0.4669 - val_acc: 0.8345
Epoch 5/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4522 - acc: 0.8443 - val_loss: 0.4584 - val_acc: 0.8383
Epoch 6/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4401 - acc: 0.8475 - val_loss: 0.4658 - val_acc: 0.8325
Epoch 7/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4302 - acc: 0.8506 - val_loss: 0.4393 - val_acc: 0.8471
Epoch 8/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4211 - acc: 0.8543 - val_loss: 0.4317 - val_acc: 0.8473
Epoch 9/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4134 - acc: 0.8571 - val_loss: 0.4391 - val_acc: 0.8478
Epoch 10/10
48000/48000 [==============================] - 2s 43us/step - loss: 0.4068 - acc: 0.8575 - val_loss: 0.4271 - val_acc: 0.8524
CPU times: user 36.2 s, sys: 9.56 s, total: 45.8 s
Wall time: 21.1 s
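The loss reported in the log is categorical cross-entropy computed on the softmax outputs. A minimal NumPy sketch of both follows; the logits are made up, and the clipping constant `eps` is an assumption chosen for numerical safety rather than a value taken from Keras:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

# Hypothetical logits for two samples and three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

# A model that guesses uniformly over 10 classes scores ln(10), about 2.30.
uniform_loss = categorical_crossentropy(np.eye(10), np.full((10, 10), 0.1))
print(loss, uniform_loss)
```

The uniform-guess value is a useful baseline: the first-epoch loss of 0.8071 is already well below it.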

Step 4: Check the performance

In [38]:
vis.metrics(output.history)

Step 5: Make a Prediction

In [43]:
score = model.evaluate(x_test, y_test_class, verbose = 1)
10000/10000 [==============================] - 0s 20us/step
In [44]:
print("Test Loss", score[0])
print("Test Accuracy", score[1])
Test Loss 0.4524360636949539
Test Accuracy 0.8424
In [47]:
predict_classes = model.predict_classes(x_test)
actual_classes = y_test
In [48]:
pd.crosstab(actual_classes, predict_classes)
Out[48]:
col_0 0 1 2 3 4 5 6 7 8 9
row_0
0 843 1 13 38 8 1 82 1 13 0
1 5 952 6 27 7 0 2 0 1 0
2 22 2 700 7 193 2 68 0 6 0
3 41 8 19 840 57 0 31 0 4 0
4 0 0 66 22 859 1 47 0 5 0
5 0 0 0 1 0 916 0 52 4 27
6 173 1 110 31 168 0 493 0 24 0
7 0 0 0 0 0 30 0 932 0 38
8 5 1 14 5 6 2 15 6 946 0
9 0 0 0 0 0 12 1 43 1 943
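The crosstab is a confusion matrix with actual classes in rows and predicted classes in columns. A sketch of reading overall accuracy and per-class recall from it, with the counts copied from the table above into a NumPy array:

```python
import numpy as np

# Rows: actual class, columns: predicted class (copied from the crosstab).
cm = np.array([
    [843,   1,  13,  38,   8,   1,  82,   1,  13,   0],
    [  5, 952,   6,  27,   7,   0,   2,   0,   1,   0],
    [ 22,   2, 700,   7, 193,   2,  68,   0,   6,   0],
    [ 41,   8,  19, 840,  57,   0,  31,   0,   4,   0],
    [  0,   0,  66,  22, 859,   1,  47,   0,   5,   0],
    [  0,   0,   0,   1,   0, 916,   0,  52,   4,  27],
    [173,   1, 110,  31, 168,   0, 493,   0,  24,   0],
    [  0,   0,   0,   0,   0,  30,   0, 932,   0,  38],
    [  5,   1,  14,   5,   6,   2,  15,   6, 946,   0],
    [  0,   0,   0,   0,   0,  12,   1,  43,   1, 943],
])

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions divided by the actual count (row sum).
recall = np.diag(cm) / cm.sum(axis=1)
worst_class = int(recall.argmin())

print(accuracy)
print(worst_class, recall[worst_class])
```

The diagonal sums to the test accuracy reported earlier, and class 6 (Shirt) has the lowest recall: it is mostly confused with T-shirt/top, Coat and Pullover.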
In [50]:
# Probabilities
probs = model.predict_proba(x_test)
In [51]:
i = 4
In [54]:
vis.imshow(x_test[i], labels[y_test[i]]) | vis.predict(probs[i], y_test[i], labels)

Convolutional Neural Network

Step 1: Prepare our Input and Output

In [74]:
x_train.shape
Out[74]:
(60000, 28, 28)
In [58]:
x_train_conv = x_train.reshape(60000, 28, 28, 1)
x_test_conv = x_test.reshape(10000, 28, 28, 1)
In [59]:
x_train_conv.shape, y_train.shape, x_test_conv.shape, y_test.shape
Out[59]:
((60000, 28, 28, 1), (60000,), (10000, 28, 28, 1), (10000,))
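The reshape adds a trailing channel axis of size 1, which `Conv2D` expects for grayscale input in the `channels_last` layout. A sketch on a small made-up batch, showing a few equivalent ways to do it without hard-coding the number of samples:

```python
import numpy as np

# Hypothetical small batch standing in for x_train.
batch = np.zeros((5, 28, 28))

explicit = batch.reshape(5, 28, 28, 1)        # sample count written out
inferred = batch.reshape(-1, 28, 28, 1)       # -1 lets NumPy infer the sample count
expanded = np.expand_dims(batch, axis=-1)     # append a new last axis
indexed = batch[..., np.newaxis]              # same thing via indexing

print(explicit.shape, inferred.shape, expanded.shape, indexed.shape)
```

Using `-1` or `np.expand_dims` keeps the same line working for both the training and the test arrays.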

Step 2: Simple Convolution Network

In [75]:
Conv2D??
Init signature: Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None, **kwargs)
Source:        
class Conv2D(_Conv):
    """2D convolution layer (e.g. spatial convolution over images).

    This layer creates a convolution kernel that is convolved
    with the layer input to produce a tensor of
    outputs. If `use_bias` is True,
    a bias vector is created and added to the outputs. Finally, if
    `activation` is not `None`, it is applied to the outputs as well.

    When using this layer as the first layer in a model,
    provide the keyword argument `input_shape`
    (tuple of integers, does not include the sample axis),
    e.g. `input_shape=(128, 128, 3)` for 128x128 RGB pictures
    in `data_format="channels_last"`.

    # Arguments
        filters: Integer, the dimensionality of the output space
            (i.e. the number of output filters in the convolution).
        kernel_size: An integer or tuple/list of 2 integers, specifying the
            height and width of the 2D convolution window.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution
            along the height and width.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            Specifying any stride value != 1 is incompatible with specifying
            any `dilation_rate` value != 1.
        padding: one of `"valid"` or `"same"` (case-insensitive).
            Note that `"same"` is slightly inconsistent across backends with
            `strides` != 1, as described
            [here](https://github.com/keras-team/keras/pull/9473#issuecomment-372166860)
        data_format: A string,
            one of `"channels_last"` or `"channels_first"`.
            The ordering of the dimensions in the inputs.
            `"channels_last"` corresponds to inputs with shape
            `(batch, height, width, channels)` while `"channels_first"`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".
        dilation_rate: an integer or tuple/list of 2 integers, specifying
            the dilation rate to use for dilated convolution.
            Can be a single integer to specify the same value for
            all spatial dimensions.
            Currently, specifying any `dilation_rate` value != 1 is
            incompatible with specifying any stride value != 1.
        activation: Activation function to use
            (see [activations](../activations.md)).
            If you don't specify anything, no activation is applied
            (ie. "linear" activation: `a(x) = x`).
        use_bias: Boolean, whether the layer uses a bias vector.
        kernel_initializer: Initializer for the `kernel` weights matrix
            (see [initializers](../initializers.md)).
        bias_initializer: Initializer for the bias vector
            (see [initializers](../initializers.md)).
        kernel_regularizer: Regularizer function applied to
            the `kernel` weights matrix
            (see [regularizer](../regularizers.md)).
        bias_regularizer: Regularizer function applied to the bias vector
            (see [regularizer](../regularizers.md)).
        activity_regularizer: Regularizer function applied to
            the output of the layer (its "activation").
            (see [regularizer](../regularizers.md)).
        kernel_constraint: Constraint function applied to the kernel matrix
            (see [constraints](../constraints.md)).
        bias_constraint: Constraint function applied to the bias vector
            (see [constraints](../constraints.md)).

    # Input shape
        4D tensor with shape:
        `(samples, channels, rows, cols)`
        if `data_format` is `"channels_first"`
        or 4D tensor with shape:
        `(samples, rows, cols, channels)`
        if `data_format` is `"channels_last"`.

    # Output shape
        4D tensor with shape:
        `(samples, filters, new_rows, new_cols)`
        if `data_format` is `"channels_first"`
        or 4D tensor with shape:
        `(samples, new_rows, new_cols, filters)`
        if `data_format` is `"channels_last"`.
        `rows` and `cols` values might have changed due to padding.
    """

    @interfaces.legacy_conv2d_support
    def __init__(self, filters,
                 kernel_size,
                 strides=(1, 1),
                 padding='valid',
                 data_format=None,
                 dilation_rate=(1, 1),
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 **kwargs):
        super(Conv2D, self).__init__(
            rank=2,
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding=padding,
            data_format=data_format,
            dilation_rate=dilation_rate,
            activation=activation,
            use_bias=use_bias,
            kernel_initializer=kernel_initializer,
            bias_initializer=bias_initializer,
            kernel_regularizer=kernel_regularizer,
            bias_regularizer=bias_regularizer,
            activity_regularizer=activity_regularizer,
            kernel_constraint=kernel_constraint,
            bias_constraint=bias_constraint,
            **kwargs)
        self.input_spec = InputSpec(ndim=4)

    def get_config(self):
        config = super(Conv2D, self).get_config()
        config.pop('rank')
        return config
File:           /opt/conda/lib/python3.6/site-packages/keras/layers/convolutional.py
Type:           type
In [61]:
from keras.layers import Conv2D, MaxPooling2D
In [67]:
cnn = Sequential()
cnn.add(Conv2D(filters = 30, kernel_size=(3,3), activation="relu", input_shape=(28,28,1)))
cnn.add(MaxPooling2D(pool_size=(3,3)))
cnn.add(Flatten())
cnn.add(Dense(40, activation="relu"))
cnn.add(Dense(10, activation="softmax"))
In [68]:
cnn.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 26, 26, 30)        300       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 8, 8, 30)          0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 1920)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 40)                76840     
_________________________________________________________________
dense_10 (Dense)             (None, 10)                410       
=================================================================
Total params: 77,550
Trainable params: 77,550
Non-trainable params: 0
_________________________________________________________________
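The shapes and parameter counts in this summary follow from a little arithmetic. The sketch below assumes what the code above uses: `padding='valid'` and stride 1 for the convolution, and a pooling stride equal to the pool size (the Keras default when `strides` is not given). The helper names are illustrative:

```python
def conv_output(size, kernel, stride=1):
    """Output side length of a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_output(size, pool):
    """Output side length of 'valid' pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

filters = 30
conv_side = conv_output(28, 3)                 # 28x28 input, 3x3 kernel
pool_side = pool_output(conv_side, 3)          # 3x3 max pooling
flat_size = pool_side * pool_side * filters    # input to the first Dense layer

conv_params = (3 * 3 * 1 + 1) * filters        # 3x3 kernel, 1 channel, plus bias
dense1_params = flat_size * 40 + 40
dense2_params = 40 * 10 + 10
total_params = conv_params + dense1_params + dense2_params

print(conv_side, pool_side, flat_size)
print(conv_params, dense1_params, dense2_params, total_params)
```

Most of the parameters sit in the first Dense layer, not in the convolution: the 30 filters share their weights across every position in the image.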

Step 3: Compile & Fit the Model

In [70]:
cnn.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
In [71]:
%%time
output_cnn = cnn.fit(x_train_conv, y_train_class, epochs=5, validation_split=0.2, verbose=1)
Train on 48000 samples, validate on 12000 samples
Epoch 1/5
48000/48000 [==============================] - 20s 427us/step - loss: 0.9089 - acc: 0.6859 - val_loss: 0.7040 - val_acc: 0.7147
Epoch 2/5
48000/48000 [==============================] - 21s 429us/step - loss: 0.5839 - acc: 0.7830 - val_loss: 0.5445 - val_acc: 0.7992
Epoch 3/5
48000/48000 [==============================] - 21s 428us/step - loss: 0.5253 - acc: 0.8076 - val_loss: 0.4917 - val_acc: 0.8210
Epoch 4/5
48000/48000 [==============================] - 16s 328us/step - loss: 0.4859 - acc: 0.8232 - val_loss: 0.4855 - val_acc: 0.8247
Epoch 5/5
48000/48000 [==============================] - 13s 265us/step - loss: 0.4524 - acc: 0.8364 - val_loss: 0.4328 - val_acc: 0.8431
CPU times: user 3min 24s, sys: 43.1 s, total: 4min 7s
Wall time: 1min 30s

Step 4: Test the Model

In [72]:
vis.metrics(output_cnn.history)