Introduction to Deep Learning

We will start with a simple example that introduces the core ingredients of Deep Learning:

  • Input (Train & Test)
  • Output
  • Model: Architecture, Layers, Weights & Activations
  • Loss Function
  • Optimization

Learn a Noisy Function

Saddle Function => $$ Z = 2X^2 - 3Y^2 + 5 + \epsilon $$

where $\epsilon$ is a small random noise term.
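
A quick check that the noiseless surface really is a saddle: the curvatures along the two axes have opposite signs,

$$ \frac{\partial^2 Z}{\partial X^2} = 4 > 0, \qquad \frac{\partial^2 Z}{\partial Y^2} = -6 < 0, $$

so the surface bends upward along $X$ and downward along $Y$ around the critical point $(0, 0)$.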

In [7]:
import numpy as np              # Numerical arrays
import pandas as pd             # DataFrames
import keras                    # Deep Learning library
import matplotlib.pyplot as plt # Visualisation Library
In [8]:
%matplotlib inline
In [9]:
x = np.arange(-1, 1, 0.01)  # 200 evenly spaced points in [-1, 1)
y = np.arange(-1, 1, 0.01)
In [11]:
X, Y = np.meshgrid(x, y)          # 200x200 grid of (x, y) coordinates
c = np.ones([200, 200])           # constant term of the surface
e = np.random.rand(200, 200)*0.1  # uniform noise in [0, 0.1)
In [12]:
Z = 2*X*X - 3*Y*Y + 5*c + e  # the noisy saddle surface
In [13]:
import sys
sys.path.append("../")
In [20]:
from mpl_toolkits.mplot3d import Axes3D
In [21]:
def plot3d(X, Y, Z):
    """Plot Z as a 3D surface over the (X, Y) grid."""
    fig = plt.figure(figsize=(8, 8))
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_surface(X, Y, Z, color='y')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    plt.show()
In [23]:
plot3d(X, Y, Z)
  • Input => X, Y
  • Output => Z

Using Keras to learn a "Non-Linear" Function

Step 0: Import the Keras Model & Layer Classes

In [24]:
from keras.models import Sequential
from keras.layers import Dense

Step 1: Create the input and output data

In [30]:
input_xy = np.c_[X.reshape(-1), Y.reshape(-1)]  ### X input in ML
output_z = Z.reshape(-1)                        ### y output in ML
In [31]:
input_xy.shape, output_z.shape
Out[31]:
((40000, 2), (40000,))
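
The reshaping step deserves a note: `reshape(-1)` flattens each 200×200 grid into a vector of 40000 values, and `np.c_` stacks two such vectors as columns, giving one (x, y) row per grid point. A minimal sketch on a toy 2×2 grid:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# reshape(-1) flattens row by row; np.c_ pairs the flattened arrays column-wise
np.c_[a.reshape(-1), b.reshape(-1)]
# array([[1, 5],
#        [2, 6],
#        [3, 7],
#        [4, 8]])
```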

Step 2: Create the Model (Transformation + Regression)

In [132]:
model = Sequential()
model.add(Dense(4, input_dim=2, activation="linear"))  # hidden layer 1: 2 -> 4
model.add(Dense(2, activation="linear"))  # hidden layer 2: 4 -> 2 (input_dim is only needed on the first layer)
model.add(Dense(1))                       # output layer: 2 -> 1
In [133]:
model.summary()
Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_29 (Dense)             (None, 4)                 12        
_________________________________________________________________
dense_30 (Dense)             (None, 2)                 10        
_________________________________________________________________
dense_31 (Dense)             (None, 1)                 3         
=================================================================
Total params: 25
Trainable params: 25
Non-trainable params: 0
_________________________________________________________________
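
The parameter counts in the summary follow from the Dense operation `output = activation(dot(input, kernel) + bias)`: each layer has `(inputs + 1) * units` parameters, the `+ 1` accounting for the bias. A quick check against the summary above:

```python
# (inputs + 1) * units parameters per Dense layer; the +1 is the bias
for inputs, units in [(2, 4), (4, 2), (2, 1)]:
    print(inputs, "->", units, ":", (inputs + 1) * units, "params")
# 2 -> 4 : 12 params
# 4 -> 2 : 10 params
# 2 -> 1 : 3 params   (total 25, matching model.summary())
```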
In [134]:
from keras.utils import plot_model
In [135]:
plot_model(model, show_layer_names=True, show_shapes=True)
Out[135]:
[model architecture diagram rendered by plot_model]

Step 3: Compile the Model (Loss & Optimizer) and Fit It

In [136]:
model.compile(loss="mean_squared_error", optimizer="sgd", metrics=["mse"])
In [137]:
%%time
output = model.fit(input_xy, output_z, epochs=10, validation_split=0.2, shuffle=True, verbose=1)
Train on 32000 samples, validate on 8000 samples
Epoch 1/10
32000/32000 [==============================] - 1s 34us/step - loss: 0.7956 - mean_squared_error: 0.7956 - val_loss: 6.2738 - val_mean_squared_error: 6.2738
Epoch 2/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6907 - mean_squared_error: 0.6907 - val_loss: 6.0389 - val_mean_squared_error: 6.0389
Epoch 3/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6913 - mean_squared_error: 0.6913 - val_loss: 6.7260 - val_mean_squared_error: 6.7260
Epoch 4/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6895 - mean_squared_error: 0.6895 - val_loss: 6.3373 - val_mean_squared_error: 6.3373
Epoch 5/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6909 - mean_squared_error: 0.6909 - val_loss: 6.1773 - val_mean_squared_error: 6.1773
Epoch 6/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6896 - mean_squared_error: 0.6896 - val_loss: 6.3050 - val_mean_squared_error: 6.3050
Epoch 7/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6904 - mean_squared_error: 0.6904 - val_loss: 6.9137 - val_mean_squared_error: 6.9137
Epoch 8/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6903 - mean_squared_error: 0.6903 - val_loss: 6.9827 - val_mean_squared_error: 6.9827
Epoch 9/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6896 - mean_squared_error: 0.6896 - val_loss: 6.7918 - val_mean_squared_error: 6.7918
Epoch 10/10
32000/32000 [==============================] - 1s 26us/step - loss: 0.6898 - mean_squared_error: 0.6898 - val_loss: 6.3870 - val_mean_squared_error: 6.3870
CPU times: user 11.7 s, sys: 992 ms, total: 12.7 s
Wall time: 8.66 s
In [138]:
output_df = pd.DataFrame(output.history)
output_df.head()
Out[138]:
   val_loss  val_mean_squared_error      loss  mean_squared_error
0  6.273764                6.273764  0.795559            0.795559
1  6.038878                6.038878  0.690723            0.690723
2  6.725989                6.725989  0.691273            0.691273
3  6.337259                6.337259  0.689513            0.689513
4  6.177315                6.177315  0.690925            0.690925
In [139]:
output_df.plot.line(y=["val_loss", "loss"]);

Step 4: Make Predictions from the Model

In [140]:
Z_pred = model.predict(input_xy).reshape(200,200)
In [141]:
plot3d(X,Y, Z_pred)
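
The predicted surface comes out as a plane rather than a saddle, which matches the training loss plateauing near 0.69: with purely linear activations, a stack of Dense layers collapses into a single affine map, and no affine map can fit the quadratic saddle. (The much larger val_loss is also expected: `validation_split` takes the last 20% of the samples before shuffling, i.e. the strip of the grid with Y closest to 1.) A sketch of the collapse, assuming the three-layer model trained above:

```python
# With linear activations, the three layers compose into one affine map:
# z = ((x @ W1 + b1) @ W2 + b2) @ W3 + b3 = x @ W + b
W1, b1, W2, b2, W3, b3 = model.get_weights()
W = W1 @ W2 @ W3               # effective weight matrix, shape (2, 1)
b = (b1 @ W2 + b2) @ W3 + b3   # effective bias, shape (1,)
collapsed = input_xy @ W + b   # agrees with model.predict(input_xy)
```

Up to floating-point precision, `collapsed` matches `model.predict(input_xy)`, which is why the experiments below suggest a non-linear activation.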

Experiments

  • Change the number of layers
  • Change the number of units in a layer
  • Change the activation from "linear" to "relu" (see the sketch after this list)
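
A hedged sketch of the third experiment: the same architecture with relu hidden activations (layer sizes and epoch count are kept from above, not tuned):

```python
# Same stack as before, but with non-linear hidden activations
model_relu = Sequential()
model_relu.add(Dense(4, input_dim=2, activation="relu"))
model_relu.add(Dense(2, activation="relu"))
model_relu.add(Dense(1))
model_relu.compile(loss="mean_squared_error", optimizer="sgd", metrics=["mse"])
model_relu.fit(input_xy, output_z, epochs=10, validation_split=0.2,
               shuffle=True, verbose=0)
plot3d(X, Y, model_relu.predict(input_xy).reshape(200, 200))
```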
In [126]:
Dense?
Init signature:
Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer='glorot_uniform',
    bias_initializer='zeros',
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs,
)
Docstring:     
Just your regular densely-connected NN layer.

`Dense` implements the operation:
`output = activation(dot(input, kernel) + bias)`
where `activation` is the element-wise activation function
passed as the `activation` argument, `kernel` is a weights matrix
created by the layer, and `bias` is a bias vector created by the layer
(only applicable if `use_bias` is `True`).

Note: if the input to the layer has a rank greater than 2, then
it is flattened prior to the initial dot product with `kernel`.

# Example

```python
    # as first layer in a sequential model:
    model = Sequential()
    model.add(Dense(32, input_shape=(16,)))
    # now the model will take as input arrays of shape (*, 16)
    # and output arrays of shape (*, 32)

    # after the first layer, you don't need to specify
    # the size of the input anymore:
    model.add(Dense(32))
```

# Arguments
    units: Positive integer, dimensionality of the output space.
    activation: Activation function to use
        (see [activations](../activations.md)).
        If you don't specify anything, no activation is applied
        (ie. "linear" activation: `a(x) = x`).
    use_bias: Boolean, whether the layer uses a bias vector.
    kernel_initializer: Initializer for the `kernel` weights matrix
        (see [initializers](../initializers.md)).
    bias_initializer: Initializer for the bias vector
        (see [initializers](../initializers.md)).
    kernel_regularizer: Regularizer function applied to
        the `kernel` weights matrix
        (see [regularizer](../regularizers.md)).
    bias_regularizer: Regularizer function applied to the bias vector
        (see [regularizer](../regularizers.md)).
    activity_regularizer: Regularizer function applied to
        the output of the layer (its "activation").
        (see [regularizer](../regularizers.md)).
    kernel_constraint: Constraint function applied to
        the `kernel` weights matrix
        (see [constraints](../constraints.md)).
    bias_constraint: Constraint function applied to the bias vector
        (see [constraints](../constraints.md)).

# Input shape
    nD tensor with shape: `(batch_size, ..., input_dim)`.
    The most common situation would be
    a 2D input with shape `(batch_size, input_dim)`.

# Output shape
    nD tensor with shape: `(batch_size, ..., units)`.
    For instance, for a 2D input with shape `(batch_size, input_dim)`,
    the output would have shape `(batch_size, units)`.
File:           /usr/local/lib/python3.6/dist-packages/keras/layers/core.py
Type:           type
Subclasses:     