Learning Notes - Introduction to TensorFlow 2.0

MOOCs | 01 June 2019

This blog post contains all my learnings from the Coursera course Introduction to TensorFlow for Artificial Intelligence, Machine Learning and Deep Learning, taught by Google's Laurence Moroney.

TensorFlow 2.0

It’s great news that Google developers have released the alpha version of TensorFlow 2.0 (at the time of writing this post), which now focuses more on usability, clarity and flexibility, much like Keras.

In TensorFlow 2.0, we now have tf.keras as the high-level API. This means you can use Keras inside TensorFlow itself, in addition to all the advanced functions that TensorFlow offers. Furthermore, 2.0 has eager execution enabled by default, which means you no longer need to create a session and run the computational graph inside it. Everything is dynamic now, just like PyTorch.
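The difference can be sketched in plain Python (an analogy only, not real TensorFlow code): graph mode builds a deferred computation that only runs inside a session, while eager mode computes results immediately.

```python
# TF 1.x style (analogy in plain Python, not actual TensorFlow code):
# first build a deferred computation, then execute it inside a "session"
def build_graph():
    return lambda x: 2 * x + 1   # nothing is computed yet

class Session:
    def run(self, op, feed):
        return op(feed)          # computation happens only here

graph = build_graph()
result_graph = Session().run(graph, feed=10)

# TF 2.0 eager style (analogy): operations execute as soon as they are written
result_eager = 2 * 10 + 1

print(result_graph, result_eager)  # 21 21
```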

We could also say that Keras now lives inside TensorFlow, which is so cool 😍. We now have an all-in-one API for machine learning and deep learning, with a high-level Keras-like API as well as the low-level TensorFlow API.

Figure 1. TensorFlow 2.0 (All in one API)

You can check out these links to know more about TensorFlow 2.0 updates.

Week 1 - Hello TensorFlow 2.0

Week 1 introduces you to the “hello world” of deep neural networks. This includes how to use the TensorFlow 2.0 API to solve a simple linear regression problem. It also shows how machine learning differs from traditional programming, as illustrated in Figure 2.

Figure 2. Traditional Programming vs Machine Learning

Given these two arrays, build a deep neural network to find the relationship between \(Xs\) and \(Ys\).

$$ Xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0] \\ Ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0] $$
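The relationship the network should discover is \(Y = 2X - 1\), which a quick check in plain Python confirms for every pair:

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

# every (x, y) pair satisfies y = 2x - 1
assert all(y == 2 * x - 1 for x, y in zip(xs, ys))

# so for an unseen input of 10.0 the true answer is 19.0;
# the trained network will predict something close to, but not exactly, 19
print(2 * 10.0 - 1)  # 19.0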

And the code to achieve this is only 9 lines (excluding blank lines and comments), combining Keras functions inside TensorFlow. It's amazing how quickly you can experiment with a deep neural network without much complex programming.

hello_tensorflow_2_0.py
import tensorflow as tf
import numpy as np
from tensorflow import keras

# simplest possible neural network with only one neuron
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer="sgd", loss="mean_squared_error")

# define 2 one-dimensional numpy arrays with float values
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# fit the model on xs and ys
model.fit(xs, ys, epochs=500)

# using the model, predict the answer for an unseen input value
print(model.predict([10.0]))

Additionally, it taught how to use Google’s Colaboratory, an online tool for executing and sharing Python code, much like Jupyter Notebook, except that execution happens in a virtual machine, so it doesn’t matter whether you work on Windows, Mac or Linux. It requires zero setup on your end and runs fully in the cloud. You can learn more about Google’s Colaboratory here.

Key concepts

  • Dense() is a layer of connected neurons.
  • Sequential() contains successive layers of neurons.
  • loss measures how good the current "guess" of the deep neural network is.
  • optimizer generates a new and improved "guess" till convergence is reached.
  • convergence is the process of getting very close to the correct answer.
  • model.fit() trains the neural network to fit one set of values to another.
  • epochs is the number of iterations or training loops the model goes through to reach a minimal loss.
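To make loss, optimizer, and convergence concrete, here is a hand-rolled sketch of what the single-neuron fit is doing: plain batch gradient descent on mean squared error (an illustration of the idea, not how tf.keras implements its optimizers internally).

```python
# a single "neuron" y = w*x + b, trained with batch gradient descent on MSE
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

w, b, lr = 0.0, 0.0, 0.05
for epoch in range(500):
    # the current "guess" for every x
    preds = [w * x + b for x in xs]
    # gradients of the mean-squared-error loss with respect to w and b
    dw = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    db = sum(2 * (p - y) for p, y in zip(preds, ys)) / len(xs)
    # the "optimizer" step: move both parameters against their gradients
    w -= lr * dw
    b -= lr * db

print(round(w, 2), round(b, 2))  # converges to roughly 2.0 and -1.0
```

After enough epochs the parameters settle near \(w = 2\) and \(b = -1\), which is exactly the relationship hidden in the data — that settling is what convergence means.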

Week 2 - Fashion MNIST

Week 2 taught how to implement a deep neural network on computer vision problems. Fashion MNIST dataset was used to train a deep neural network and predictions were made on the testing set.

I became aware that the Fashion MNIST dataset has become a replacement for the traditional handwritten-digits MNIST dataset.

Figure 3. Fashion MNIST dataset - [github]

In a nutshell, the Fashion MNIST dataset contains

  • 60000 training images
  • 10000 testing images
  • Each image is a 28x28 grayscale image
  • 10 classes [T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot]

I learnt how modifying each hyper-parameter of the model affects the model’s accuracy, training time and loss. These hyper-parameter strategies include

  • Increasing the number of neurons in a layer.
  • Adding one or more layers of neurons between first and last layer.
  • Increasing or decreasing the value for epochs.
dnn_fashion_mnist.py
import tensorflow as tf

# load the fashion MNIST dataset
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# normalize the images between 0 and 1
training_images  = training_images / 255.0
test_images      = test_images / 255.0

# create the deep neural network (dnn)
# with a flatten layer, 128 neurons in a dense layer and 10 neurons in a dense layer
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

# compile the model using 
# adam optimizer and sparse categorical crossentropy
model.compile(optimizer = tf.keras.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])

# fit the model on training images and training labels
model.fit(training_images, training_labels, epochs=5)

# evaluate the model on test images and test labels
model.evaluate(test_images, test_labels)

I always wondered how to stop training a model once it has reached a good accuracy number. This week taught me exactly how, using callbacks in model.fit(). The code below implements a callback by overriding the on_epoch_end() method of tf.keras.callbacks.Callback, where you can specify the condition (here, a loss threshold) at which the model must stop training even if some epochs remain.

using_callbacks.py
import tensorflow as tf

# custom callback to stop dnn training
# this stops training once the loss drops below 0.4 (roughly 60% accuracy here)
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs={}):
    if(logs.get('loss')<0.4):
      print("\nReached 60% accuracy so cancelling training!")
      self.model.stop_training = True

# create your custom callback
callbacks = myCallback()

# load the fashion MNIST dataset
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# normalize the images between 0 and 1
training_images = training_images/255.0
test_images     = test_images/255.0

# create the deep neural network (dnn)
# with a flatten layer, 512 neurons in a dense layer and 10 neurons in a dense layer
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# compile the model using 
# adam optimizer and sparse categorical crossentropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# fit the model on training images and training labels with your custom callback
model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])

Key concepts

  • Flatten() is used to convert a square image (28x28) into a 1-dimensional vector of 784 values.
  • on_epoch_end() is used to stop the model's training process given a certain condition has been met.
  • callbacks parameter in model.fit() takes in a callback object created by subclassing tf.keras.callbacks.Callback.

Week 3 - Convolutions

Week 3 taught how to use convolutions Conv2D() and pooling MaxPooling2D() to make deep neural networks more intelligent than a traditional multi-layer perceptron.

Instead of flattening an image and sending it to your DNN’s first layer (which throws away spatial information and limits the DNN’s feature-learning capabilities), you can use convolutions and pooling so the network learns more intricate features while preserving spatial information.

In addition, the extra blank or unwanted regions in an image aren’t of interest to our deep neural network when predicting the class of an image. Using convolutions (as shown in Figure 4), we can scan through the image while keeping as much spatial information as possible.

Ex: If we have an image of size 28x28 and apply a convolution with a filter size of 3x3, then the output size of the image will be 26x26.

Figure 4. Convolution
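The 28x28 → 26x26 shrink from the example above can be reproduced with a minimal "valid" convolution in NumPy (a toy sketch of the operation, not TensorFlow's implementation):

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel across the image with stride 1 and no padding."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # each output pixel is the sum of an element-wise product
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image  = np.ones((28, 28))
kernel = np.ones((3, 3)) / 9.0   # a simple 3x3 averaging filter
print(convolve2d_valid(image, kernel).shape)  # (26, 26)
```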

Similar to convolution, max pooling (as shown in Figure 5) slides a window of a given filter size (say 2x2) along the image, takes the maximum value in each window, and constructs a new, reduced-size image that preserves the important features.

Ex: If we have an image of size 26x26 and apply max pooling with a filter size of 2x2, then the output size of the image will be 13x13.

Figure 5. Max Pooling
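Max pooling can likewise be sketched in a few lines of NumPy: split the image into non-overlapping 2x2 blocks and keep the maximum of each (assuming the image dimensions divide evenly by the pool size):

```python
import numpy as np

def max_pool(image, size=2):
    """Non-overlapping max pooling: keep the max of each size x size block."""
    h, w = image.shape
    # group pixels into (h//size, size, w//size, size) blocks, then take maxima
    blocks = image.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(26 * 26, dtype=float).reshape(26, 26)
print(max_pool(image).shape)  # (13, 13)
```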

The code below shows how to construct a Convolutional Neural Network (CNN) with convolution and pooling layers, followed by flatten and dense layers. Notice that we don’t flatten the input image before the first layer, thus preserving spatial information and helping the CNN learn more accurate features.

mnist_cnn.py
import tensorflow as tf

# custom callback to stop dnn training
# this stops model training once it reaches 99.7% accuracy
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs={}):
    if(logs.get('accuracy')>0.997):
      print("\nReached 99.7% accuracy so cancelling training!")
      self.model.stop_training = True

# create your custom callback
callbacks = myCallback()

# load the handwritten digits MNIST dataset
mnist = tf.keras.datasets.mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# reshape the images to include a single grayscale channel
# and normalize the pixel values between 0 and 1
training_images = training_images.reshape(60000, 28, 28, 1)
training_images = training_images / 255.0
test_images     = test_images.reshape(10000, 28, 28, 1)
test_images     = test_images/255.0

# create the convolutional neural network (cnn) with
#  * two convolution layers of 64 filters each (3x3)
#  * a 2x2 max pooling layer after each convolution
#  * a flatten layer
#  * a dense layer of 128 neurons
#  * a dense layer of 10 neurons
model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

# compile the model with adam optimizer and sparse categorical cross entropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# fit the model with training images and labels with your custom callback
model.fit(training_images, training_labels, epochs=10, callbacks=[callbacks])

Key concepts

  • Conv2D() is used to apply convolution with a filter size parameter.
  • MaxPooling2D() is used to apply max-pooling with a filter size parameter.
  • Applying convolutions on top of a deep neural network will make training faster or slower depending on many factors (hyper-parameters).
  • relu activation means that when an input value (x) enters a neuron, it is thrown away if negative (y=0) and kept unchanged if positive (y=x).
  • model.summary() is used to get a nice summary of each layer's type, output shape and number of parameters.
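The relu rule above is just \(y = \max(0, x)\), and softmax (used in the output layer of the code above) turns raw scores into probabilities; tiny pure-Python versions make both concrete:

```python
import math

def relu(x):
    # negative inputs are zeroed out; positive inputs pass through unchanged
    return max(0.0, x)

def softmax(scores):
    # turns raw scores into probabilities that sum to 1
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print([relu(v) for v in [-2.0, 0.0, 3.0]])  # [0.0, 0.0, 3.0]
probs = softmax([1.0, 2.0, 3.0])
print(round(sum(probs), 6))                 # 1.0
```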

Week 4 - Image Generators

Week 4 taught how to work with real-world images, which aren’t neatly cropped, centred on the subject, or small (28x28) like the ones we saw in benchmark datasets such as Fashion MNIST.

I learnt how to use ImageDataGenerator(), which is a really convenient feature for loading and preparing images. You can point it at a directory and it automatically labels the images based on the names of the sub-directories they sit in, as shown in the code below.

image_generator.py
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# training set (train_dir is the path to a directory with one sub-directory per class)
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
          train_dir,
          target_size=(300,300),
          batch_size=128,
          class_mode="binary")

# validation set
test_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = test_datagen.flow_from_directory(
          validation_dir,
          target_size=(300,300),
          batch_size=128,
          class_mode="binary")

# training uses fit_generator() instead of fit() when data comes from a generator
history = model.fit_generator(
      train_generator,
      steps_per_epoch=8,
      epochs=15,
      validation_data=validation_generator,
      validation_steps=8,
      verbose=2)
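What flow_from_directory() does with labels can be mimicked in plain Python: each sub-directory name becomes a class, and every file inside it gets that class's index (a sketch using a hypothetical horses/humans layout):

```python
import os
import tempfile

# build a hypothetical directory layout like the one flow_from_directory expects:
#   <root>/horses/horse0.jpg ...   <root>/humans/human0.jpg ...
root = tempfile.mkdtemp()
for cls in ["horses", "humans"]:
    os.makedirs(os.path.join(root, cls))
    for i in range(3):
        open(os.path.join(root, cls, f"{cls[:-1]}{i}.jpg"), "w").close()

# infer class indices from the sorted sub-directory names,
# then pair every file with its class's index
classes = sorted(os.listdir(root))
class_indices = {name: idx for idx, name in enumerate(classes)}
labeled = [(fname, class_indices[cls])
           for cls in classes
           for fname in sorted(os.listdir(os.path.join(root, cls)))]

print(class_indices)  # {'horses': 0, 'humans': 1}
print(len(labeled))   # 6
```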

When we do deep learning, we always look for ways to test our model on real-world unseen images. I learnt how to use google.colab to upload files and then make predictions using the model I trained in Colab. The code below explains this process.

test.py
import numpy as np
from google.colab import files
from keras.preprocessing import image

uploaded = files.upload()

for fn in uploaded.keys():
  # predicting images
  path = "/content/" + fn
  img  = image.load_img(path, target_size=(300, 300))
  x    = image.img_to_array(img)
  x    = np.expand_dims(x, axis=0)

  images  = np.vstack([x])
  classes = model.predict(images, batch_size=10)
  print(classes[0])
  if classes[0]>0.5:
    print(fn + " is a human")
  else:
    print(fn + " is a horse") 

Here is a full-fledged code sample that fetches data into Colab, uses an image generator to prepare the training dataset, and creates and trains a convolutional neural network to classify happy or sad images.

happy_or_sad.py
# organize imports
import tensorflow as tf
import os
import zipfile
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# stop learning once the model reaches 99.9% accuracy
DESIRED_ACCURACY = 0.999

# fetch data from a zip file and load it into the virtual machine
!wget --no-check-certificate \
    "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/happy-or-sad.zip" \
    -O "/tmp/happy-or-sad.zip"

# extract the contents from the zipfile
zip_ref = zipfile.ZipFile("/tmp/happy-or-sad.zip", 'r')
zip_ref.extractall("/tmp/h-or-s")
zip_ref.close()

# create a custom callback to stop training when desired accuracy is reached
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs={}):
    if(logs.get('acc')>DESIRED_ACCURACY):
      print("\nReached " + str(DESIRED_ACCURACY * 100) + "% accuracy so cancelling training!")
      self.model.stop_training = True

# instantiate the callback as an object
callbacks = myCallback()

# create the CNN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# compile the model
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['acc'])

# image data generator for training set
train_datagen = ImageDataGenerator(rescale=1/255)
train_generator = train_datagen.flow_from_directory(
        "/tmp/h-or-s",  
        target_size=(150, 150), 
        batch_size=10,
        class_mode='binary')

# train the model with fit_generator()
history = model.fit_generator(
      train_generator,
      steps_per_epoch=2,  
      epochs=15,
      verbose=1,
      callbacks=[callbacks])

Conclusion

I would say that TensorFlow 2.0 has exactly what it needs to make deep learning more accessible to developers who aren’t familiar with the complex programming involved in TensorFlow 1.0. Integrating Keras into TensorFlow 2.0 is a great step for programmers like me who love creating deep neural networks with Keras. This course is a perfect introduction for people who wish to actually implement and see deep neural networks in code without going deep into the math behind them. I highly recommend this course for beginners, enthusiasts and professionals alike, as tools convert ideas into usable products.

In case you found something useful to add to this article, found a bug in the code, or would like to improve some of the points mentioned, feel free to write it down in the comments. Hope you found something useful here.