A Quick Introduction to TensorFlow 2.0 for Deep Learning

After much community hype and anticipation, TensorFlow 2.0 was finally released by Google on September 30, 2019.

TensorFlow 2.0 represents a major milestone in the library’s development. Over the past few years, one of TensorFlow’s main weaknesses, and a big reason many people switched over to PyTorch, was its very complicated API.

Defining deep neural networks required far more work than was reasonable. This led to the development of several high-level APIs that sat on top of TensorFlow, including TF Slim and Keras.

Now things have come full circle, as Keras is the official high-level API of TensorFlow 2.0. Loading data, defining models, training, and evaluating are all now much easier to do, with cleaner Keras-style code and faster development time.

This article will be a quick introduction to the new TensorFlow 2.0 way of doing Deep Learning using Keras. We’ll go through an end-to-end pipeline of loading our dataset, defining our model, training, and evaluating, all with the new TensorFlow 2.0 API. If you’d like to run the entire code yourself, I’ve set up a Google Colab Notebook with the whole thing!

Import and Setup

We’ll start off by importing TensorFlow, Keras, and Matplotlib. Notice how we pull Keras directly from TensorFlow using tensorflow.keras, as it’s now bundled right within it. We also have an if statement to install version 2.0 in case our notebook is running an older version.
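The original notebook cell is not reproduced in this text, so the following is only a minimal sketch of the setup step. As an assumption, it checks the installed version and raises an error instead of installing version 2.0, and the variable name tf_major_version is purely illustrative.

```python
import tensorflow as tf
from tensorflow import keras          # Keras ships inside TensorFlow 2.x
from tensorflow.keras import layers   # layer classes used to build the model
import matplotlib.pyplot as plt       # used later for plotting

# Parse the major version number, e.g. "2.0.0" -> 2.
tf_major_version = int(tf.__version__.split(".")[0])

# Assumption: fail loudly on TensorFlow 1.x rather than installing anything.
if tf_major_version < 2:
    raise RuntimeError(
        "TensorFlow 2.x is required, found " + tf.__version__
        + "; upgrade with: pip install --upgrade tensorflow"
    )

print("TensorFlow version:", tf.__version__)
```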

Next, we’ll load up our dataset. For this tutorial, we’re going to use the MNIST dataset, which contains 60,000 training images and 10,000 test images of the digits 0 to 9, each 28x28 pixels in size. It’s a pretty basic dataset that is used all the time for quick tests and PoCs. There’s also some visualization code using Matplotlib so we can take a look at the data.
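A rough sketch of the loading and visualization step is shown here. The helper names load_mnist and plot_digits are hypothetical and are not taken from the original notebook; the loader itself is the standard keras.datasets.mnist.load_data() call, which downloads the data on first use.

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras


def load_mnist():
    """Download (on first call) and return the MNIST train/test splits."""
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    return (x_train, y_train), (x_test, y_test)


def plot_digits(images, labels, n=8):
    """Show the first n images in one row, each titled with its label."""
    fig, axes = plt.subplots(1, n, figsize=(1.5 * n, 2), squeeze=False)
    for ax, image, label in zip(axes[0], images[:n], labels[:n]):
        ax.imshow(image, cmap="gray")
        ax.set_title(str(label))
        ax.axis("off")
    return fig
```

In a notebook, calling load_mnist() and passing x_train and y_train to plot_digits() displays the first eight training digits. The returned x_train array has shape (60000, 28, 28) and x_test has shape (10000, 28, 28).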

Figure: Visualizing MNIST digits

Creating a Convolutional Neural Network for Image Classification

A Convolutional Neural Network (CNN) is the standard choice for image classification. The tensorflow.keras.layers API has everything we need to build such a network. Since MNIST is quite small (28x28 images and only 60,000 training examples), we don’t need a huge network, so we’ll keep it simple.

The formula for building a good CNN has largely remained the same over the past few years: stack convolution layers (typically 3x3 or 1x1) with non-linear activations in between (typically ReLU), then add a couple of fully connected layers and a Softmax function at the very end to get the class probabilities. We’ve done all of that in the network definition below.

Our model has a total of 6 convolutional layers with a ReLU activation after each one. After the convolutional layers, a GlobalAveragePooling2D layer averages each feature map down to a single value, turning the feature maps into a flat vector. We finish off with our fully-connected (Dense) layers, with the last one having a size of 10 for the 10 classes of MNIST.

Again, notice how all of our model layers come right from tensorflow.keras.layers and that we’re using the functional API of Keras. With the functional API, we build our model as a chain of function calls. The first layer takes the input image as its input. Each subsequent layer then takes the output of the previous layer as its input. Finally, the Model() constructor simply connects the “pipeline” from the input tensor to the output tensor.
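The original model code is not included in this text, so the sketch below is a reconstruction rather than the author’s exact definition. The filter counts come from the model summary printed in this article; the 3x3 kernels, "same" padding, and stride-2 convolutions are inferred from its output shapes and parameter counts, and the ReLU after the first Dense layer is an assumption. The function name build_model is hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_shape=(28, 28, 1), num_classes=10):
    """Six 3x3 convolutions, global average pooling, then two Dense layers."""
    inputs = keras.Input(shape=input_shape)

    # (filters, strides) per convolution; stride 2 halves height and width.
    conv_config = [(32, 1), (32, 2), (64, 1), (64, 2), (64, 1), (64, 1)]
    x = inputs
    for filters, strides in conv_config:
        x = layers.Conv2D(filters, kernel_size=3, strides=strides,
                          padding="same")(x)
        x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)   # (None, 7, 7, 64) -> (None, 64)
    x = layers.Dense(32)(x)
    x = layers.Activation("relu")(x)         # assumed activation type
    x = layers.Dense(num_classes)(x)
    outputs = layers.Activation("softmax")(x)

    # Model() ties the input tensor to the output tensor.
    return keras.Model(inputs=inputs, outputs=outputs)


model = build_model()
model.summary()
```

Each layer is called on the output of the previous one, which is the functional-API pattern described above. The layer names in the printed summary will differ from run to run.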

For a more detailed description of the model, see the output of model.summary() below.

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         [(None, 28, 28, 1)]       0
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 28, 28, 32)        320
_________________________________________________________________
activation_16 (Activation)   (None, 28, 28, 32)        0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 14, 14, 32)        9248
_________________________________________________________________
activation_17 (Activation)   (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 14, 14, 64)        18496
_________________________________________________________________
activation_18 (Activation)   (None, 14, 14, 64)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_19 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_20 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 7, 7, 64)          36928
_________________________________________________________________
activation_21 (Activation)   (None, 7, 7, 64)          0
_________________________________________________________________
global_average_pooling2d_2 ( (None, 64)                0
_________________________________________________________________
dense_4 (Dense)              (None, 32)                2080
_________________________________________________________________
activation_22 (Activation)   (None, 32)                0
_________________________________________________________________
dense_5 (Dense)              (None, 10)                330
_________________________________________________________________
activation_23 (Activation)   (None, 10)                0
=================================================================
Total params: 141,258
Trainable params: 141,258
Non-trainable params: 0
_________________________________________________________________
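The parameter counts in the summary can be checked by hand. The short sketch below recomputes them, assuming 3x3 kernels with one bias per filter (the kernel size is inferred from the counts, not stated in the article). The helper names are illustrative only.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (k * k * in * out) plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights (in * out) plus one bias per unit."""
    return in_features * units + units


layer_params = [
    conv2d_params(3, 1, 32),    # conv2d_12: 320
    conv2d_params(3, 32, 32),   # conv2d_13: 9248
    conv2d_params(3, 32, 64),   # conv2d_14: 18496
    conv2d_params(3, 64, 64),   # conv2d_15: 36928
    conv2d_params(3, 64, 64),   # conv2d_16: 36928
    conv2d_params(3, 64, 64),   # conv2d_17: 36928
    dense_params(64, 32),       # dense_4: 2080
    dense_params(32, 10),       # dense_5: 330
]

total = sum(layer_params)
print(total)  # 141258, matching "Total params" in the summary
```

The activation and pooling layers contribute no parameters, which is why they show 0 in the summary.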

Training and Testing

Here comes the best part: training and getting our actual results!

First off, we’ll need to do a bit of data preprocessing to have the data properly formatted for training. Our training images need to be in a 4-dimensional array with the shape:

(batch_size, height, width, channels)

We convert the images to float32, which the model expects for training, and normalize them so that each pixel has a value between 0.0 and 1.0.
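A minimal sketch of this preprocessing step follows. It uses a small random array as a stand-in for the MNIST images so that it runs without downloading anything; the function name preprocess_images is hypothetical.

```python
import numpy as np


def preprocess_images(images):
    """Scale uint8 pixels to [0, 1] as float32 and add a channel axis."""
    images = images.astype("float32") / 255.0
    # (batch, 28, 28) -> (batch, 28, 28, 1): one grayscale channel.
    return np.expand_dims(images, axis=-1)


# Stand-in data with the same dtype and per-image shape as MNIST.
fake_images = np.random.randint(0, 256, size=(16, 28, 28), dtype=np.uint8)
x = preprocess_images(fake_images)
print(x.shape, x.dtype)  # (16, 28, 28, 1) float32
```

The same function would be applied to both the training and the test images.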

As for the labels, since we are using a Softmax activation, we’ll want our target output to be in the form of one-hot encoded vectors. To do so, we use the tf.keras.utils.to_categorical() function. The second argument is set to 10 since we have 10 classes.
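As a small illustration of the one-hot encoding, the sketch below converts a handful of made-up integer labels. The label values are placeholders chosen for the example, not data from the article.

```python
import numpy as np
import tensorflow as tf

# Placeholder integer labels in the range 0..9.
labels = np.array([5, 0, 4, 1, 9])

# Second argument: the number of classes.
one_hot = tf.keras.utils.to_categorical(labels, 10)

print(one_hot.shape)  # (5, 10)
print(one_hot[0])     # all zeros except a 1.0 at index 5
```

Each row has exactly one entry set to 1.0, at the index of the original class label.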

We select Adam as our optimizer of choice: it’s easy to use and works well out of the box. We set the loss function to categorical_crossentropy, which is compatible with our Softmax output. Training the CNN is then as easy as calling the Keras .fit() function with our data as input!
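The compile-and-fit step might look roughly like the sketch below. To keep it self-contained and fast, it trains a deliberately tiny stand-in model on random data rather than the article’s CNN and MNIST; the batch size, epoch count, and validation split are assumptions, not the author’s settings.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data shaped like preprocessed MNIST (not real digits).
rng = np.random.default_rng(0)
x_train = rng.random((64, 28, 28, 1)).astype("float32")
y_train = tf.keras.utils.to_categorical(rng.integers(0, 10, size=64), 10)

# Tiny stand-in model; substitute the full CNN in practice.
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs, outputs)

# Adam optimizer with categorical cross-entropy, as described above.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# fit() returns a History object whose .history dict holds per-epoch metrics.
history = model.fit(x_train, y_train,
                    batch_size=32, epochs=1,
                    validation_split=0.25, verbose=0)
print(sorted(history.history.keys()))
```

On random data the accuracy is meaningless; the point is only the sequence of calls.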

Notice how all of this is almost purely Keras. Really the only difference is that we are using the Keras library from TensorFlow, i.e., tensorflow.keras. It’s incredibly convenient as it comes in one nice package: the power of TensorFlow with the ease of Keras. Brilliant!

MNIST is an easy dataset so our CNN should reach high accuracy quite quickly. In my own experiments, it got to about 97% within 5 epochs.

Once training is complete, we can plot the history of the loss and accuracy. Once again, we use pure Keras code to pull the loss and accuracy information from the history. Matplotlib is used for easy plotting.
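One possible way to plot the training curves is sketched below. It assumes the model was compiled with metrics=["accuracy"], so that the history dictionary contains "loss" and "accuracy" keys (plus "val_loss" and "val_accuracy" when validation data is used). The function name plot_history is hypothetical.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot loss and accuracy per epoch from a Keras History.history dict."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    for key, ax in (("loss", ax_loss), ("accuracy", ax_acc)):
        ax.plot(history_dict[key], label="train")
        # Validation curves exist only if validation data was passed to fit().
        if "val_" + key in history_dict:
            ax.plot(history_dict["val_" + key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(key)
        ax.legend()
    return fig
```

After training, passing history.history (from the object returned by fit()) to plot_history() draws both curves side by side.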
