10 Minutes to Building a CNN Binary Image Classifier in TensorFlow

How to build a binary image classifier using convolutional neural network layers in TensorFlow/Keras

Jul 6 · 5 min read

This is a short introduction to computer vision — namely, how to build a binary image classifier using convolutional neural network layers in TensorFlow/Keras, geared mainly towards new users. This easy-to-follow tutorial is broken down into 3 sections:

The data
The model architecture
The accuracy, ROC curve, and AUC

Requirements: Nothing! All you need to follow this tutorial is this Google Colab notebook containing the data and code. Google Colab allows you to write and run Python code in-browser without any setup, and includes free GPU access!

1. The Data

We’re going to build a dandelion and grass image classifier. I’ve created a small image dataset using images from Google Images, which you can download and parse in the first 8 cells of the tutorial.

By the end of those 8 cells, visualizing a sample of your image dataset will look something like this:

[Figure: a sample of images from the dataset]

Note how some of the images in the dataset aren’t perfect representations of grass or dandelions. For simplicity’s sake, let’s accept that and move on to creating our training and validation datasets.

The data that we fetched earlier is divided into two folders, train and valid. In each of those folders, the subfolders dandelion and grass contain the images of each class. Let’s use the keras.preprocessing.image.ImageDataGenerator class to create our training and validation datasets and normalize our data. This class builds a dataset from a directory and infers each image’s label from the name of the subfolder it sits in, which lets us create a dataset in just one line!
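To make the automatic labeling concrete, here is a minimal pure-Python sketch of the idea. It is not Keras's implementation: it only mimics the convention of numbering class folders in alphanumeric order, which is how dandelion ends up as 0 and grass as 1. The helper name infer_labels and the file names are hypothetical.

```python
import os
import tempfile


def infer_labels(root):
    """Map each file under root/<class>/ to an integer label.

    Illustration only: class folders are numbered in sorted order.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_indices = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            samples.append((os.path.join(folder, fname), class_indices[name]))
    return class_indices, samples


def demo():
    # Build a throwaway folder tree shaped like the tutorial's train/ directory.
    layout = {"grass": ["g1.jpg"], "dandelion": ["d1.jpg", "d2.jpg"]}
    with tempfile.TemporaryDirectory() as root:
        for cls, files in layout.items():
            os.makedirs(os.path.join(root, cls))
            for fname in files:
                open(os.path.join(root, cls, fname), "w").close()
        class_indices, samples = infer_labels(root)
        return class_indices, [label for _, label in samples]


class_indices, labels = demo()
print(class_indices)  # {'dandelion': 0, 'grass': 1}
print(labels)         # [0, 0, 1]
```

The folder names alone decide the labels, so putting an image in the wrong folder silently mislabels it.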

2. The Model Architecture

At the beginning of this section, we first import TensorFlow.

Let’s then add our CNN layers. We’ll first add a convolutional 2D layer with 16 filters, a kernel of 3x3, the input size as our image dimensions, 200x200x3, and the activation as ReLU.

tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(200, 200, 3))

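After that, we’ll add a max pooling layer that halves the image dimensions. The convolution above outputs 198x198x16 (a 3x3 kernel without padding trims one pixel from each edge, and the 16 filters become 16 channels), so after pooling, the output will be 99x99x16.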

tf.keras.layers.MaxPooling2D(2, 2)

We will stack 5 of these convolution and pooling pairs together, with the filter count growing from 16 to 32 to 64 and then staying at 64 for the remaining layers.

Finally, we’ll flatten the output of the CNN layers, feed it into a fully-connected layer, and then to a sigmoid layer for binary classification.
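If you want to check how the image shrinks as it passes through the stack, the arithmetic can be sketched in a few lines of plain Python. This assumes the Conv2D defaults (no padding, stride 1) and 2x2 pooling with stride 2; the function names are mine, not part of Keras.

```python
def conv_out(size, kernel=3):
    # No padding, stride 1: the kernel must fit entirely inside the image.
    return size - kernel + 1


def pool_out(size, pool=2):
    # Non-overlapping pooling windows; a leftover row or column is dropped.
    return size // pool


def trace_shapes(size=200, filters=(16, 32, 64, 64, 64)):
    shapes = []
    for f in filters:
        size = conv_out(size)
        shapes.append(("conv", size, size, f))
        size = pool_out(size)
        shapes.append(("pool", size, size, f))
    # Flatten turns the final height x width x channels volume into one vector.
    flat = size * size * filters[-1]
    return shapes, flat


shapes, flat = trace_shapes()
for kind, h, w, c in shapes:
    print(f"{kind}: {h}x{w}x{c}")
print("flattened:", flat)  # flattened: 1024
```

The last pooling layer leaves a 4x4x64 volume, which is why the fully-connected layer sees 1024 inputs.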

Here is the model that we have built:

model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image: 200x200 with 3 color channels
    # The first convolution
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(200, 200, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The fourth convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The fifth convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 output neuron. It will contain a value from 0-1, where 0 is one class ('dandelion') and 1 is the other ('grass')
    tf.keras.layers.Dense(1, activation='sigmoid')
])

Let’s see a summary of the model we have built:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 198, 198, 16)      448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 99, 99, 16)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 97, 97, 32)        4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 48, 48, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 46, 46, 64)        18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 23, 23, 64)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 21, 21, 64)        36928
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 10, 10, 64)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 8, 8, 64)          36928
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 4, 4, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0
_________________________________________________________________
dense (Dense)                (None, 512)               524800
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513
=================================================================
Total params: 622,753
Trainable params: 622,753
Non-trainable params: 0
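The Param # column can be reproduced by hand, which is a nice sanity check on what each layer actually learns. Here is a small sketch; the helper names are mine, and it assumes every layer uses a bias term, which is the Keras default.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel cell per input channel, plus one bias, for each filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    # One weight per input plus one bias, for each unit.
    return (inputs + 1) * units


# Channel counts entering and leaving each of the five convolutions.
channels = [3, 16, 32, 64, 64, 64]
conv_counts = [conv2d_params(3, 3, channels[i], channels[i + 1]) for i in range(5)]
dense_counts = [dense_params(4 * 4 * 64, 512), dense_params(512, 1)]
total = sum(conv_counts) + sum(dense_counts)

print(conv_counts)   # [448, 4640, 18496, 36928, 36928]
print(dense_counts)  # [524800, 513]
print(total)         # 622753
```

Note that the 512-unit dense layer holds most of the parameters, while pooling and flatten layers have none.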

Next, we’ll configure the specifications for model training. We will train our model with the binary_crossentropy loss, which suits a final layer made of a single sigmoid unit. We will use the RMSprop optimizer. RMSprop is a sensible choice because it adapts the step size for each parameter as training proceeds, which reduces the amount of manual learning-rate tuning (alternatively, we could also use Adam or Adagrad for similar results). We will add accuracy to metrics so that the model reports accuracy during training.
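For the curious, here is a rough sketch of what an RMSprop update does, written for a single scalar parameter on a toy problem. It follows one common formulation of the rule (no momentum); the hyperparameter values and the function name are illustrative assumptions, and the learning rate is deliberately larger than the one we use for the real model so the toy problem converges quickly.

```python
import math


def rmsprop_minimize(grad_fn, w, learning_rate=0.1, rho=0.9, eps=1e-7, steps=200):
    """Run plain RMSprop (no momentum) on a single scalar parameter."""
    avg_sq_grad = 0.0
    for _ in range(steps):
        g = grad_fn(w)
        # A running average of squared gradients tracks the recent gradient scale.
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * g * g
        # Dividing by its square root normalizes the size of each step.
        w -= learning_rate * g / (math.sqrt(avg_sq_grad) + eps)
    return w


# Toy objective f(w) = w**2, whose gradient is 2*w; the minimum is at w = 0.
w_final = rmsprop_minimize(lambda w: 2.0 * w, w=5.0)
print(abs(w_final) < 1.0)
```

Because each step is divided by the recent gradient magnitude, the step size stays in a similar range whether the raw gradients are large or small.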

from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy', optimizer=RMSprop(learning_rate=0.001), metrics=['accuracy'])
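To get a feel for what the binary_crossentropy loss rewards and punishes, here is a small pure-Python sketch of the formula, averaged over a batch. The function is my own illustration rather than the Keras implementation, and the clipping value eps is an assumption chosen only to avoid log(0).

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Labels: 0 = dandelion, 1 = grass.
confident_right = binary_crossentropy([0, 1], [0.1, 0.9])
confident_wrong = binary_crossentropy([0, 1], [0.9, 0.1])
print(round(confident_right, 4))  # 0.1054
print(round(confident_wrong, 4))  # 2.3026
```

Confident wrong answers cost far more than confident right answers earn, which pushes the sigmoid output toward the correct end of the 0-1 range.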

Let’s train for 15 epochs:

history = model.fit(
    train_generator,
    steps_per_epoch=8,
    epochs=15,
    verbose=1,
    validation_data=validation_generator,
    validation_steps=8)

3. The Accuracy, ROC Curve, and AUC

Let’s evaluate the accuracy of our model:

model.evaluate(validation_generator)

Now, let’s calculate our ROC curve and plot it.

First, let’s make predictions on our validation set. When using generators to make predictions, we must first turn off shuffle (as we did when we created validation_generator) and reset the generator:

STEP_SIZE_TEST = validation_generator.n // validation_generator.batch_size
validation_generator.reset()
preds = model.predict(validation_generator, verbose=1)

To create the ROC curve and AUC, we’ll need to compute the false-positive rate and the true-positive rate:

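from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt

# Compare the true labels with the predicted probabilities at every threshold
fpr, tpr, _ = roc_curve(validation_generator.classes, preds)
roc_auc = auc(fpr, tpr)

plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange', lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)
# Diagonal reference line: the expected curve of a random classifier
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()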

[Figure: ROC curve of our model]

The ROC curve plots the true-positive rate (TPR) against the false-positive rate (FPR) at every possible classification threshold.

The AUC (area under the curve), shown in the legend above, summarizes the ROC curve in a single number that measures how well our model distinguishes between our two classes, dandelions and grass: 0.5 corresponds to random guessing and 1.0 to perfect separation. It is also used to compare different models, which I will do in future tutorials when I present how to build an image classifier using fully-connected layers and also transfer learning with ResNet!
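If you would like to see what goes into these two numbers, here is a minimal pure-Python sketch that sweeps the threshold from high to low and integrates the resulting curve with the trapezoid rule. It is an illustration of the idea, not scikit-learn's implementation, and the four labels and scores are made up for the example.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping the threshold from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample that shares this score so ties move as one step.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area


# Hypothetical labels (1 = grass, 0 = dandelion) and sigmoid outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
print(fpr)  # [0.0, 0.0, 0.5, 0.5, 1.0]
print(tpr)  # [0.0, 0.5, 0.5, 1.0, 1.0]
print(trapezoid_auc(fpr, tpr))  # 0.75
```

Here one dandelion (score 0.4) outranks one grass image (score 0.35), so three of the four grass/dandelion pairs are ordered correctly and the AUC is 0.75.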

Finally, at the end of the notebook, you’ll have a chance to make predictions on your own images!

[Figure: You can now make predictions on your own images]
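Since the model ends in a single sigmoid unit, a prediction is just a number between 0 and 1 that still has to be turned into a class name. A minimal sketch of that last step, assuming the usual 0.5 cutoff (the threshold, the helper name, and the sample probabilities are my own choices for illustration):

```python
# Matches the label comment in the model code: 0 = dandelion, 1 = grass.
CLASS_NAMES = {0: "dandelion", 1: "grass"}


def to_class(probability, threshold=0.5):
    """Turn the single sigmoid output into a class name."""
    return CLASS_NAMES[1 if probability > threshold else 0]


for p in (0.03, 0.49, 0.51, 0.97):
    print(p, "->", to_class(p))
```

Moving the threshold trades false positives against false negatives, which is exactly the trade-off the ROC curve visualizes.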

I hope this gives you a gentle introduction to building a simple binary image classifier using CNN layers. If you are interested in similar easy-to-follow, no-nonsense tutorials like this, please check out my other stories!
