Imports

Let's load some important libraries:

from keras.preprocessing.image import ImageDataGenerator, load_img
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

Getting to know the data

Let's get to know the data by viewing two sample images: one from a patient in normal condition and one from a patient with pneumonia.

import matplotlib.pyplot as plt
img_name = 'NORMAL2-IM-0588-0001.jpeg'
img_normal = load_img('../input/chest_xray/chest_xray/train/NORMAL/' + img_name)
plt.imshow(img_normal)
plt.show()

img_name = 'person63_bacteria_306.jpeg'
img_pneumonia = load_img('../input/chest_xray/chest_xray/train/PNEUMONIA/' + img_name)
print('PNEUMONIA')
plt.imshow(img_pneumonia)
plt.show()

Preparing data to feed into model

Setting some important variables (image size, sample counts, epochs, and batch size):

img_width, img_height = 150, 150
nb_train_samples = 5217
nb_validation_samples = 17
epochs = 20
batch_size = 16

Each image is resized to 150 by 150 pixels. There are 5217 samples to train on and 17 samples to validate on (we will add more via data augmentation later). Validation data is used to evaluate the loss function during training, as opposed to test data, which is used to evaluate the metric after training. Training will run for 20 epochs, in batches of 16 images.
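As a quick check on these numbers, the sketch below works out how many full batches one pass over each set takes. The names steps_per_epoch, validation_steps, and leftover_train are illustrative and do not appear in the original; floor division, which drops a final partial batch, is one common convention and is assumed here.

```python
# Values taken from the variables defined above.
nb_train_samples = 5217
nb_validation_samples = 17
batch_size = 16

# Illustrative names (assumptions): full batches in one pass over each set.
steps_per_epoch = nb_train_samples // batch_size
validation_steps = nb_validation_samples // batch_size

# Images that do not fill a whole batch.
leftover_train = nb_train_samples % batch_size

print(steps_per_epoch, validation_steps, leftover_train)
```

With a batch size of 16, the validation set amounts to a single full batch, which is why the validation loss can be noisy from epoch to epoch.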

Specifying the directories for images:

train_data_dir = '../input/chest_xray/chest_xray/train'
validation_data_dir = '../input/chest_xray/chest_xray/val'
test_data_dir = '../input/chest_xray/chest_xray/test'

Lastly, the input shape must match the backend's image data format, which places the channel axis either first or last:

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

Keras loads images in RGB mode by default, so each pixel has three separate color values, hence the depth of 3. If the images were loaded as single-channel grayscale, like the MNIST dataset, the depth would be 1.
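To make the two layouts concrete, here is a small pure-Python sketch that does not need Keras. The helper make_input_shape is hypothetical and only mirrors the if/else above, with the channel count exposed as a parameter.

```python
def make_input_shape(data_format, width, height, channels=3):
    """Return the per-image shape for the given layout name (hypothetical helper)."""
    if data_format == 'channels_first':
        return (channels, width, height)
    if data_format == 'channels_last':
        return (width, height, channels)
    raise ValueError('unknown data format: ' + repr(data_format))

# Three-channel images in both layouts.
print(make_input_shape('channels_last', 150, 150))
print(make_input_shape('channels_first', 150, 150))

# A single-channel image, such as an MNIST digit.
print(make_input_shape('channels_last', 28, 28, channels=1))
```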

Creating the Model

The model follows a standard CNN formula: three repetitions of a convolution layer, an activation layer, and a pooling layer, followed by a flattening layer and a standard dense layer with a ReLU activation. A dropout layer is added near the end to further regularize, followed by a final single-unit dense layer with a sigmoid activation.

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
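To see how the image shrinks as it passes through these layers, the following sketch traces the shapes and counts the parameters by hand, without Keras. It assumes the Keras defaults the model relies on: 'valid' padding and stride 1 for Conv2D, and a pooling stride equal to the pool size. The helper names conv_out and pool_out are illustrative.

```python
def conv_out(size, kernel=3):
    # 'valid' padding, stride 1: the kernel must fit entirely inside the image.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling windows; a leftover row or column is dropped.
    return size // pool

size, channels = 150, 3
total_params = 0
for filters in (32, 32, 64):
    # Each filter has 3 * 3 * channels weights plus one bias.
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters
    print('after conv + pool block:', (size, size, channels))

flat = size * size * channels
total_params += (flat + 1) * 64  # Dense(64): weights plus biases
total_params += (64 + 1) * 1     # Dense(1): weights plus bias
print('flattened length:', flat)
print('trainable parameters:', total_params)
```

Most of the parameters sit in the first dense layer, which connects every value of the flattened feature map to 64 units.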

For more on Keras layers and what they do, check out this article.

We can get information on the layers by calling model.layers.

We can also get an idea of what the inputs and outputs should be with model.input and model.output.

[Figures: output for model.input; output for model.output]

Next, we must compile the model with a loss function, an optimizer, and a metric. In this case, the loss function is binary cross-entropy, the standard choice for two-class problems. The optimizer is rmsprop, which works well on images where the classification depends on very small changes in the image. The code to compile the model is below:

model.compile(loss='binary_crossentropy',
 optimizer='rmsprop',
 metrics=['accuracy'])
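For intuition about what this loss measures, here is a minimal pure-Python sketch of the sigmoid and of binary cross-entropy for a single prediction. It illustrates the formula only and is not Keras's implementation; the eps clipping value is an assumption chosen to avoid log(0).

```python
import math

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    # Clip the probability so that log() never receives exactly 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# True label 1 (for example, pneumonia) with three different predictions.
loss_uncertain = binary_crossentropy(1, sigmoid(0.0))  # prediction of 0.5
loss_confident_right = binary_crossentropy(1, 0.99)
loss_confident_wrong = binary_crossentropy(1, 0.01)

print(loss_uncertain, loss_confident_right, loss_confident_wrong)
```

The loss is small for a confident correct prediction and grows quickly for a confident wrong one, which is what pushes the sigmoid output toward the true label.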

Data Augmentation

There are only 17 images for validation, so how will we get more data? The answer: data augmentation, which produces randomly transformed copies of existing images. The training generator below rescales pixel values and applies random shear, zoom, and horizontal flips:

train_datagen = ImageDataGenerator(
 rescale=1. / 255,
 shear_range=0.2,
 zoom_range=0.2,
 horizontal_flip=True)

The validation and test images are only rescaled, not augmented:

test_datagen = ImageDataGenerator(rescale=1. / 255)
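To make two of these options concrete, the sketch below applies a rescale by 1/255 and a horizontal flip to a tiny hand-written image, using plain Python lists. It only illustrates the idea; the functions rescale and horizontal_flip are hypothetical stand-ins, not the ImageDataGenerator internals, and the real generator applies its transforms randomly per image.

```python
# A tiny 2x3 single-channel "image" with 8-bit pixel values.
image = [[0, 51, 255],
         [102, 153, 204]]

def rescale(img, factor=1.0 / 255):
    # Multiply every pixel by the factor, mapping 0..255 to 0..1.
    return [[pixel * factor for pixel in row] for row in img]

def horizontal_flip(img):
    # Mirror each row left to right.
    return [list(reversed(row)) for row in img]

scaled = rescale(image)
flipped = horizontal_flip(image)

print(scaled)
print(flipped)
```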

The following code uses flow_from_directory to apply the training data generator to the images in the training directory, resizing them to 150 by 150 pixels and yielding batches with binary labels:

train_generator = train_datagen.flow_from_directory(
 train_data_dir,
 target_size=(img_width, img_height),
 batch_size=batch_size,
 class_mode='binary')
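flow_from_directory treats each subdirectory as one class and assigns label indices in alphanumeric order of the subdirectory names. The sketch below mimics that rule on a temporary directory so it runs without the dataset; infer_class_indices is a hypothetical helper, not part of Keras. On the real generator, the mapping should be available as train_generator.class_indices.

```python
import os
import tempfile

def infer_class_indices(directory):
    # One class per subdirectory, indexed in sorted (alphanumeric) order.
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name)))
    return {name: index for index, name in enumerate(class_names)}

# Build a throwaway directory tree shaped like the train folder.
with tempfile.TemporaryDirectory() as root:
    for name in ('PNEUMONIA', 'NORMAL'):
        os.makedirs(os.path.join(root, name))
    class_indices = infer_class_indices(root)

print(class_indices)
```

Under this rule NORMAL maps to 0 and PNEUMONIA to 1, so a sigmoid output close to 1 would correspond to a pneumonia prediction.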

The following code creates the generator for validation:

validation_generator = test_datagen.flow_from_directory(
 validation_data_dir,
 target_size=(img_width, img_height),
 batch_size=batch_size,
 class_mode='binary')

And this one for test:

test_generator = test_datagen.flow_from_directory(
 test_data_dir,
 target_size=(img_width, img_height),
 batch_size=batch_size,
 class_mode='binary')

