
Python Commemorative Coin Series, Part 2: Recognizing Captchas (04)

source link: https://flashgene.com/archives/84465.html

Today we are going to use this dataset to train a neural network.

Studying the Dataset

Whenever we get a dataset, the first step is to look at it carefully. First, we need to learn to tell the classes apart ourselves, so we can guide the network's classification in a targeted way; second, we need a feel for how hard the problem is, which tells us how complex (or how "deep") the network needs to be.

[Figure: screenshot of the dataset folders]

The figure above is a screenshot of our dataset. Notice that the five characters "0", "1", "9", "I", and "O" have no images at all. Is our dataset broken? Checking the original captcha images confirms that these characters really never appear. A moment's thought explains why: all five are easily confused with other characters, so they were most likely excluded when the captchas were generated (kudos to that programmer's attention to detail). Another observation: every character folder holds roughly the same number of images, which helps the network find good parameters for each character without bias toward any class.
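Removing those five ambiguous characters from the 10 digits and 26 uppercase letters leaves 31 classes, which matches the class_num=31 passed to the model later. A quick check (my own arithmetic, not code from the article):

```python
import string

# The captcha generator appears to skip these easily confused characters
excluded = set("019IO")
charset = sorted(set(string.digits + string.ascii_uppercase) - excluded)
print(len(charset))  # 31
```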

Designing the Neural Network

After all that preamble, it is finally time to design the network. There are many Python libraries for building neural networks, such as TensorFlow, PyTorch, and Keras. We won't debate their pros and cons here; I use Keras at work, so Keras is what we'll use.

Since this is an image-classification task, we'll reach for the tool most commonly used on image problems: the convolutional neural network (CNN).

from keras.layers import Flatten, Input, Dropout, Conv2D, MaxPooling2D, Dense
from keras.models import Model
from keras.optimizers import Adam

def model(input_size, class_num):
    # Three conv/pool stages followed by two fully connected layers.
    # Note: every Conv2D after the first must take x as input, not the
    # raw input tensor, otherwise the earlier stages are dead branches.
    input = Input(shape=input_size)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(1024, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(2048, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(class_num, activation='softmax')(x)
    model = Model(inputs=input, outputs=x)
    model.compile(optimizer=Adam(lr=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

This is just about the simplest possible CNN; the model structure is sketched below:

[Figure: diagram of the model structure]

It is simply conv-pool, conv-pool, conv-pool, fully connected, fully connected, with dropout. The problem is easy, so the model doesn't need to be any more complicated.
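As a sanity check on the shapes (my arithmetic, not from the article): a 16×16 input halved by three 2×2 poolings ends up as 2×2 feature maps with 256 channels, so the Flatten layer hands 2·2·256 = 1024 values to the first Dense layer.

```python
h = w = 16                 # input height/width
filters = [16, 64, 256]    # filters in the three conv stages
for f in filters:
    h, w = h // 2, w // 2  # each 2x2 max-pooling halves both dimensions
flattened = h * w * filters[-1]
print(flattened)  # 1024
```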

Training the Neural Network

With the network designed, we can start training, which just means feeding the training images into the model so it can update its parameters. Since most of the preparation was done earlier, we only need to read the images by class and pass them to the model. The code to load the images and build the labels:

import os
import random
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

image_path = './chars'
data = []
labels = []
imagePaths = []
for label in os.listdir(image_path):
    for image in os.listdir(os.path.join(image_path, label)):
        imagePaths.append(os.path.join(image_path, label, image))
# Collect all image paths so we can shuffle them reproducibly
imagePaths = sorted(imagePaths)
random.seed(42)
random.shuffle(imagePaths)
# Read each image and its label
for imagePath in imagePaths:
    # Load as grayscale and resize to 16x16
    image = cv2.imread(imagePath, 0)
    image = cv2.resize(image, (16, 16))
    image = np.expand_dims(image, axis=-1)  # add the channel dimension
    data.append(image)
    # The label is the name of the containing folder
    label = imagePath.split(os.path.sep)[-2]
    labels.append(label)
# Scale pixel values to [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
# Train/test split
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.25, random_state=42)
# One-hot encode the labels
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

The training code:

print("------ Preparing to train the network ------")
# Hyperparameters
EPOCHS = 50
BS = 16
# Build the CNN: 16x16 grayscale input, 31 character classes
model = model(input_size=(16, 16, 1), class_num=31)
H = model.fit(trainX, trainY, validation_data=(testX, testY), epochs=EPOCHS, batch_size=BS)
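The test script later loads the model from './output/cnn.model' and the label binarizer from './output/cnn_lb.pickle', so both need to be saved after training. The article doesn't show that step; it presumably looks something like the sketch below (paths inferred from the test code, toy labels standing in for the real 31 characters):

```python
import os
import pickle
from sklearn.preprocessing import LabelBinarizer

os.makedirs('./output', exist_ok=True)

# model.save('./output/cnn.model')  # Keras saves architecture + weights in one file

# Pickle the fitted label binarizer so predictions can be mapped back to characters
lb = LabelBinarizer().fit(list("2345678ABC"))  # toy labels for illustration
with open('./output/cnn_lb.pickle', 'wb') as f:
    f.write(pickle.dumps(lb))

# Round trip: the reloaded binarizer has the same classes
lb2 = pickle.loads(open('./output/cnn_lb.pickle', 'rb').read())
print(list(lb2.classes_) == list(lb.classes_))  # True
```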

The training code turns out to be the shortest part. See? Training a neural network really isn't that hard.

Here is the output while the network trains:

Train on 332 samples, validate on 111 samples
Epoch 1/50
 16/332 [>.............................] - ETA: 7s - loss: 3.4399 - accuracy: 0.0625
 32/332 [=>............................] - ETA: 4s - loss: 3.4547 - accuracy: 0.0312
 48/332 [===>..........................] - ETA: 3s - loss: 3.4442 - accuracy: 0.0208
 64/332 [====>.........................] - ETA: 2s - loss: 3.4401 - accuracy: 0.0312
 80/332 [======>.......................] - ETA: 2s - loss: 3.4368 - accuracy: 0.0250
 96/332 [=======>......................] - ETA: 2s - loss: 3.4366 - accuracy: 0.0208
112/332 [=========>....................] - ETA: 1s - loss: 3.4371 - accuracy: 0.0179
128/332 [==========>...................] - ETA: 1s - loss: 3.4373 - accuracy: 0.0156
144/332 [============>.................] - ETA: 1s - loss: 3.4358 - accuracy: 0.0139
160/332 [=============>................] - ETA: 1s - loss: 3.4337 - accuracy: 0.0188
176/332 [==============>...............] - ETA: 1s - loss: 3.4330 - accuracy: 0.0170
192/332 [================>.............] - ETA: 1s - loss: 3.4310 - accuracy: 0.0156
208/332 [=================>............] - ETA: 0s - loss: 3.4313 - accuracy: 0.0192
224/332 [===================>..........] - ETA: 0s - loss: 3.4325 - accuracy: 0.0179
240/332 [====================>.........] - ETA: 0s - loss: 3.4300 - accuracy: 0.0208
256/332 [======================>.......] - ETA: 0s - loss: 3.4315 - accuracy: 0.0195
272/332 [=======================>......] - ETA: 0s - loss: 3.4334 - accuracy: 0.0184
288/332 [=========================>....] - ETA: 0s - loss: 3.4341 - accuracy: 0.0208
304/332 [==========================>...] - ETA: 0s - loss: 3.4349 - accuracy: 0.0197
320/332 [===========================>..] - ETA: 0s - loss: 3.4315 - accuracy: 0.0281
332/332 [==============================] - 2s 7ms/step - loss: 3.4340 - accuracy: 0.0271 - val_loss: 3.4193 - val_accuracy: 0.0270

The network updates its parameters on every epoch, and after enough updates it converges toward its best values:

Epoch 50/50
 16/332 [>.............................] - ETA: 1s - loss: 0.0155 - accuracy: 1.0000
 32/332 [=>............................] - ETA: 1s - loss: 0.0132 - accuracy: 1.0000
 48/332 [===>..........................] - ETA: 1s - loss: 0.0259 - accuracy: 1.0000
 64/332 [====>.........................] - ETA: 1s - loss: 0.0289 - accuracy: 1.0000
 80/332 [======>.......................] - ETA: 1s - loss: 0.0247 - accuracy: 1.0000
 96/332 [=======>......................] - ETA: 1s - loss: 0.0271 - accuracy: 1.0000
112/332 [=========>....................] - ETA: 1s - loss: 0.0251 - accuracy: 1.0000
128/332 [==========>...................] - ETA: 1s - loss: 0.0243 - accuracy: 1.0000
144/332 [============>.................] - ETA: 1s - loss: 0.0230 - accuracy: 1.0000
160/332 [=============>................] - ETA: 1s - loss: 0.0234 - accuracy: 1.0000
176/332 [==============>...............] - ETA: 0s - loss: 0.0318 - accuracy: 0.9943
192/332 [================>.............] - ETA: 0s - loss: 0.0372 - accuracy: 0.9896
208/332 [=================>............] - ETA: 0s - loss: 0.0354 - accuracy: 0.9904
224/332 [===================>..........] - ETA: 0s - loss: 0.0395 - accuracy: 0.9866
240/332 [====================>.........] - ETA: 0s - loss: 0.0521 - accuracy: 0.9833
256/332 [======================>.......] - ETA: 0s - loss: 0.0491 - accuracy: 0.9844
272/332 [=======================>......] - ETA: 0s - loss: 0.0531 - accuracy: 0.9816
288/332 [=========================>....] - ETA: 0s - loss: 0.0510 - accuracy: 0.9826
304/332 [==========================>...] - ETA: 0s - loss: 0.0488 - accuracy: 0.9836
320/332 [===========================>..] - ETA: 0s - loss: 0.0488 - accuracy: 0.9844
332/332 [==============================] - 2s 6ms/step - loss: 0.0478 - accuracy: 0.9849 - val_loss: 0.0197 - val_accuracy: 0.9910

Below are the curves of the metrics over the whole training run:

[Figure: training/validation loss and accuracy curves]

In short: over the course of training, the loss on both the training and validation sets keeps dropping toward 0, while the model's accuracy approaches 1.
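The article doesn't include the plotting code, but curves like these can be drawn from the History object returned by model.fit. A minimal sketch, using a toy stand-in for H.history (the key names match what Keras prints in the log above):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# H.history maps metric names to per-epoch values; toy values here
history = {
    "loss":         [3.43, 1.20, 0.05],
    "val_loss":     [3.42, 1.10, 0.02],
    "accuracy":     [0.03, 0.70, 0.98],
    "val_accuracy": [0.03, 0.75, 0.99],
}
epochs = range(1, len(history["loss"]) + 1)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
for key in ("loss", "val_loss"):
    ax1.plot(epochs, history[key], label=key)
for key in ("accuracy", "val_accuracy"):
    ax2.plot(epochs, history[key], label=key)
ax1.set_xlabel("epoch"); ax1.legend()
ax2.set_xlabel("epoch"); ax2.legend()
fig.savefig("training_curves.png")
```

With a real run, replace the toy dict with H.history.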

Testing the Neural Network

Let's grab a couple of characters at random for testing:

[Figure: test character images]

The test code:

import pickle
import cv2
import numpy as np
from keras.models import load_model

# Load a test image and apply the same preprocessing as in training
image = cv2.imread('./test_chars/3/1.jpg', 0)
output = image.copy()
image = cv2.resize(image, (16, 16))
# Scale pixel values to [0, 1]
image = image.astype("float") / 255.0
image = np.expand_dims(image, axis=-1)
# Add the batch dimension
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# Load the trained model and the label binarizer
print("------ Loading model and labels ------")
model = load_model('./output/cnn.model')
lb = pickle.loads(open('./output/cnn_lb.pickle', "rb").read())
# Predict
preds = model.predict(image)
# Take the most probable class and map it back to a character
i = preds.argmax(axis=1)[0]
label = lb.classes_[i]
# Format and print the result
text = "{}: {:.2f}%".format(label, preds[0][i] * 100)
print(text)

The output:

[Figure: prediction output for the first test character]

Let's try another one:

[Figure: second test character image]

The output:

[Figure: prediction output for the second test character]

Both experiments show that our neural network performs well.

Postscript

That wraps up captcha recognition.

All of the source code for this series will be kept in the GitHub repository below. Feel free to use it as a reference, and corrections are welcome. Thanks!

https://github.com/TitusWongCN/AutoTokenAppointment
