YOLO v3 Object Detection with Keras

An Explanation of YOLO v3 in a nutshell with Keras Implementation

Jun 15 ·5min read

Video by YOLO author,

Joseph Redmon

About YOLO v3 Algorithm

“You Only Look Once” (YOLO) is an object detection algorithm that is known for its high accuracy while it is also being able to run in real-time due to its speed detection. Unlike the previous algorithm, the third version facilitates an efficient tradeoff between speed and accuracy simply by changing the size of the model where retraining is not necessary.

Before we start to implement object detection with YOLO v3, we need to download the pre-train model weights . Downloading this may take a while, so you can prepare your coffee while waiting. YOLO v3 is written in the DarkNet framework which is open-source Neural Network in C. This makes me feel so intimidated in the first place.

But thankfully, this code is strongly inspired by experiencor’s keras-yolo3 projec t for performing the YOLO v3 model using Keras. Throughout this whole implementation, I am going to run this on Google Colab. Besides, we are going to using these cute dogs image for object detection.

e6vauqa.jpg!web

Photo by Alvan Nee on Unsplash

So let’s get our hand dirty!!

Step 1:

Jump into the very first step, the following are the necessary libraries and dependencies.

Step 2:

Next, the WeightReader class is used to parse the “yolov3. weights” file and load the model weights into memory.

Step 3:

YOLO v3 is using a new network to perform feature extraction which is undeniably larger compare to YOLO v2. This network is known as Darknet-53 as the whole network composes of 53 convolutional layers with shortcut connections (Redmon & Farhadi, 2018) .

UziuayI.png!web

YOLO v3 network has 53 convolutional layers (Redmon & Farhadi, 2018)

Therefore, the code below composes several components which are:

_conv_block function which is used to construct a convolutional layer
make_yolov3_model function which is used to create layers of convolutional and stack together as a whole.

Step 4:

Next, the following code is explained as below:

save

Step 4:

This step involves decoding the prediction output into bounding boxes

The output of the YOLO v3 prediction is in the form of a list of arrays that hardly to be interpreted. As YOLO v3 is a multi-scale detection, it is decoded into three different scales in the shape of (13, 13, 225), (26, 26, 225), and (52, 52, 225)

EZ3Qjaq.png!web

A slice of YOLOv3 prediction output before it gets decoded

decode_netout function is used to decode the prediction output into boxes

In a nutshell, this is how the decode_netout function work as illustrated below:

An Explanation of YOLO v3 in a nutshell with Keras Implementation

About YOLO v3 Algorithm

Step 1:

Step 2:

Step 3:

Step 4:

Step 4:

Recommend

Value Iteration for V-function

From directory deletion to SYSTEM shell

Silq is a new high-level programming language for quantum computers

LaTeX Typesetting – Part 1 (Lists)

使用Apache Spark和Apache Hudi构建分析数据湖

微服务技术栈：常见注册中心组件，对比分析

Pkg.go.dev is open source!

Deep learning on graphs: successes, challenges, and next steps

Introducing Windows Insider Channels

Guide to Write Recursive Functions using Python

About Joyk