
Is Facial Recognition Technology Racist? State of the Art algorithms explained

source link: https://towardsdatascience.com/is-facial-recognition-technology-racist-state-of-the-art-algorithms-explained-dbb1c95c4f5d?gi=aff1221becb2

Learn about the face recognition algorithms that have caused an uproar in the news.


Copyright 2018 American Civil Liberties Union.

Originally posted by the ACLU at https://www.aclu.org/news/privacy-technology/wrongfully-arrested-because-face-recognition-cant-tell-black-people-apart/

Does this man look familiar? This is Robert Williams, who was misidentified by a police facial recognition system and spent a day under arrest. As this incident gets thrown around in the media, it’s important to remember that it’s easy to criticize a technology while having limited or nonexistent information about how it works. Countless media sources have criticized every component of the technology, while the actual algorithm has remained enigmatic.

In this blog post, I will walk through a state-of-the-art face recognition algorithm in a way that caters to both experienced professionals and casual, uninformed readers. I hope this post helps you understand the algorithm that is being criticized all over the news right now and helps you bring much-needed information into discussions about this controversy.

What is Face Recognition?

When I say the words “Face Recognition”, a variety of visuals should come to mind, many of which you may remember from James Bond or Mission Impossible movies where the protagonist’s team has to change the face database to allow the protagonist into the secret bunker. Or maybe you think of a country like China or North Korea using face recognition technology to violate people’s privacy.

The official definition of face recognition strips all of the pop culture away. It is simply the detection and classification of a person’s face. This implies that a facial recognition system should have two components: first detecting a face in an image, then finding the identity of that face.

  • Face detection is a very similar problem to object detection, except instead of the entities of interest being everyday objects, they’re the faces of individuals.
  • Face identification is the problem of matching a detected face to an identification image in a pre-existing database. This is the same database that the hackers change in every spy movie.
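The two components above can be sketched as a toy pipeline. The `detect_faces` and `embed_face` functions below are illustrative stubs of my own, standing in for a real detector (like RetinaFace) and a real embedding network; only the overall detect-then-match structure is the point:

```python
import numpy as np

def detect_faces(image):
    """Return a list of (x, y, w, h) bounding boxes. Stub for a real detector."""
    h, w = image.shape[:2]
    return [(w // 4, h // 4, w // 2, h // 2)]  # pretend one face was found

def embed_face(face_crop):
    """Map a face crop to a fixed-length feature vector. Stub for a real embedder."""
    vec = face_crop.astype(np.float64).mean(axis=(0, 1))  # toy 3-d "embedding"
    return vec / (np.linalg.norm(vec) + 1e-8)

def identify(image, database):
    """Detect faces, then match each one to the closest identity in `database`."""
    matches = []
    for (x, y, w, h) in detect_faces(image):
        emb = embed_face(image[y:y + h, x:x + w])
        # cosine similarity against every enrolled identity's embedding
        name = max(database, key=lambda n: float(np.dot(emb, database[n])))
        matches.append(((x, y, w, h), name))
    return matches
```

The database here is just a dictionary mapping names to reference embeddings, the stand-in for the identification database that gets swapped out in every spy movie.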

Face Detection

To understand how face detection works, let’s go through the state-of-the-art algorithm, RetinaFace. Now casual readers, don’t run away at the mention of a paper. Don’t worry, in this blog, I’ll do my best to make the algorithm as intuitive as possible, while also refraining from the oversimplification that has plagued media coverage.

The RetinaFace algorithm is what’s called an end-to-end, or single-stage, detector in the lingo. If you’re familiar with object detection strategies, it is similar to the SSD or YOLO architectures.

Output Details

The RetinaFace algorithm outputs three pieces of information about the face detected:

  • The bounding box of the face, denoted by the bottom left corner of the box and its width and height.
  • Five facial landmarks denoting the locations of the eyes, nose, and mouth.
  • A dense 3D mapping of points, similar to what your cell phone uses to recognize you for a feature like Face ID on iPhones.
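As a sketch, the three outputs for one detected face might be bundled like this. The field names and layout are my own illustration, not the paper’s actual API:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class FaceDetection:
    """One detected face: box, five landmarks, and optional dense 3D points."""
    box: Tuple[float, float, float, float]            # (x, y, width, height)
    landmarks: List[Tuple[float, float]]              # 5 points: eyes, nose, mouth
    dense_points: List[Tuple[float, float, float]] = field(default_factory=list)

    def __post_init__(self):
        # RetinaFace predicts exactly five facial landmarks per face
        if len(self.landmarks) != 5:
            raise ValueError("expected exactly five facial landmarks")
```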

Feature Extraction


Feature Pyramid Networks for Object Detection ( arXiv:1612.03144 ) (CC)

Like most modern computer vision algorithms, RetinaFace uses deep neural networks as feature extractors. More specifically, RetinaFace uses the ResNet architecture along with a Feature Pyramid Network (FPN) to produce a rich feature representation of the image.

Intuitively, you can imagine these features capturing different levels of abstraction in the image. In the face detection realm, this means early features encode edges, mid-level features encode facial parts such as eyes, mouths, and noses, and high-level features encode the faces themselves. The FPN simply allows the model to make use of both the high-level and low-level features, which greatly aids in detecting smaller faces in images.
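A minimal numpy sketch of the FPN idea: a coarse, semantically rich feature map from a late stage is upsampled and merged with a finer, lower-level map from an early stage. Real FPNs use learned 1x1 convolutions on these lateral connections; those are omitted here:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def fpn_merge(low_level, high_level):
    """Top-down merge: upsample the coarse map and add the fine-grained one."""
    return low_level + upsample2x(high_level)

low = np.random.rand(8, 8, 16)   # fine map from an early ResNet stage
high = np.random.rand(4, 4, 16)  # coarse map from a late ResNet stage
merged = fpn_merge(low, high)    # (8, 8, 16): fine detail plus semantics
```

The merged map keeps the fine spatial resolution of the early stage, which is why small faces benefit most.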

Training

Training is the process by which a randomly initialized network is taught to perform its task. It is similar to teaching a child to do well on a test: the child is given information about the topic and then takes some sort of evaluation to see how well they learned it. The training of deep neural networks is similar, except the network is given labeled data (in this case, images where the faces are labeled) and the evaluation is done using a loss function. For a more detailed understanding of deep neural networks, see my blog post.
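To make the analogy concrete, here is a toy version of that loop: a single-weight “network” fit to labeled data by gradient descent on a squared-error loss. Real face detectors run the same loop, just with millions of weights and a detection-specific loss function:

```python
import numpy as np

def loss(w, xs, ys):
    """Mean squared error: how far the predictions w*x are from the labels."""
    return float(np.mean((w * xs - ys) ** 2))

def train(xs, ys, lr=0.1, steps=100):
    w = 0.0                                      # blank/random initialization
    for _ in range(steps):
        grad = np.mean(2 * (w * xs - ys) * xs)   # dLoss/dw
        w -= lr * grad                           # step downhill on the loss
    return w

xs = np.array([1.0, 2.0, 3.0])
ys = 2.0 * xs            # labeled data: the true relation is y = 2x
w = train(xs, ys)        # the "evaluation" is how low the loss gets
```

The loss function plays the role of the test score: training succeeds when the network’s answers on the labeled data drive the loss toward zero.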

Often, the training process for deep learning models is the most important part. Entire papers have been written about the huge improvement a new loss function provides. The RetinaFace algorithm is no different. Let’s examine the loss function used to train RetinaFace.
