Image Augmentation Mastering: 15+ Techniques and Useful Functions with Python Co...

How to use it?

At this point you may not yet see how simple the setup is, and yet it is. All we have to do is define a list of the transformations we want to apply to our sample, and that’s it. We do not touch anything else afterward. Note that the order of the transformations matters, and choosing it is up to you.
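
To make the idea concrete, here is a minimal sketch of such a pipeline. The `apply_transforms` helper and the two toy transforms are hypothetical names used only for illustration, not the original code; the point is simply that a list of callables is applied in order, and that the order changes the result.

```python
import numpy as np

def apply_transforms(sample, transforms):
    """Apply each transform to the sample, in list order."""
    for transform in transforms:
        sample = transform(sample)
    return sample

# Two toy transforms, purely for illustration.
def add_ten(img):
    return img + 10

def double(img):
    return img * 2

image = np.ones((2, 2), dtype=np.int32)

# Order matters: (1 + 10) * 2 = 22, but (1 * 2) + 10 = 12.
first = apply_transforms(image, [add_ten, double])
second = apply_transforms(image, [double, add_ten])
```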

We can now dive into the purpose of the article and see the image augmentation techniques.

Flip

The first technique, and one of the simplest, consists of randomly flipping the images along the horizontal and vertical axes. In other words, there is a 50/50 chance of performing a vertical flip and a 50/50 chance of performing a horizontal flip.
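
A minimal sketch of this, assuming the image is a NumPy array of shape (H, W) or (H, W, C). The function name `random_flip` and its probability parameters are my own illustration, not necessarily the original function.

```python
import numpy as np

def random_flip(image, rng, p_horizontal=0.5, p_vertical=0.5):
    """Flip an (H, W) or (H, W, C) array on each axis with the given probability."""
    if rng.random() < p_horizontal:
        image = image[:, ::-1]  # mirror left-right
    if rng.random() < p_vertical:
        image = image[::-1, :]  # mirror top-bottom
    return image

rng = np.random.default_rng(0)
image = np.arange(12).reshape(3, 4)
flipped = random_flip(image, rng)
```

If a target (a mask, for instance) goes with the input, the same two random decisions would have to be applied to it as well.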

Crop

To do image augmentation, it is common to crop the image randomly. In other words, we extract a part of the image of random size at a random location.

The size of the crop can be chosen as a ratio of the image dimensions (height, width). If the maximum proportional size of the crop is not specified, we take it by default to be the size of the image.
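
The description above can be sketched as follows. This is an illustration under my own assumptions: the name `random_crop`, the `min_ratio` default of 0.5, and the independent draw for each side are hypothetical choices. The `max_ratio` default of 1.0 reflects the "size of the image" default mentioned above.

```python
import numpy as np

def random_crop(image, rng, min_ratio=0.5, max_ratio=1.0):
    """Crop a random window whose sides are a random fraction of the image sides."""
    height, width = image.shape[:2]
    # Draw the crop size as a ratio of each dimension.
    crop_h = max(1, int(height * rng.uniform(min_ratio, max_ratio)))
    crop_w = max(1, int(width * rng.uniform(min_ratio, max_ratio)))
    # Draw the top-left corner so that the window stays inside the image.
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
image = np.zeros((100, 80, 3), dtype=np.uint8)
crop = random_crop(image, rng)
```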

Kernel filters

General Case

We are now going to get into something a little more enjoyable. Filters are great classics, but I think it is important to be able to easily create our own convolution filters. If you do not know how a filter works, I refer you to my article about Conv2d.

So I wanted to make a general function to be able to use our own filters.
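
As a rough sketch of what such a general function can look like, here is a NumPy-only version. The name `apply_kernel`, the edge padding, and the uint8 clipping are my own assumptions, not the original implementation. Like most image libraries, it does not flip the kernel, which makes no difference for symmetric kernels.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Filter an (H, W) or (H, W, C) uint8 image with an odd-sized 2D kernel."""
    kernel = np.asarray(kernel, dtype=np.float64)
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    # Pad only the two spatial axes, repeating the border pixels.
    pad = [(pad_h, pad_h), (pad_w, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    height, width = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + height, j:j + width]
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
image = np.full((5, 5), 100, dtype=np.uint8)
same = apply_kernel(image, identity)
```

In practice a library routine such as OpenCV's `cv2.filter2D` does this kind of filtering much faster; the loop above is only meant to show the mechanics.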

Sharpen

As far as filters are concerned, it is possible to go even further by choosing a filter upstream and applying it with a random weighting. For example, take the filter used to sharpen our image.

Value of the center from 0 to 65.
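
One possible way to build such a randomly weighted sharpening kernel is sketched below. The `strength` parametrization is my own assumption and is not the center-value parametrization of the figure above: the neighbors get a weight of `-strength` and the center is chosen so that the weights sum to 1, which keeps the overall brightness unchanged.

```python
import numpy as np

def random_sharpen_kernel(rng, max_strength=1.0):
    """Build a 3x3 sharpening kernel with a random strength; weights sum to 1."""
    strength = rng.uniform(0.0, max_strength)
    kernel = np.full((3, 3), -strength)
    kernel[1, 1] = 1.0 + 8.0 * strength
    return kernel

rng = np.random.default_rng(0)
kernel = random_sharpen_kernel(rng)
```

With a strength of 0 this is the identity kernel; larger strengths exaggerate the difference between a pixel and its neighbors. The kernel can then be applied with any 2D filtering routine.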

Blur

To finish with the filters, the most popular ones are used to randomly blur our image. There are many ways to blur an image; the best known are the average, median, Gaussian, and bilateral filters.

Average blur

Kernel size from 1 to 35

As its name indicates, the average filter replaces each pixel with the average of the values around it. This is done with a kernel whose size can be chosen to get more or less blur. To augment our images with an average filter, we just need to filter the input image with a kernel of random size.
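
A minimal sketch of a random average blur for a 2D grayscale array follows. The name `random_average_blur`, the edge padding, and the float output are my own assumptions; the default maximum size of 35 follows the range given in the caption above.

```python
import numpy as np

def random_average_blur(image, rng, max_size=35):
    """Blur a 2D grayscale image with a k x k mean filter, k drawn at random."""
    k = int(rng.integers(1, max_size + 1))
    before = (k - 1) // 2
    after = k - 1 - before
    padded = np.pad(image.astype(np.float64),
                    ((before, after), (before, after)), mode="edge")
    height, width = image.shape
    out = np.zeros((height, width))
    # Sum the k * k shifted copies of the image, then divide to get the mean.
    for i in range(k):
        for j in range(k):
            out += padded[i:i + height, j:j + width]
    return out / (k * k)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(np.uint8)
blurred = random_average_blur(image, rng)
```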

Gaussian blur

Kernel size from 1 to 35

Finally, the Gaussian blur works in the same way as the average blur, except that it does not use an average filter but a filter whose values follow a Gaussian curve centered on the middle of the kernel. Note that the kernel dimensions must be odd numbers.
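
To illustrate, a Gaussian kernel with a random odd size could be built like this. The name `random_gaussian_kernel` and the rule tying sigma to the kernel size are my own assumptions for the sketch, not the original code.

```python
import numpy as np

def random_gaussian_kernel(rng, max_size=35):
    """Return a normalized 2D Gaussian kernel with a random odd side length."""
    size = 2 * int(rng.integers(0, (max_size + 1) // 2)) + 1  # 1, 3, 5, ...
    sigma = size / 6.0  # assumption: the kernel spans about +/- 3 sigma
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)  # separable: the 2D Gaussian is a product of two 1D ones
    return kernel / kernel.sum()

rng = np.random.default_rng(0)
kernel = random_gaussian_kernel(rng)
```

Because the weights are normalized to sum to 1, filtering with this kernel blurs the image without changing its overall brightness.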

Perspective transformations

By far the most widely used image augmentation technique is perspective transformation: rotation, translation, shearing, and scaling. These transformations can be performed in 3D. Usually they are used only in 2D, which is a pity. Let’s take advantage of everything we have at our disposal, right?

Rotation

Translation

Shearing

Scaling

Combining Everything

I will not spend more time on the 3D transformations of a 2D image because I wrote a whole article about them, so I simply reuse the function we arrive at by the end of that article. I invite you to have a look at it if you want to know more about homogeneous coordinates and 3D transformation matrices.

What should be noted is that this function lets us randomly perform transformations according to the 4 proposed matrices. The order matters: here we apply the shearing, then the rotation, then the scale, and finally the translation. Note that the translation is expressed as a ratio of the image dimensions.
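
To show how the order translates into a matrix product, here is a simplified sketch restricted to 2D homogeneous coordinates (3x3 matrices), which is an assumption on my part: the function discussed above works with 3D transformations. The name `random_transform_matrix` and all parameter ranges are hypothetical.

```python
import numpy as np

def random_transform_matrix(rng, height, width, max_angle=30.0, max_shear=0.2,
                            scale_range=(0.8, 1.2), max_shift=0.1):
    """Compose random shear, rotation, scale and translation into one 3x3 matrix."""
    angle = np.deg2rad(rng.uniform(-max_angle, max_angle))
    shear = rng.uniform(-max_shear, max_shear)
    scale = rng.uniform(*scale_range)
    # The translation is drawn as a ratio of the image dimensions.
    tx = rng.uniform(-max_shift, max_shift) * width
    ty = rng.uniform(-max_shift, max_shift) * height

    shearing = np.array([[1.0, shear, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                         [np.sin(angle), np.cos(angle), 0.0],
                         [0.0, 0.0, 1.0]])
    scaling = np.diag([scale, scale, 1.0])
    translation = np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])
    # The rightmost matrix acts first: shear, rotate, scale, then translate.
    return translation @ scaling @ rotation @ shearing

rng = np.random.default_rng(0)
matrix = random_transform_matrix(rng, height=480, width=640)
```

A matrix like this can then be handed to a warping routine such as OpenCV's `cv2.warpPerspective` to produce the transformed image.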

Combining random rotation, translation, shearing, and scale

Cutout

Cutout replacement by 0, on the whole input and cropping the target at the same time

The cutout is pretty intuitive. It involves removing regions of the input image at random. It works in the same way as the cropping we talked about earlier, but instead of returning the regions concerned, we delete them. Once again, we can let the user provide a minimum and maximum size for the regions (as a ratio of the image dimensions) and a maximum number of regions, choose whether to cut the same regions from the target at the same time, perform the cutout per channel, and choose the replacement value for the deleted regions.

Cutout replacement by 1, channel size on input without cropping the target
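
A stripped-down sketch of the cutout is given below. It covers only the region count, the size ratios, and the replacement value; the per-channel and target options described above are left out to keep it short. The name `random_cutout` and the defaults are my own assumptions.

```python
import numpy as np

def random_cutout(image, rng, max_regions=3, min_ratio=0.1, max_ratio=0.3, fill=0):
    """Overwrite between 1 and max_regions random rectangles of the image with fill."""
    out = image.copy()
    height, width = image.shape[:2]
    for _ in range(int(rng.integers(1, max_regions + 1))):
        # Region size as a random ratio of the image dimensions.
        h = max(1, int(height * rng.uniform(min_ratio, max_ratio)))
        w = max(1, int(width * rng.uniform(min_ratio, max_ratio)))
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        out[top:top + h, left:left + w] = fill
    return out

rng = np.random.default_rng(0)
image = np.full((20, 30, 3), 200, dtype=np.uint8)
cut = random_cutout(image, rng)
```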

Color Spaces

Now we get to the part I find the most fun, and one that is very rarely taken into account. If we know the color spaces, we can take advantage of their properties to augment our images. To give a simple example, with the HSV color space we can have fun extracting the leaf thanks to its color and changing that color randomly as we wish. That is a very cool thing to do! It also shows the value of having our own image augmentation functions. Of course, this requires a little more creativity, so it is important to know our color spaces to make the most of them, particularly since they can be crucial in preprocessing for our (deep) machine learning models.
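
Here is one way the leaf example could be sketched, using only NumPy and the standard-library `colorsys` module. Everything in it is an assumption for illustration: the name `random_recolor`, the hue interval taken to mean "green", and the saturation threshold. The per-pixel conversion is slow and is only meant to show the principle.

```python
import colorsys
import numpy as np

# Element-wise wrappers around the scalar colorsys conversions (values in [0, 1]).
_to_hsv = np.vectorize(colorsys.rgb_to_hsv, otypes=[float, float, float])
_to_rgb = np.vectorize(colorsys.hsv_to_rgb, otypes=[float, float, float])

def random_recolor(image, rng, hue_min=0.20, hue_max=0.45, min_saturation=0.3):
    """Give one random hue to the pixels whose hue lies in [hue_min, hue_max]."""
    rgb = image.astype(np.float64) / 255.0
    h, s, v = _to_hsv(rgb[..., 0], rgb[..., 1], rgb[..., 2])
    # Select saturated pixels in the chosen hue band (greens, by default).
    mask = (h >= hue_min) & (h <= hue_max) & (s >= min_saturation)
    new_hue = np.where(mask, rng.uniform(0.0, 1.0), h)
    r, g, b = _to_rgb(new_hue, s, v)
    recolored = np.rint(np.stack([r, g, b], axis=-1) * 255.0).astype(np.uint8)
    out = image.copy()
    out[mask] = recolored[mask]  # pixels outside the mask are left untouched
    return out, mask

rng = np.random.default_rng(0)
image = np.array([[[0, 200, 0], [200, 0, 0], [128, 128, 128]]], dtype=np.uint8)
recolored, mask = random_recolor(image, rng)
```

Only the hue changes, so the saturation and brightness of the selected pixels are preserved, which is exactly the kind of property that makes HSV convenient here.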

Brightness

Brightness from -100 to 100

Let’s stay with colors a little longer. A great classic in image augmentation is to play with brightness. There are several ways to do so; the simplest is to add a random bias.
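
A minimal sketch of the random bias, assuming a uint8 image. The name `random_brightness` is hypothetical; the default range of -100 to 100 follows the caption above.

```python
import numpy as np

def random_brightness(image, rng, max_shift=100):
    """Add one random offset in [-max_shift, max_shift] to every pixel of a uint8 image."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    # Work in a wider integer type so the addition cannot wrap around.
    shifted = image.astype(np.int16) + shift
    return np.clip(shifted, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
image = np.full((4, 4), 128, dtype=np.uint8)
brighter_or_darker = random_brightness(image, rng)
```

The cast to a wider type matters: adding directly to a uint8 array would wrap around instead of saturating at 0 or 255.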

Contrasts

Contrasts from -100 to 100

In the same way, it is very simple to play with contrast. This can also be done randomly.
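
One simple way to do this, sketched below, is to stretch or squeeze the pixel values around the image mean by a random gain. This gain-based parametrization is my own assumption and is not the -100 to 100 scale of the caption above; the name `random_contrast` is hypothetical as well.

```python
import numpy as np

def random_contrast(image, rng, low=0.5, high=1.5):
    """Scale pixel deviations from the image mean by a random gain in [low, high]."""
    gain = rng.uniform(low, high)
    pixels = image.astype(np.float64)
    mean = pixels.mean()
    # A gain below 1 flattens the image toward its mean; above 1 it spreads it out.
    adjusted = (pixels - mean) * gain + mean
    return np.clip(np.rint(adjusted), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
image = np.array([[0, 255], [100, 200]], dtype=np.uint8)
adjusted = random_contrast(image, rng)
```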

Noise injection

The last fairly common image augmentation technique is noise injection. In reality, we only add a matrix of the same size as our input, whose elements follow a random distribution. Noise injection can be done from any random distribution. In practice, we mostly see two of them, uniform and Gaussian, but feel free to go further.
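
As a sketch covering both cases, assuming a uint8 image (the name `add_noise` and the `scale` parameter are my own illustration):

```python
import numpy as np

def add_noise(image, rng, kind="gaussian", scale=10.0):
    """Add uniform or Gaussian noise of the same shape as a uint8 image."""
    if kind == "gaussian":
        noise = rng.normal(0.0, scale, size=image.shape)     # scale = standard deviation
    elif kind == "uniform":
        noise = rng.uniform(-scale, scale, size=image.shape)  # scale = half-width
    else:
        raise ValueError(f"unknown noise kind: {kind}")
    noisy = image.astype(np.float64) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
image = np.full((8, 8, 3), 128, dtype=np.uint8)
noisy_gaussian = add_noise(image, rng, kind="gaussian")
noisy_uniform = add_noise(image, rng, kind="uniform")
```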

Uniform

Gaussian

Vignetting

Finally, here is a technique that is much less used but not useless. Some cameras produce a vignetting effect, and it is interesting to think about how we can augment our images by randomly imitating this phenomenon. We will also try to give flexibility to the user: we can set the minimum distance from which the effect can randomly start, set its intensity, and even decide whether the effect fades toward black or toward white.
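
These options can be sketched as follows. The linear falloff, the normalization of the distance so that the corners sit at 1, and the names `random_vignette`, `min_start` and `max_strength` are all my own assumptions rather than the original function.

```python
import numpy as np

def random_vignette(image, rng, min_start=0.2, max_strength=0.8, to_white=False):
    """Fade an (H, W) or (H, W, C) uint8 image toward black or white away from its center."""
    height, width = image.shape[:2]
    y, x = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # Distance from the center, normalized so that the corners are at 1.
    dist = np.hypot(y - cy, x - cx) / max(np.hypot(cy, cx), 1e-9)
    start = rng.uniform(min_start, 1.0)        # radius where the effect begins
    strength = rng.uniform(0.0, max_strength)  # blend weight reached at the corners
    ramp = np.clip((dist - start) / max(1.0 - start, 1e-9), 0.0, 1.0) * strength
    if image.ndim == 3:
        ramp = ramp[..., None]  # broadcast over the channel axis
    target = 255.0 if to_white else 0.0
    out = image * (1.0 - ramp) + target * ramp
    return np.rint(out).astype(np.uint8)

rng = np.random.default_rng(0)
image = np.full((9, 9), 128, dtype=np.uint8)
dark_corners = random_vignette(image, rng)
light_corners = random_vignette(image, rng, to_white=True)
```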

Lens distortion

And finally, the best for last. I am surprised it is not used more often, because it can mimic the distortion of a camera lens. It is like looking through a round glass: what we see is distorted because the lens (the glass) is curved. So if our images are taken with a camera that has a lens, why not simulate that distortion? This should be used by default for images. At least I think so.

I thus propose, in this last function, to randomly simulate lens distortion by playing on the radial coefficients k1, k2, k3 and on the tangential coefficients p1, p2. In this method, the order of the coefficients is as follows: k1, k2, p1, p2, k3. I invite you to have a look at the OpenCV documentation on this subject.
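
To show what these coefficients do, here is a NumPy-only sketch of the usual radial and tangential distortion model. It is not the original function: the names `lens_distort` and `random_lens_distort`, the made-up focal length, the nearest-neighbor sampling, and the coefficient ranges are all assumptions for illustration.

```python
import numpy as np

def lens_distort(image, k1=0.0, k2=0.0, p1=0.0, p2=0.0, k3=0.0):
    """Resample an image through a radial and tangential distortion model."""
    height, width = image.shape[:2]
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    focal = float(max(width, height))  # assumption: a made-up focal length in pixels
    yy, xx = np.mgrid[0:height, 0:width]
    # Normalized coordinates relative to the image center.
    x = (xx - cx) / focal
    y = (yy - cy) / focal
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_src = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_src = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    # Back to pixel coordinates; out-of-range lookups are clamped to the border.
    cols = np.clip(np.rint(x_src * focal + cx), 0, width - 1).astype(int)
    rows = np.clip(np.rint(y_src * focal + cy), 0, height - 1).astype(int)
    return image[rows, cols]  # nearest-neighbor lookup

def random_lens_distort(image, rng, max_k=0.3, max_p=0.01):
    """Draw random coefficients and apply the distortion."""
    k1, k2, k3 = rng.uniform(-max_k, max_k, size=3)
    p1, p2 = rng.uniform(-max_p, max_p, size=2)
    return lens_distort(image, k1, k2, p1, p2, k3)

rng = np.random.default_rng(0)
image = np.arange(7 * 9).reshape(7, 9)
distorted = random_lens_distort(image, rng)
```

With all coefficients at zero the image is returned unchanged, which is a handy sanity check before drawing random values.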
