
Deep learning: Saving rainforests with TensorFlow



Deforestation accounts for nearly 20% of all global carbon emissions and is responsible for trillions of dollars of economic loss. With 80% of all Amazon timber being illegally logged, a solution for detecting and deterring it is desperately needed. However, there are a number of obstacles. Given the vast and dense nature of rainforests, rangers simply don't have the resources or manpower to physically monitor thousands of acres.

A not-for-profit called Rainforest Connection, led by Topher White, has devised an effective and resourceful solution. Old mobile phones are collected, equipped with solar panels and placed in the branches of trees, where they listen for the sounds of logging trucks and chainsaws. One mobile phone can detect illegal logging over a kilometer away, protecting over 300 hectares of rainforest and preventing 15,000 tons of CO2 release, more than 3,000 cars emit in a year.


The detection algorithm behind this is driven by TensorFlow, a deep learning framework developed by the Google Brain Team. I used TensorFlow to develop a model that is at least 93% accurate at detecting distant chainsaw noises in the forest.

Step 1: Get data

Due to the lack of rainforests and illegal loggers in my immediate vicinity, I used the scaper library to simulate a rainforest soundscape. The code overlays various chainsaw noises (from YouTube) on top of a variety of rainforest sounds (also from YouTube), including thunderstorms, animals and flowing water. To give a wide representation of possible chainsaw noises, the samples are randomly varied in volume and pitch.
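A minimal sketch of this generation step with scaper might look like the following. The directory layout (fg/chainsaw/, bg/rainforest/), soundscape length and randomisation ranges are my assumptions, not the article's:

```python
import scaper

# 10-second soundscapes; chainsaw clips under fg/chainsaw/,
# rainforest recordings under bg/rainforest/ (assumed layout)
sc = scaper.Scaper(duration=10.0, fg_path="fg", bg_path="bg")
sc.ref_db = -20

# A constant rainforest bed, picking a random background file each time
sc.add_background(label=("const", "rainforest"),
                  source_file=("choose", []),
                  source_time=("const", 0))

# Overlay one chainsaw event with randomised timing, loudness and pitch
sc.add_event(label=("const", "chainsaw"),
             source_file=("choose", []),
             source_time=("const", 0),
             event_time=("uniform", 0, 8),
             event_duration=("const", 2),
             snr=("uniform", 5, 25),          # random volume vs. background
             pitch_shift=("uniform", -2, 2),  # random pitch, in semitones
             time_stretch=None)

sc.generate("soundscape.wav", "soundscape.jams")
```

Leaving out the add_event call produces the chainsaw-free negative examples.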

Here are two example soundscapes, the first without a chainsaw and the second with. Can you hear it?

Step 2: Make images

While there have been great advances in visual deep learning, auditory deep learning remains primitive. A spectrogram is a visual representation of a sound, where the occurrences of each frequency are plotted against time. Convolutional Neural Networks (CNNs) are pretty good at recognising signals in images, like chainsaw noises in spectrograms. To leverage the power of CNNs, we have to convert the raw audio data into an image format using the librosa Python library.
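A minimal sketch of that conversion with librosa, assuming the WAV soundscapes generated above; the mel scale, rendering details and function name are assumptions:

```python
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

def soundscape_to_spectrogram(wav_path, png_path):
    # Load the soundscape at its native sample rate
    y, sr = librosa.load(wav_path, sr=None)
    # Mel spectrogram: energy per frequency band over time
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    S_db = librosa.power_to_db(S, ref=np.max)  # log scale, like the ear
    # Render with no axes or margins so the CNN sees only spectrogram pixels
    fig = plt.figure(frameon=False)
    ax = plt.Axes(fig, [0.0, 0.0, 1.0, 1.0])
    ax.set_axis_off()
    fig.add_axes(ax)
    librosa.display.specshow(S_db, sr=sr, ax=ax)
    fig.savefig(png_path)
    plt.close(fig)
```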

This is an example spectrogram and its associated soundscape. See if you can hear some of the features:

[Figure: the spectrogram for the soundscape below]

Originally the images were 600x400, but this was way too much for my GPU to handle. Scaling the images down to 100x100 produced faster and more accurate results. There is a 50/50 split between soundscapes with chainsaws and soundscapes without.
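The downscaling can be done with Pillow (a sketch; the article's exact resizing code is not shown):

```python
from PIL import Image

# Downscale a rendered spectrogram to 100x100 grayscale
img = Image.open("soundscape.png").convert("L").resize((100, 100))
img.save("soundscape_small.png")
```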

Step 3: Train classifier

This snippet breaks the sample images into train, test and evaluation sets. There are 2,000 images in total: 100 for evaluation, and the rest split 80/20 into train/test.
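A minimal sketch of such a split with scikit-learn, assuming the spectrograms and labels are already loaded as NumPy arrays X and y (the variable names and random seed are assumptions):

```python
from sklearn.model_selection import train_test_split

# X: (2000, 100, 100, 1) spectrogram images, y: (2000,) binary labels.
# Hold out 100 images for final evaluation, then split the remaining
# 1900 into 80% train / 20% test.
X_rest, X_eval, y_rest, y_eval = train_test_split(
    X, y, test_size=100, stratify=y, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, y_rest, test_size=0.2, stratify=y_rest, random_state=42)
```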

Then we train the model: two convolutional layers feeding a single Dense output.
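A minimal sketch of such a network in Keras; the filter counts, kernel sizes, pooling, optimiser and batch size are assumptions, since the article only specifies two convolutional layers and a single Dense output:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(100, 100, 1)),            # 100x100 grayscale spectrograms
    layers.Conv2D(32, (3, 3), activation="relu"), # first convolutional block
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"), # second convolutional block
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),        # chainsaw / no chainsaw
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# 200 epochs, as in the article; the batch size is an assumption
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=200, batch_size=32)
```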

[Animation: 2D convolution visualisation]

Here's the network summary:

[Image: Keras model.summary() output]

The network was trained for about 2 minutes over 200 epochs on a GTX 1660 Ti GPU. The results were fantastic: close to 97% validation accuracy and 0.057 loss, with a pretty healthy-looking curve.

[Training curves: accuracy and loss over 200 epochs]

On 100 evaluation images the network has never seen before, the model scored 93%!
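That final number would come from an evaluation call like the following (a sketch, continuing the variable names assumed earlier):

```python
# Final check on the 100 held-out images the network has never seen
eval_loss, eval_acc = model.evaluate(X_eval, y_eval)
print(f"Evaluation accuracy: {eval_acc:.1%}")
```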

