
How I created the Workout Movement Counting App using Dense Optical Flow and CNN

Don’t track the moves yourself, let the AI do it for you instead!

Source: Alora Griffiths, via Unsplash (CC0)

I like doing workouts and different types of training, like CrossFit, but when a session is too intense or too long, I often make mistakes while counting how many repetitions I do of each exercise. This happens either because I lose concentration on the counting task during training or because I subconsciously overestimate the number of moves performed. As a third-year Computer Science BSc student, I decided to tackle this problem in my course work and created a web app to count the number of moves performed during a workout. In this article I would like to share my approach. You can find the full code of the app in the GitHub repository.

The Algorithm

In order to perform movement counting, you have to know whether the body moves up or down on each frame. Usually, for this kind of task I would need some RNN architecture, because, obviously, you can't detect the direction of movement from a single frame. Is he moving up or down in this photo?

But I didn't have enough training data to build a robust RNN model, as I had to prepare and label the data myself. I also looked in the direction of PoseNet models, to get the coordinates of each body part on each frame.

This approach wasn't beneficial either, for several reasons:

  1. The model performed well when I tried it in the same environment it was trained on (the same room, the same camera angle, the same person), but to make a robust model for even one exercise I would still need a lot of training data.
  2. The FPS of PoseNet without a GPU was really low.
  3. On some frames the quality of the detected body parts was poor.

All in all, I played a lot with different models, and they all gave poor results in some way. This lasted until one day I learned about the Optical Flow algorithm and, in particular, its Dense Optical Flow implementation (the right part of the image below). In a nutshell, this algorithm tracks the movement of pixels across a number of consecutive frames.

The optical flow can either be estimated using mathematical models, which are implemented, for example, in the OpenCV library, or it can be predicted directly with Deep Learning, which gives far better results in complex video scenes. In my implementation I decided to stick to the Dense Optical Flow algorithm implemented in the opencv-python package.
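To give an idea of what this looks like in practice, here is a minimal sketch of dense optical flow color coding with opencv-python, following the standard Farneback example from the OpenCV documentation. The file names and parameter values are illustrative assumptions, not taken from my project.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("pushups.mp4")  # hypothetical input video
ret, first = cap.read()
prev_gray = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

# HSV image: hue encodes direction of motion, value encodes magnitude
hsv = np.zeros_like(first)
hsv[..., 1] = 255

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Farneback dense optical flow between two consecutive frames
    # (pyr_scale=0.5, levels=3, winsize=15, iterations=3, poly_n=5, poly_sigma=1.2)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # Convert the (dx, dy) flow vectors to polar coordinates
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2                              # direction -> hue
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)  # speed -> brightness
    color_coded = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    cv2.imwrite("flow_frame.png", color_coded)  # or feed it into the classifier
    prev_gray = gray

cap.release()
```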

Here is how one push-up can be color coded with Dense Optical Flow.

As you can see, Dense Optical Flow encodes movement down as green and movement up as purple. Therefore, knowing the color-coded representation of each frame, I could easily build a simple CNN to perform multiclass classification of the frames. I just stacked some Conv + Pooling layers in PyTorch, which resulted in the following simple architecture.
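A minimal sketch of what such a Conv + Pooling stack could look like in PyTorch is shown below; the exact layer sizes and the 64x64 input resolution are my assumptions for illustration.

```python
import torch
import torch.nn as nn

class FlowFrameClassifier(nn.Module):
    """Classifies a color-coded flow frame as moving down / no push-up / moving up."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),     # logits for the 3 classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Example: a batch of 8 color-coded flow frames resized to 64x64
logits = FlowFrameClassifier()(torch.randn(8, 3, 64, 64))
```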

To train this model, I downloaded and labeled, frame by frame, a few YouTube videos, and I also recorded some push-up videos myself. In the end, I had a training set of color-coded images consisting of 252 moving-down frames, 202 non-push-up frames and 206 moving-up frames. I also prepared a small validation set of 140 frames with different movements. After running a training loop for 10 epochs I got a pretty impressive LogLoss graph for my model.

LogLoss of Train vs Validation sets
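The article only shows the resulting curves, but a training loop along these lines would produce them; this is a hedged sketch where the optimizer, batch size and learning rate are my assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, train_ds, val_ds, epochs=10, lr=1e-3, device="cpu"):
    model.to(device)
    criterion = nn.CrossEntropyLoss()          # multiclass log loss
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_ds, batch_size=32)

    for epoch in range(epochs):
        model.train()
        train_loss = 0.0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            train_loss += loss.item() * images.size(0)

        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                val_loss += criterion(model(images), labels).item() * images.size(0)

        print(f"epoch {epoch + 1}: "
              f"train logloss {train_loss / len(train_ds):.4f}, "
              f"val logloss {val_loss / len(val_ds):.4f}")
```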

Obviously, it wasn't too hard for the model to tell these 3 classes apart, because that can easily be done just by looking at the color-coded images by eye.

What was more important is the fact that the trained model was able to classify frames not only for push-ups but for burpees, squats and pull-ups as well. In general, I guess this exact model can easily classify all high-amplitude movements that involve moving up and down.

However, to classify exercises like sit-ups, or some low-amplitude dumbbell moves, it would be better to collect a new training set and retrain the current model.

The App

To apply my model in real life, I created a small web app using Django, where I can create a new workout and try my model in the "battle" environment. Here is how it looks.

Main screen of the web app

In general, during my own workouts I noticed an error of around 2.5% for push-ups, squats and pull-ups. For burpees, the error was around 5%, because the exercise involves more than one up-down movement. Here is how the model counts push-ups during a workout.
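The article does not spell out the counting logic itself, so the snippet below is only one plausible way to turn the per-frame class predictions into a repetition count: count a rep each time a stable run of "down" frames is followed by a stable run of "up" frames. The class encoding, the `min_run` threshold and the `count_reps` helper are all hypothetical.

```python
from typing import Iterable

DOWN, IDLE, UP = 0, 1, 2  # assumed class labels: moving down / no movement / moving up

def count_reps(frame_classes: Iterable[int], min_run: int = 3) -> int:
    reps = 0
    run_class, run_length = IDLE, 0
    saw_down = False
    for cls in frame_classes:
        if cls == run_class:
            run_length += 1
        else:
            run_class, run_length = cls, 1
        # Only react once a class has been stable for a few frames,
        # which filters out single-frame misclassifications.
        if run_length == min_run:
            if run_class == DOWN:
                saw_down = True
            elif run_class == UP and saw_down:
                reps += 1
                saw_down = False
    return reps

# e.g. two push-ups: down, up, idle, down, up
print(count_reps([1]*5 + [0]*6 + [2]*6 + [1]*4 + [0]*6 + [2]*6))  # -> 2
```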

Conclusion

To conclude, this work was a great experience for me, as I had to do a lot of research and test different hypotheses for the problem of movement counting during a workout. My time tracker shows that I have spent around 75 hours on the development of this app so far, but who knows, maybe I will spend even more if I decide to continue the project and turn it into something bigger. Thank you for reading!

