
Energy-based models and the future of generative algorithms

source link: https://towardsdatascience.com/energy-based-models-and-the-future-of-generative-algorithms-3950e1103323


Background image by Ahmad Dirini

Editor’s note: The Towards Data Science podcast’s “Climbing the Data Science Ladder” series is hosted by Jeremie Harris. Jeremie helps run a data science mentorship startup called SharpestMinds. You can listen to the podcast below:

Machine learning in grad school and machine learning in industry are very different beasts. In industry, deployment and data collection become key, and the only thing that matters is whether you can deliver a product that real customers want, fast enough to meet internal deadlines. In grad school, there’s a different kind of pressure, focused on algorithm development and novelty. It’s often difficult to know which path you might be best suited for, but that’s why it can be so useful to speak with people who’ve done both — and bonus points if their academic research experience comes from one of the top universities in the world.

For today’s episode of the Towards Data Science podcast, I sat down with Will Grathwohl, a PhD student at the University of Toronto, student researcher at Google AI, and alum of MIT and OpenAI. Will has seen cutting-edge machine learning research in both industry and academic settings, and he has some great insights to share about the differences between the two environments. He’s also recently published an article on the fascinating topic of energy-based models, in which he and his co-authors propose a new way of thinking about generative models that achieves state-of-the-art performance on computer vision tasks.

Here were some of my favourite take-homes from our chat:

  • In industry, there’s a lot of emphasis placed on solving problems in the simplest and fastest way possible, using out-of-the-box algorithms rather than coming up with cutting-edge models. Deployment and data collection end up consuming more time than you might expect, and that’s something that can feel stifling — particularly if your role happens to be more oriented towards putting models into production than training them from scratch.
  • Academic research comes with its own pressures, however. Apart from pressure to publish, academic machine learning is getting incredibly competitive, with entry to PhD programs effectively restricted to people who already have published work in top journals and conferences. Getting into a PhD program takes dedicated focus, and it’s not for everyone, so it’s worth seriously asking yourself if you’re willing to make the 2+ year commitment it’s going to take to position yourself for them before settling on that path.
  • Two of the most common types of machine learning models are discriminative models, which return single numbers (like classifiers or regressors), and generative models, which produce new inputs (like image-generating GANs, or certain language models). Generative models are especially cool because they’re forced to learn enough about the data they’re trained on to produce new samples, rather than just classify them. Intuitively, that should make them really effective at classification and regression tasks down the road (you can certainly argue that the best way to learn about painting is to make new paintings, for example). But that hasn’t happened: over time, people have become so focused on generating realistic samples that they’ve been paying less attention to the original promise of generative models for discriminative applications.
  • Will’s latest research proposes a powerful new way of thinking about generative models, which enables them to be applied more effectively to discriminative tasks. His team showed state-of-the-art results, and we unpack the research in depth during our chat; a rough code sketch of the core idea follows this list.
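
To make the generative-versus-discriminative framing above concrete, here is a minimal PyTorch sketch of the kind of reinterpretation Will’s work explores: the same set of class logits can be read discriminatively (a softmax gives p(y | x)) or generatively (their LogSumExp defines an unnormalized log-density over inputs x). This is an illustrative sketch under my own assumptions, not code from the paper; ToyClassifier, class_probabilities, and energy are hypothetical names introduced here.

```python
# Illustrative sketch only (not the authors' code): reading a standard
# K-class classifier's logits two ways, assuming f(x) returns K logits.
# Discriminative view: p(y | x) = softmax(f(x)).
# Generative view:     p(x) ∝ exp(logsumexp_y f(x)[y]),
#                      i.e. E(x) = -logsumexp_y f(x)[y] is an energy for x.

import torch
import torch.nn as nn

class ToyClassifier(nn.Module):
    """Hypothetical stand-in; any network producing K logits would do."""
    def __init__(self, in_dim=784, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # shape: (batch, num_classes)

def class_probabilities(logits):
    # Usual classifier output: normalized probabilities over the K classes.
    return torch.softmax(logits, dim=-1)

def energy(logits):
    # Energy-based reading: lower energy means higher unnormalized density for x.
    return -torch.logsumexp(logits, dim=-1)

model = ToyClassifier()
x = torch.randn(4, 784)                    # toy batch standing in for images
logits = model(x)
print(class_probabilities(logits).shape)   # torch.Size([4, 10])
print(energy(logits).shape)                # torch.Size([4])
```

Actually training under the generative reading also requires samples from the model’s own density (the paper uses a gradient-based MCMC sampler, as I understand it); that machinery is omitted here, so the sketch only shows the two readings of the same logits.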

You can follow Will on Twitter here, and you can follow me on Twitter here.

