10 Must-read Machine Learning Articles (April — May, 2020)

May 29 · 6 min read

Image by author

A lot of notable machine learning work was published in April and May of 2020, along with several interesting datasets. In this article, we will go over some of the biggest AI news, research papers, and open datasets from some of the world’s largest tech companies, including Microsoft, Facebook, Google, and Uber.

Whether you consider yourself a data science beginner, intermediate, or expert, there is something to learn from the articles below.

Machine Learning News

1. Open-Sourcing BiT: Exploring Large-Scale Pre-training for Computer Vision

From the Google AI Blog, researchers introduce Big Transfer (BiT): General Visual Representation Learning, a new approach for pre-training on image datasets at scale.

Image via ai.googleblog.com

The model and datasets have been open sourced and there are download links within the article. Interestingly, the team states that pre-training on a larger dataset alone doesn’t always lead to greater model accuracy. To get the most benefit from pre-training on larger datasets, the model architecture must be scaled up and the computational budget increased, which includes longer training times.

2. Uber Winds Down its AI Labs: A Look at Some of Their Top Work

Many companies were negatively affected by the COVID-19 crisis, and Uber’s ride-sharing service took a particularly large hit. Due to the setbacks, the company has decided to close its AI Labs. It gave this official statement:

“Given the necessary cost cuts and the increased focus on core, we have decided to wind down the Incubator and AI Labs to pursue strategic alternatives for Uber Works.”

This article pays homage to the company’s AI Labs by going over some of the best work done by Uber’s AI team, including AI-generating Algorithms, POET (Paired Open-Ended Trailblazer), the PLATO Platform, and much more.

3. Objects Are the Secret Key to Revealing the World Between Vision and Language

Image via microsoft.com

From the Microsoft Research Blog, this article explains the great progress of vision-and-language pre-training (VLP) and its potential to train large models on image-text pair data. In the article, the research team also introduces Oscar (Object-Semantics Aligned Pre-training) to show how objects can be used as anchor points to draw semantic connections between images and text.

4. Introducing Holopix50k: A New Benchmark Dataset for Image Super-resolution

Image super-resolution has the potential to improve virtual reality, video surveillance, and numerous other technologies.

Image via leiainc.com

To provide researchers with high-quality training data for image super-resolution and depth estimation, Leia Inc. has recently released Holopix50k, the world’s largest “in-the-wild” dataset of stereo image pairs.

5. Amazon Scientists Author Popular Deep-learning Book

One of the largest tech companies in the world, Amazon has been at the forefront of applying machine learning for enterprise purposes. With Amazon SageMaker, they provided data scientists with a platform to build and deploy ML models at scale. Now, some of the greatest data scientists at Amazon have released an open source book titled Dive into Deep Learning. The book teaches the ideas, mathematics, and code behind deep learning. Furthermore, the authors plan to keep updating the book based on comments and feedback from users.

6. Facebook Releases Hateful Memes Dataset and Challenge

Image via ai.facebook.com

Facebook’s AI team has built and open sourced a new dataset to help researchers create models to detect hate speech. The “Hateful Memes” dataset has over 10,000 multimodal examples. Along with the dataset, they are launching a contest called The Hateful Memes Challenge, which is hosted by DrivenData and has a total prize pool of $100,000.

If you’re interested in hate speech detection, you should also check out this article highlighting Facebook AI’s progress in the field.

Machine Learning Guides and Feature Articles

7. How to Use Inaccurate Data for Machine Learning with WSL

For independent researchers and teams on a small budget, it is difficult to get access to large quantities of training data. One way around this issue is to use lower quality annotations that are easier to collect.

Image via lionbridge.ai

In this article, the writer explains ways to leverage weak annotations for machine learning through several Weakly Supervised Learning (WSL) techniques for image-based data.

8. The Story of How AI changed Google Maps

Google Maps has grown tremendously in the past two decades, with a continuously evolving user interface and tweaks to existing design and functionality. This article tells the story of that growth, specifically in terms of how Google has used machine learning to improve the platform.

The article also covers the acquisitions Google has made to get the Maps platform to where it is today, as well as future expansion plans for Google Maps.

9. How to Use Data Augmentation to 10x your Image Datasets

Written by the founder of AI Summer, this guide explains what data augmentation is and offers tips on how to use it to increase the size of your image datasets. From basic image manipulations to GAN-based data augmentation, this guide has interesting information for both beginners and intermediates in the field.
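To give a rough idea of what “basic image manipulations” can look like, here is a minimal sketch in plain Python. It is not taken from the guide itself: the function names are hypothetical, and the image is assumed to be a small grayscale grid stored as a list of rows. Real projects would typically use an image library instead.

```python
import random

# Assumption: an image is a list of rows, each row a list of pixel values.

def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def random_crop(image, crop_height, crop_width, rng):
    """Cut a randomly placed window of the given size out of the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_height)   # randint is inclusive on both ends
    left = rng.randint(0, width - crop_width)
    return [row[left:left + crop_width] for row in image[top:top + crop_height]]

def augment(image, rng):
    """Return three extra training variants of one image."""
    return [
        horizontal_flip(image),
        rotate_90_clockwise(image),
        random_crop(image, len(image) - 1, len(image[0]) - 1, rng),
    ]

image = [[1, 2, 3],
         [4, 5, 6]]
variants = augment(image, random.Random(0))
```

Each original image yields three additional variants here, so a dataset of N images grows to 4N. Whether a given transformation is safe depends on the task: flipping is usually harmless for photos of objects but would change the label of images containing text or digits.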

10. The Pandemic is Emptying Call Centers and AI Chatbots are Swooping In

Due to the quarantines and lockdowns imposed in many countries, a large number of companies were forced to close most of their call centers. However, for government organizations, the number of callers increased during the pandemic and most teams didn’t have the staff to handle the increased volume.

This article tells the story of how the IT director of Otsego County, New York implemented the IBM Watson chatbot to handle COVID-19 inquiries, despite a large reduction in staff.

COVID-19 Articles and Resources

A large number of articles have been released about COVID-19, so we didn’t include any COVID-related articles in the sections above. However, if you are looking for such articles or COVID-19 datasets, below are a few resources that may be of use to you.
