
Some Things I Wish I Had Known Before Scaling Machine Learning Solutions: Part I

source link: https://www.tuicool.com/articles/ziyM7ni

Recently, I’ve been touring different conferences presenting a talk about best practices for implementing large-scale machine learning solutions. The idea is to present a series of non-obvious ideas that turn out to be incredibly practical when implementing machine intelligence applications in the real world. All the lessons are based on our experiences at Invector Labs working with large organizations and ambitious startups to implement machine learning capabilities. During those engagements, we quickly realized that many of our assumptions about machine learning apps were deeply flawed and that there was a huge gap between the advances in AI research and the practical viability of those ideas. In this two-part article, I would like to summarize some of those ideas, which will hopefully prove valuable to machine learning practitioners and aspiring data scientists.

There are many challenges that surface in the implementation of real-world machine learning solutions. Most of them are related to the mismatch between the lifecycle of machine learning programs and that of traditional software applications. With some exceptions, traditional software applications follow a relatively sequential model from design to production. Machine learning models, on the other hand, follow a circular lifecycle that includes aspects such as regularization or optimization that have no equivalent in the current toolset of traditional software applications.


Each of the stages in the lifecycle of machine learning solutions introduces unique sets of challenges that have no equivalent in the traditional software world. Some of those challenges are non-trivial or even paradoxical and can be encountered in different shapes or forms. Some of the key areas of challenge are summarized in the following figure:


The good news is that most of those challenges are solvable with the current generation of machine learning frameworks and tools. However, some of the solutions are far from obvious. Let’s look at some of the key challenges and solutions across the lifecycle of machine learning programs.

15 Lessons About Scaling Machine Learning Solutions

Strategy & Processes

Planning and strategizing is a key element in the adoption of machine learning best practices, specifically in large organizations. During the strategizing phase, there are a few challenges that become very visible:

Challenge: Data Scientists Make Horrible Engineers

No offense intended to the data science community :wink: but most data scientists don’t tend to think about engineering concerns such as code readability, testing or deployment. As a result, many of the models created by data scientists need to be heavily refactored in order to be operationalized.

The most successful organizations I’ve seen address the data science code quality challenge by allocating a dedicated team to operationalize models. That team is often referred to as data engineering, and its responsibility is to refactor, and sometimes even rewrite, data science models to make them production ready.

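As a hypothetical illustration of the kind of refactoring such a team does, consider a one-off notebook expression like `df["ctr"] = df["clicks"] / df["views"]` being turned into a documented, validated function that can be unit tested and deployed (the function name and validation rules here are my own, not from the article):

```python
def click_through_ratio(clicks: float, views: float) -> float:
    """Return clicks/views, guarding against division by zero.

    Negative inputs raise immediately so bad data fails loudly
    instead of silently propagating into a model.
    """
    if clicks < 0 or views < 0:
        raise ValueError("counts must be non-negative")
    if views == 0:
        return 0.0
    return clicks / views


# The refactored function is trivially testable, unlike the
# original notebook one-liner:
assert click_through_ratio(5, 10) == 0.5
assert click_through_ratio(3, 0) == 0.0
```

The logic is unchanged; what the data engineering team adds is the validation, documentation, and testability that make the code safe to operate.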

Challenge: Neither Agile nor Waterfall Processes Work for Machine Learning

Agile and waterfall methodologies are the two biggest schools of thought when it comes to software development. When applied to machine learning applications, waterfall models fall short because most of the requirements are not known upfront and estimating the time needed to create a specific model is next to impossible. Similarly, agile methods fail because short iterations are often impractical for machine learning models.

Although I don’t claim to have a definitive answer on the right methodology for machine learning applications, an approach that has been relatively effective is to divide the development process into segments that can be approached using agile and waterfall methodologies respectively.


Data Engineering

Collecting and preparing datasets is one of the most frequently underestimated efforts in machine learning solutions. In this phase, there are several challenges that machine learning teams need to confront.

Challenge: Feature Extraction can Become a Reusability Nightmare

Feature extraction is one of the common aspects of the lifecycle of machine learning solutions. Conceptually, feature extraction focuses on identifying the key aspects of the data that can be used by machine learning models. While feature extraction is conceptually simple for a single model, the picture gets really complicated for organizations building dozens of machine learning models that share a common set of features.

One of the most effective techniques I’ve seen to address the feature reusability challenge is to build a centralized feature store that maintains a persistent representation of the features used by the different machine learning models. This is the approach followed by stacks such as Uber’s Michelangelo.

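To make the feature store idea concrete, here is a minimal in-memory sketch (my own illustration, far simpler than production systems like Michelangelo’s store): features are written once by the extraction pipeline and then read by any number of models, so the extraction logic is never duplicated.

```python
from datetime import datetime, timezone


class FeatureStore:
    """Minimal in-memory feature store: models read shared feature
    values keyed by (entity_id, feature_name)."""

    def __init__(self):
        # (entity_id, feature_name) -> (value, write timestamp)
        self._features = {}

    def put(self, entity_id: str, name: str, value) -> None:
        self._features[(entity_id, name)] = (
            value, datetime.now(timezone.utc))

    def get(self, entity_id: str, name: str):
        value, _ts = self._features[(entity_id, name)]
        return value

    def feature_vector(self, entity_id: str, names: list):
        # Several models can request the same features without
        # re-implementing the extraction logic.
        return [self.get(entity_id, n) for n in names]


store = FeatureStore()
store.put("user_42", "avg_session_minutes", 12.5)
store.put("user_42", "purchases_30d", 3)
print(store.feature_vector(
    "user_42", ["avg_session_minutes", "purchases_30d"]))  # [12.5, 3]
```

A real feature store adds persistence, versioning, and separate online/offline serving paths, but the reuse contract is the same: one write, many model consumers.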

Challenge: Labeled Datasets are Incredibly Hard to Produce

Supervised learning models dominate the machine learning ecosystem, and they typically require large volumes of labeled data. However, producing those datasets is incredibly difficult and resource intensive, and it is typically impractical for most organizations.

Automated data labeling is an effective way to deal with the data labeling nightmare. The principle is to create routines that can probabilistically assign labels to training datasets. Among the technology stacks on the market, project Snorkel is one that has been steadily gaining traction in this area.

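The core idea can be sketched with a few hand-written labeling functions for a toy spam classifier (the functions below are my own illustration; Snorkel itself learns a probabilistic model over the labeling functions’ agreements rather than the simple majority vote used here):

```python
from collections import Counter

ABSTAIN = None  # a labeling function may decline to vote


def lf_contains_link(text):
    return "spam" if "http://" in text else ABSTAIN


def lf_all_caps(text):
    return "spam" if text.isupper() else ABSTAIN


def lf_greeting(text):
    return "ham" if text.lower().startswith("hi ") else ABSTAIN


def weak_label(text, labeling_functions):
    """Assign a weak label by majority vote over non-abstaining votes."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


lfs = [lf_contains_link, lf_all_caps, lf_greeting]
print(weak_label("win big at http://prizes.example", lfs))  # spam
print(weak_label("hi team, notes attached", lfs))           # ham
```

Each heuristic is cheap to write and individually noisy; aggregating many of them yields probabilistic labels for datasets far larger than anyone could annotate by hand.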

Model Experimentation

Experimentation is the cornerstone of any machine learning development lifecycle. The ability to play with and test different models and architectures often represents the difference between success and failure in the machine learning world. However, experimentation also introduces its own set of challenges into the machine learning lifecycle.

Challenge: The Single Framework Fallacy

Large enterprises cherish the idea of technology consolidation and like to concentrate their efforts on a small number of machine learning tools and frameworks. However, frameworks that are good for experimentation often fall short for production workloads, and vice versa. As a result, it is very common for organizations to leverage different machine learning stacks for the experimentation and operationalization stages respectively, which introduces certain levels of technical debt and fragmentation.

When it comes to machine learning, optimizing for productivity is a better strategy than optimizing for consistency. As a result, accepting a world in which companies use different machine learning frameworks should be the standard. An approach that we’ve seen be effective in this area is to use an intermediate representation to port models across the different frameworks. ONNX is one of the most robust frameworks for facilitating that.

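To illustrate the intermediate-representation idea at a conceptual level (this is a deliberately simplified sketch of my own, not the ONNX format: real ONNX serializes a full computation graph with typed operators, not just parameters), a model trained in one framework can be exported to a framework-neutral file that a different runtime then loads and executes:

```python
import json


def export_linear_model(weights, bias, path):
    """Write a trained linear model to a framework-neutral JSON spec."""
    with open(path, "w") as f:
        json.dump({"op": "linear", "weights": weights, "bias": bias}, f)


def load_and_predict(path, features):
    """A second 'runtime' that knows only the neutral spec, not the
    framework that trained the model."""
    with open(path) as f:
        spec = json.load(f)
    if spec["op"] != "linear":
        raise ValueError(f"unsupported op: {spec['op']}")
    return sum(w * x for w, x in zip(spec["weights"], features)) + spec["bias"]


export_linear_model([0.5, -1.0], 2.0, "model.json")
print(load_and_predict("model.json", [4.0, 1.0]))  # 0.5*4.0 - 1.0*1.0 + 2.0 = 3.0
```

Because producer and consumer agree only on the neutral spec, the experimentation stack and the production stack can evolve independently, which is exactly the decoupling ONNX provides between real frameworks.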

In the second part of this article, we will continue with more challenges and solutions for machine learning in the real world.

