
Serverless Model Serving: OpenWhisk, Apache Spark and MLeap


This post is a summary of talks I gave at Open Source Summit 2017, Big Mountain Data Fall 2017, and Scale By the Bay 2017. It covers how to serve Apache Spark MLlib models as a resource-efficient bundle, using MLeap for serialization and Apache OpenWhisk for HTTP handling and horizontal scalability. The end product is a detailed how-to on using OpenWhisk with "Docker Actions," a how-to on MLeap, and very little in the way of coverage of MLlib itself.


An illustration of Serverless Machine Learning. Events hit the API Gateway and are routed to a corresponding model.

The Motivation

I've been using Apache Spark for over four years, and while it has many advantages, its warts are well documented. One limitation is that Spark doesn't work well on smaller data. This is partially by design: Spark is built to tackle large computation jobs at scale, and there are many perils to adopting it prematurely. There are a lot of opinions on the quality and breadth of the models offered by Spark MLlib, but this post is not about its virtues; it's about how to serve those models. What if you want to use a model you trained on 700 TB of data to answer a handful of HTTP requests per second?


An example of what I imagined.

A few years ago I wrote a blog post about Livy, which is somewhat designed to solve this problem. I could, for example, create a generic Spark job that takes parameters over HTTP and returns a response. However, there are a few limitations to this approach: there is overhead in submitting Spark jobs, and a Spark cluster has finite capacity. Those resources are better spent crunching numbers or fitting models than calculating values for one-off HTTP requests.

I went searching for an alternative way to solve this problem and eventually came across MLeap. MLeap is a project that, among other things, takes models fit in MLlib and serializes them as a zip file, JSON, or binary for reuse from a Scala or Python API. So instead of running Spark jobs to utilize the 700 TB model, you can write a Play server in Scala, or a Flask (or Django, I guess) one in Python. Once I discovered this, the gears started to turn, and I wondered if I could take it a step further and deploy an MLeap model as a serverless function.

Around this same time, I started using OpenWhisk for a project, and it clicked that I could (probably) run my MLeap model in a custom container on OpenWhisk. This blog post serves as a how-to for serving an MLeap model as a serverless function on OpenWhisk, from creating the custom container to deploying it on IBM Cloud Functions. Toward the end, I get into the limitations of this approach and offer some alternatives.


Operating the Airbnb model at scale.

OpenWhisk

OpenWhisk is an open source, event-driven, functions-as-a-service (FaaS) platform. You can think of it as AWS Lambda, but open source. On OpenWhisk, you can write functions natively in JavaScript, Python, Java, PHP, and Swift, and support other languages by building your own Docker container (we use Scala in this post). The architecture of OpenWhisk is quite complicated and contains a myriad of moving pieces. My talks covered quite a bit of this; however, the internals of OpenWhisk are outside the scope of this post and are covered in detail in this post, and also this podcast.

Aside from treating container-based and native function implementations the same way, there isn't a lot that's special about the platform compared to some of the proprietary or vendor-backed options. One exciting thing about OpenWhisk is that IBM Cloud's FaaS offering runs on it, so you can develop Whisk actions locally and then deploy them to the cloud when ready. Using IBM Cloud isn't for everyone, but that's what we'll do for this post. I don't work for IBM (just as a disclaimer).

Scala OpenWhisk Docker Container

Creating a Scala container from the OpenWhisk Docker skeleton image is easier than I expected. You can take a detailed look at what's needed in the Dockerfile below, but I'll explain it here as well. I use the Docker skeleton for OpenWhisk, which is basically Alpine and Python with a Flask server running. From there I use APK to install Java 8, and then manually install Scala. I also copy the files for the OpenWhisk action over to the container.
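The original Dockerfile didn't survive in this version of the post, but a minimal sketch of the steps just described might look like this (the Scala version, package names, and file paths are assumptions, not the exact file from the talk):

```dockerfile
# Start from the OpenWhisk Docker skeleton (Alpine + Python with a Flask action proxy)
FROM openwhisk/dockerskeleton

# Install Java 8 with APK
RUN apk add --no-cache openjdk8-jre

# Manually install Scala (version is an assumption)
ENV SCALA_VERSION=2.11.12
RUN wget -q "https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.tgz" \
    && tar -xzf "scala-${SCALA_VERSION}.tgz" -C /usr/local \
    && rm "scala-${SCALA_VERSION}.tgz"
ENV PATH="/usr/local/scala-${SCALA_VERSION}/bin:${PATH}"

# Copy the assembled jar, the MLeap bundle, and the exec script into the action directory
COPY target/scala-2.11/whisk-mleap-assembly.jar /action/
COPY airbnb.model.lr.zip /action/
COPY exec /action/exec
RUN chmod +x /action/exec
```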

Code

I borrowed an MLeap Bundle from the documentation. It's a model that estimates what you should charge for your Airbnb rental. It takes parameters about the home (number of bedrooms, location, number of bathrooms, square feet) and calculates an appropriate price per night. My goal is to serve that model on OpenWhisk. To get this to work, I needed a few things:

  1. A way to parse and create JSON

  2. A way to use the MLeap Bundle (a zip file in this case)

  3. A way to communicate with the OpenWhisk process

I tinkered with a few dependencies, but this is the SBT file I needed to accomplish this:
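The SBT file itself isn't reproduced here, but a sketch of the dependencies involved might look like the following (the versions and the choice of JSON library are assumptions):

```scala
// build.sbt -- a sketch, not the exact file from the talk
name := "whisk-mleap"
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // (2) Load and execute MLeap Bundles (the zip file) outside of Spark
  "ml.combust.mleap" %% "mleap-runtime" % "0.9.0",
  // (1) Parse and create JSON for the request and response payloads
  "io.spray" %% "spray-json" % "1.3.3"
)

// (3) Communication with the OpenWhisk process happens over standard out,
// so no extra dependency is needed for that part.
```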

The Airbnb example expected a JSON array with inputs that looked roughly like this:

["NY", 6.0, 1250.0, 3.0, 50.0, 30.0, 2.0, 56.0, 90.0, "Entire home/apt", "1.0", "strict", "1.0"]

For simplicity, I designed the data to come through from OpenWhisk to my Scala function to look like this:

{"data": ["NY", 6.0, 1250.0, 3.0, 50.0, 30.0, 2.0, 56.0, 90.0, "Entire home/apt", "1.0", "strict", "1.0"]}

MLeap expects the sequence I send it to be typed (more like an Any with implied types), so I have to do this bit of madness to it:
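The snippet that does this isn't embedded in this version of the post, but the idea is to pattern-match each JSON value into a concrete type. A sketch using spray-json (the function name is an assumption):

```scala
import spray.json._

// Convert a heterogeneous JSON array like
//   ["NY", 6.0, 1250.0, ...]
// into the typed Seq[Any] that MLeap's Row constructor expects.
def typify(values: Vector[JsValue]): Seq[Any] =
  values.map {
    case JsString(s)  => s          // e.g. state or room type
    case JsNumber(n)  => n.toDouble // numeric features
    case JsBoolean(b) => b
    case other        => throw new IllegalArgumentException(s"Unexpected value: $other")
  }
```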

Once I've got everything typed, I can pass the data to MLeap, where I use the zip file to serve the model, take the resulting value, and return it as a dictionary.
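The scoring code isn't shown above either; based on MLeap's documented runtime API, it looks roughly like this (the bundle path, the schema fields, the output column name, and exact package paths vary by MLeap version and are assumptions here):

```scala
import ml.combust.bundle.BundleFile
import ml.combust.mleap.runtime.MleapSupport._
import ml.combust.mleap.runtime.frame.{DefaultLeapFrame, Row}
import ml.combust.mleap.core.types._
import resource._

object Predictor {
  // Load the serialized pipeline from the zip bundle baked into the container
  val bundle = (for (bf <- managed(BundleFile("jar:file:/action/airbnb.model.lr.zip"))) yield {
    bf.loadMleapBundle().get
  }).opt.get

  def predict(features: Seq[Any]): Double = {
    // The schema must match the columns the pipeline was trained on (assumed here)
    val schema = StructType(
      StructField("state", ScalarType.String),
      StructField("bathrooms", ScalarType.Double)
      // ... remaining fields elided
    ).get
    val frame = DefaultLeapFrame(schema, Seq(Row(features: _*)))
    // Run the pipeline and pull the predicted price out of the transformed frame
    val result = bundle.root.transform(frame).get
    result.select("price_prediction").get.dataset.head.getDouble(0)
  }
}
```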

Then I create a simple exec function that serves as an entry point and utilizes this object:
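That entry point also isn't shown in this version of the post. Conceptually, it reads the JSON argument passed in by OpenWhisk, scores it, and prints a JSON object to standard out; a sketch (the object names and the scoring helper are assumptions):

```scala
import spray.json._

object WhiskAction {
  // OpenWhisk's Docker skeleton passes the action arguments as a single JSON
  // string; whatever we print to stdout becomes the response.
  def main(args: Array[String]): Unit = {
    val params = args(0).parseJson.asJsObject
    val data = params.fields("data").asInstanceOf[JsArray].elements
    // typify/Predictor are the hypothetical helpers sketched earlier in the post
    val price = Predictor.predict(typify(data))
    println(JsObject("price" -> JsNumber(price)).compactPrint)
  }
}
```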

One clever but slightly unsatisfying part of these Docker Actions is their use of standard out. You'll notice I added my OpenWhisk jar to the classpath on the container with an exec file. This code is executed in the Scala REPL, and its standard out is channeled back to the caller. If there is an error when calling your script, OpenWhisk tells the caller only that there was an error, with no additional information. If it's successful, OpenWhisk expects the response (what's printed to standard out) to be a dictionary (or JSON object). When debugging the container, this came in handy and felt a bit like the console.log() flow familiar to JavaScript developers.


An Illustration of the OpenWhisk Container Running Scala

After the code is satisfactory, the next step is to build the container and push it to Docker Hub (one of the few registries IBM Cloud Functions can pull from). There is a script included in the repository that does this. Once the image is built, we can test it locally by sending an HTTP request to it:

docker run -d -p 8080:8080 jowanza/scalatest:latest

curl -H "Content-Type: application/json" -d '{"value":{"data":["NY", 6.0, 1250.0, 3.0, 50.0, 30.0, 2.0, 56.0, 90.0, "Entire home/apt", "1.0", "strict", "1.0"]}}' localhost:8080/run

Next, let's deploy this thing!


An example of using MLeap, Spark, and OpenWhisk together. Models are built using data in HBase and trained on Spark, then bundled with MLeap and run on OpenWhisk.

Deploying on IBM Cloud Functions

I set up a free Lite account on IBM Cloud for this demo. I've included some links on getting the CLI set up in the resources section (there is a lot of cloud authentication stuff that's not worth getting into here). Once my image is on Docker Hub, this is the only command I need to run to deploy it:

ibmcloud fn action create scalaexample --docker jowanza/scalaexample

Once it's up on IBM Cloud, you can hit the public URL or one with API-key authentication.
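You can also invoke the action directly from the CLI; for example (the parameter name mirrors the payload shown earlier and is an assumption):

```
# Invoke the deployed action and wait for the result
ibmcloud fn action invoke scalaexample --result \
  --param data '["NY", 6.0, 1250.0, 3.0, 50.0, 30.0, 2.0, 56.0, 90.0, "Entire home/apt", "1.0", "strict", "1.0"]'
```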


Limitations

  1. Container size. We're rolling at 600 MB! That's a lot, though not above the limits of the platform. Apparently, the majority of this is the Java/Scala runtime.

  2. Startup times. The first invocation can take a long time; I've had some as long as 800 ms. You can work around this with some clever hacks, but having that as a p99 latency might be a deal breaker.

  3. Platform limitations. My initial use case for this approach was small-scale model deployments, but what if you want to deploy at scale? I'm not 100% sure how well IBM's platform scales with larger containers. There's quite a bit on the internet about IBM Cloud customers scaling up their serverless functions, but not many specifics around using Docker like this. I link to an excellent paper about FaaS limitations below.

Future Directions

Since I started work on this project, other platforms (notably OpenFaaS, Knative, Fn, and Lambda Layers) have either been released publicly or gained traction in the field. Any of those platforms should be able to accommodate this workflow, and I've seen some exciting applications with Knative on the docket at conferences like KubeCon. Even with these more exciting options, I'm very impressed with the thoughtful design and ease of use of OpenWhisk and IBM Cloud Functions. An area of interest for me is workflow or function orchestration (think AWS Step Functions). IBM Cloud Functions has Composer, which allows you to compose functions into state machines. Some of my next blog posts will be about workflow management and orchestration related to this work, so stay tuned!

Resources

  1. OpenWhisk Starter Guide

  2. MLeap Documentation

  3. Repo With The Code Used Here

  4. Article about FaaS platform limitations

  5. IBM Cloud Functions Documentation
