
Serving Spark NLP via API: Spring and LightPipelines

source link: https://medium.com/spark-nlp/serving-spark-nlp-via-api-spring-and-lightpipelines-64d2e6413327

REST API for John Snow Labs’ Spark NLP

Welcome to a follow-up article in the “Serving Spark NLP via API” series, showcasing how to serve Spark NLP using Spring, Swagger, and Java.

Don’t forget to check out the other articles in this series.

Background

Spark NLP is a Natural Language Understanding library built on top of Apache Spark, leveraging Spark MLlib pipelines, that allows you to run NLP models at scale, including SOTA transformers. It is therefore the only production-ready NLP platform that lets you go from a simple PoC on one driver node to multiple nodes in a cluster, processing large amounts of data in a matter of minutes.

Before starting, if you want to know more about all the advantages of using Spark NLP (such as the ability to work at scale in air-gapped environments, for instance), we recommend taking a look at the following resources:

Motivation

Spark NLP is server-agnostic, which means it does not come with an integrated API server but offers plenty of options to serve NLP models using REST APIs.

There is a wide range of possibilities for adding a web server and serving Spark NLP pipelines over a REST API. In this article, we will use Spring and Java.

This article shows only the essential steps to implement this solution, with a deep focus on Spark NLP. You can check the implementation in detail in this GitHub repository.

Spring and Spark NLP Light Pipelines


Serving Spark NLP LightPipelines through Spring Boot

Spring offers a flexible and comprehensive set of libraries and tools, covering security, reactive cloud-based microservices for the web, and complex streaming data flows.

In this article, we will use Spring Boot to accelerate application development, and Swagger to simplify API development and visualize our exposed endpoints.

As we saw in previous articles, we can use Python-based frameworks such as FastAPI for serving Spark NLP pipelines. However, a drawback of Python-based frameworks is that they require serialization and deserialization to interact with Spark, adding overhead compared with Java-based frameworks. Since Spring is a Java-based framework, it can achieve faster response times than FastAPI.

Read more about the performance advantages of using LightPipelines in this article created by John Snow Labs Data Scientist Lead Veysel Kocaman.

Strengths

  • Lowest latency (even beats FastAPI)
  • Adds flexibility to build and adapt a custom API for your models
  • Easier scalability when containerized along with tools like Kubernetes or DC/OS

Weaknesses

  • LightPipelines are executed sequentially and don’t leverage the distributed computation that Spark Clusters provide.
  • As an alternative, you can use Spring with default pipelines and a custom load balancer to distribute the calls over your cluster nodes.

Prerequisites

To work with a Java-Spring project with Spark NLP, we will need:

  1. Java 8 or Java 11 (heads-up: Java 8 support will end by January 2023; check details here).

2. Because of transitive dependency issues, we will need to exclude some dependencies from the spark-sql artifact. Check all of them here; a sketch of the exclusion mechanism is shown below.
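
As a hedged illustration only (the exact artifact versions and the full exclusion list live in the repository’s pom.xml; the entries below are assumptions), a Maven exclusion looks like this:

<!-- Illustrative only: shows the Maven exclusion mechanism, not the actual list -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.12</artifactId>
    <version>3.2.1</version>
    <exclusions>
        <!-- Example: avoid clashing with the Jackson version Spring Boot manages -->
        <exclusion>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
        </exclusion>
    </exclusions>
</dependency>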

Containerizing

You can serve Spark NLP + Spring on Docker. To do that, we will need to follow the steps below:

  1. Dockerfile: To build a docker image for running a Spring Boot application, we will use the official OpenJDK Docker image, with these configurations:
  • Creating a spring user and a spring group to run the application.
  • Using a DEPENDENCY parameter pointing to a directory where we have unpacked our app fat JAR. This is possible thanks to the clean separation between dependencies and application resources that Spring Boot applies when building a fat JAR file.
FROM openjdk:11

RUN addgroup --system spring && adduser --system spring && adduser spring spring
USER spring:spring

ARG DEPENDENCY=target/dependency
COPY ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY ${DEPENDENCY}/META-INF /app/META-INF
COPY ${DEPENDENCY}/BOOT-INF/classes /app

ENTRYPOINT ["java","-cp","app:app/lib/*","jsl.LightNLPApplication"]

2. Docker Compose: a tool for defining and running multi-container Docker applications. Here, we use a docker-compose.yml file to grab the environment variables required by Spark NLP for Healthcare.

version: '3'

services:
  jsl-light-nlp:
    ports:
      - "8080:8080"
    container_name: jsl-light-nlp
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - SPARK_NLP_LICENSE=${SPARK_NLP_LICENSE}
    build: .
3. Dot-Env File (.env): a file used to put values into docker-compose.yml. Here we define the environment variables that Docker Compose will substitute in.

AWS_ACCESS_KEY_ID=MY_AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=MY_AWS_SECRET_ACCESS_KEY
SPARK_NLP_LICENSE=MY_SPARK_NLP_LICENSE

Implementation

To maximize performance and minimize latency, we are going to store two Spark NLP pipelines in memory, so that we load them only once (at server start) and simply reuse them every time we get an API inference request.

To do this, we have to call some method during bootstrapping to load the required Spark NLP models. So, in the main application class, we call CacheModels.getInstance, which loads all model data.
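
A minimal sketch of that bootstrapping step (the package and class names are taken from the Dockerfile’s entrypoint, jsl.LightNLPApplication; the repository’s actual class may differ):

package jsl;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class LightNLPApplication {

    public static void main(String[] args) {
        // Warm the model cache before the server starts accepting requests
        CacheModels.getInstance();
        SpringApplication.run(LightNLPApplication.class, args);
    }
}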

CacheModels is a singleton class that encapsulates all the logic required to store model data in memory.

CacheModels Class
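
In case the embedded gist does not render, here is a minimal sketch of such a singleton (the model names are illustrative assumptions; the actual list lives in the repository):

package jsl;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public final class CacheModels {

    private static CacheModels instance;
    private final Map<String, Model> models = new LinkedHashMap<>();

    private CacheModels() {
        // Illustrative model names; load whichever pretrained pipelines you serve
        register(new Model("ner_onto_100", "en"));
        register(new Model("clinical_deidentification", "en"));
    }

    public static synchronized CacheModels getInstance() {
        if (instance == null) {
            instance = new CacheModels();
        }
        return instance;
    }

    private void register(Model model) {
        models.put(model.getName(), model);
    }

    public Model getModel(String name) {
        return models.get(name);
    }

    public Set<String> getModelNames() {
        return models.keySet();
    }
}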

The Model class is responsible for loading model data and downloading the models to our servers. Thus, inference requests won’t need to download the model, reducing latency considerably.

Model Class
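
A minimal sketch of the Model class, assuming a Java-reachable PretrainedPipeline(name, language) constructor (check the Spark NLP Java API for the exact signature):

package jsl;

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline;

public class Model {

    private final String name;
    private final PretrainedPipeline pipeline;

    public Model(String name, String language) {
        this.name = name;
        // Downloads the pretrained pipeline once and keeps it in memory;
        // the two-argument constructor is an assumption -- verify against
        // the Spark NLP Java API for your version
        this.pipeline = new PretrainedPipeline(name, language);
    }

    public String getName() {
        return name;
    }

    public PretrainedPipeline getPipeline() {
        return pipeline;
    }
}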

Now, to expose an endpoint that will process inference requests, we will use a LightNLPController.

LightNLPController Class
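
A minimal sketch of the controller (the endpoint paths match the ones exercised later with Swagger and curl; the request DTO is an assumed simple POJO, not necessarily the repository’s):

package jsl;

import java.util.List;
import java.util.Map;
import java.util.Set;

import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/light")
public class LightNLPController {

    private final LightPipelineService lightPipelineService;

    public LightNLPController(LightPipelineService lightPipelineService) {
        this.lightPipelineService = lightPipelineService;
    }

    // GET /light/models: list the models cached at startup
    @GetMapping("/models")
    public Set<String> getModels() {
        return CacheModels.getInstance().getModelNames();
    }

    // POST /light/annotate: run inference with the requested model
    @PostMapping("/annotate")
    public Map<String, List<String>> annotate(@RequestBody AnnotateRequest request) {
        return lightPipelineService.annotate(request.getModelName(), request.getText());
    }
}

// Assumed request DTO: a simple POJO carrying the modelName and text fields
class AnnotateRequest {
    private String modelName;
    private String text;

    public String getModelName() { return modelName; }
    public void setModelName(String modelName) { this.modelName = modelName; }
    public String getText() { return text; }
    public void setText(String text) { this.text = text; }
}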

The controller calls the annotate method of LightPipelineService, which is responsible for making the calls to Spark NLP’s LightPipeline.

LightPipelineService Class
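
A minimal sketch of the service (annotateJava is the Spark NLP 4.0+ method discussed below; the surrounding class structure is an assumption):

package jsl;

import java.util.List;
import java.util.Map;

import org.springframework.stereotype.Service;

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline;

@Service
public class LightPipelineService {

    public Map<String, List<String>> annotate(String modelName, String text) {
        PretrainedPipeline pipeline = CacheModels.getInstance()
                .getModel(modelName)
                .getPipeline();
        // annotateJava runs the LightPipeline and returns Java-friendly
        // Map/List types instead of Scala collections
        return pipeline.annotateJava(text);
    }
}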

Under the hood, the annotateJava method of PretrainedPipeline, introduced in Spark NLP 4.0, uses the LightPipeline feature.

With this implementation, the models will be loaded in memory and ready as soon as our app is deployed on a server with a simple Docker command:

docker-compose up --build

Since our app uses Swagger, we can check that our endpoints are up by accessing the Swagger UI URL.


Available Endpoints

We can check the available models in the GET /light/models endpoint using Swagger or this curl command:

curl -X 'GET' 'http://localhost:8080/light/models' -H 'accept: */*'

We will get a JSON response with all available models:


HTTP Response for /light/models endpoint
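
In case the screenshot does not render, the response is a JSON list of model names, along these lines (names illustrative):

["ner_onto_100", "clinical_deidentification"]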

Now that we know the available models, let’s play around with them.

First, let’s send a request for the NER model. Swagger makes it easy to see that the /light/annotate endpoint expects a JSON request. So, we just fill the modelName field with the model we want to use, and write the content in the text field.


HTTP Request for /light/annotate endpoint
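
In case the screenshot does not render, the request body looks roughly like this (the model name and text are illustrative):

{
  "modelName": "ner_onto_100",
  "text": "John Snow Labs is based in Delaware."
}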

The JSON response will include all stages defined in the NLP pipeline, in this case: sentence, token, and ner_chunk.


HTTP Response for /light/annotate endpoint
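
In case the screenshot does not render, the response has roughly the following shape, one key per pipeline stage (the values are illustrative and depend on the model):

{
  "sentence": ["John Snow Labs is based in Delaware."],
  "token": ["John", "Snow", "Labs", "is", "based", "in", "Delaware", "."],
  "ner_chunk": ["John Snow Labs", "Delaware"]
}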

Finally, let’s make a request to the /light/annotate endpoint for the DeIdentification model and check the response.


HTTP Request for /light/annotate endpoint


HTTP Response for /light/annotate endpoint

As we can see, the JSON response includes all stages defined in this NLP clinical pipeline, which are:

sentence, token, ner_chunk, masked, obfuscated, masked_fixed_length_chars, masked_with_chars

Do you want to know more?

