Containerized AI for Anomaly Detection


In this post, we’ll take an AI neural network trained for anomaly detection and deploy it as a containerized REST API. In our use case, externally collected sensor data is streamed to the API for near real-time anomaly detection.

The details for creating and training the anomaly detection neural network model can be found in my previous post.

Here, we’ll develop a REST API using the Python Flask framework and then deploy it within a Docker container. We’ll use Kubernetes for exposing our API as a service and for orchestration of our containers. Finally, we will test it out and review our results.

Given the breadth of what we have to cover, and to keep this from turning into a TL;DR, I won’t delve too deeply into the underlying theory; I’ll assume the reader has some basic knowledge of the underlying technologies. I will provide links to more detailed information as we go, and you can find the source code for this post in my GitHub repo.

Create the REST API

Our first task is to create a Python REST API using the Flask framework. My workflow is to first build the Python-based API in the PyCharm IDE to confirm the API functions as expected; PyCharm also makes working in Python virtual environments a snap. I then hop over to Visual Studio Code to add the Docker-specific aspects of the project, because I prefer its Docker integration. However, you could accomplish everything in either tool.

First, we import our Python libraries and instantiate our Flask application. Using a Python function, we define our TensorFlow graph and use Keras to load our pre-trained neural network model, Cloud_model.h5. We also define our reconstruction error threshold for detecting an anomaly, limit = 0.275.

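In sketch form, this setup block looks something like the following. I’m assuming TensorFlow 1.x-era Keras here, and that the scaler from training was persisted with joblib; the scaler file name and variable names are placeholders.

import flask
import joblib  # assumption: the training-time scaler was saved with joblib
import numpy as np
import pandas as pd
import tensorflow as tf
from keras.models import load_model

app = flask.Flask(__name__)

# Load the pre-trained LSTM model and keep a handle to the default
# TensorFlow graph so predictions work across Flask's request threads
model = load_model('Cloud_model.h5')
graph = tf.get_default_graph()

# Scaler fitted during training (hypothetical file name)
scaler = joblib.load('scaler_data.pkl')

# Reconstruction error threshold for detecting an anomaly
limit = 0.275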

Next, we define our REST API’s routes and allowed methods. We’ll use a function to process the submit message. Within the function, we create a data dictionary to house our JSON response message and extract the CSV data from the inbound request.

We then pre-process the submitted data prior to running it through our AI model:

  • Convert the CSV file to a Pandas dataframe
  • Add an index and set our data type
  • Normalize our data using the neural network scaler information
  • Reshape the data into a 3D tensor; this format is required for submitting input data into an LSTM neural network

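A minimal sketch of the route and its pre-processing: the /submit route and data_file field match the curl commands we’ll use later, while the index column name is a placeholder.

@app.route('/submit', methods=['POST'])
def submit():
    # Data dictionary to house the JSON response message
    data = {'success': False}

    # Extract the CSV data from the inbound request
    csv_file = flask.request.files['data_file']

    # Convert the CSV file to a Pandas dataframe
    df = pd.read_csv(csv_file)

    # Add an index and set the data type ('timestamp' is a placeholder name)
    df = df.set_index('timestamp').astype('float64')

    # Normalize the data using the scaler saved when the network was trained
    scaled = scaler.transform(df)

    # Reshape into a 3D tensor [samples, timesteps, features], the input
    # format an LSTM network requires
    tensor = scaled.reshape(scaled.shape[0], 1, scaled.shape[1])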

Now that we’ve extracted and pre-processed the submitted sensor data, we run it through the trained neural network model. We use the prediction results from the AI model to determine the data’s mean absolute error for each sensor reading. If a given sensor reading’s error exceeds our reconstruction error threshold, then an anomaly has occurred. We log any anomalies in our data dictionary and send the results back to the submitter in JSON format.

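Continuing the sketch, inference and the JSON response might look like this, assuming the autoencoder reconstructs its input:

    # Run the pre-processed data through the trained model, using the stored
    # TensorFlow graph so prediction works from Flask's worker threads
    with graph.as_default():
        preds = model.predict(tensor)

    # Mean absolute reconstruction error for each sensor reading
    mae = np.mean(np.abs(preds - tensor), axis=(1, 2))

    # Log any reading whose error exceeds the threshold as an anomaly
    data['anomalies'] = [
        {'reading': int(i), 'loss_mae': float(err)}
        for i, err in enumerate(mae) if err > limit
    ]
    data['success'] = True

    # Send the results back to the submitter in JSON format
    return flask.jsonify(data)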

Finally, we create the execution commands for the Python program.

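In sketch form:

if __name__ == '__main__':
    # Bind to 0.0.0.0 so the API is reachable from outside the container
    app.run(host='0.0.0.0', port=5000)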

Note that we added host='0.0.0.0' to the run command. By default, Flask serves only on localhost, which poses a problem when running inside a Docker container. With this simple tweak, the Flask service listens on both localhost and the container’s external IP. That was the hard part of our task; the rest is easy!

Docker Container

For this post, I will be running a local instance of Docker and Kubernetes on my workstation. Here is an excellent article on how to set those up locally on a Mac or Windows box.

If you prefer to use Google Cloud for your Docker and Kubernetes instances, a few suggestions: first, create your Docker instance using Google Compute Engine, then use Google Kubernetes Engine for your Kubernetes cluster. While you could do everything on your Compute Engine instance, the Kubernetes Engine service handles all of the VM work and the Kubernetes install for you, which is nice. Second, you’ll want to increase the default disk size: each of our Docker containers will be around 1.95 GB, and the default Kubernetes setting will spin up 3 containers.

To create our Docker container, we first create a “requirements.txt” file listing the software that needs to be loaded into the container; in this case, a bunch of Python libraries.

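A plausible requirements.txt simply lists the libraries used above (exact pinned versions omitted):

flask
tensorflow
keras
h5py
pandas
numpy
scikit-learn
joblib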

Next, we create a “Dockerfile” that instructs Docker how to build our container.


Here, we tell Docker to:

  • Start from a Python 3.6 base image
  • Set the working directory to /app
  • Copy 4 files (our requirements.txt file, our Python API application, our trained neural network model, and our data normalization scaling information) to /app
  • Install our Python libraries from the requirements.txt file
  • Finally, tell Docker how to start the app within our container, as sketched below
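
A minimal Dockerfile consistent with those steps; the API and scaler file names are placeholders:

FROM python:3.6
WORKDIR /app
# Copy the dependency list, the API code, the trained model, and the scaler
COPY requirements.txt anomaly_api.py Cloud_model.h5 scaler_data.pkl /app/
# Install the Python libraries
RUN pip install -r requirements.txt
# Document the port the Flask app listens on
EXPOSE 5000
# Start the app when the container launches
CMD ["python", "anomaly_api.py"]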

Now let’s build and run our Docker container; make sure Docker is running. Open a terminal instance and go to the directory where you’ve created your Docker files. Type the following command to build your Docker container (Note: the dot at the end matters!).

docker build -t anomaly-cloud:latest .

To run the container, type the command below. The “-p” flag tells Docker to publish port 5000, mapping the host’s port 5000 to the container’s port 5000 where the app listens.

docker run -d -p 5000:5000 anomaly-cloud

Now that our REST API is up and running within a Docker container, let’s validate that it works. Here are the sensor data files that we’ll use to test the API: day3_data & day4_data. Type the following command to submit sensor data to our API.

curl -X POST -F data_file=@day4_data.csv 'http://localhost:5000/submit'
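
Equivalently, you can exercise the API from Python with the requests library:

import requests

# Post the sensor CSV to the containerized API's /submit endpoint
with open('day4_data.csv', 'rb') as f:
    response = requests.post('http://localhost:5000/submit',
                             files={'data_file': f})
print(response.json())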

You should get back a boatload of anomalies in JSON format.

Now list your running Docker containers using docker container ls and note your container’s ID. Stop the Docker container using:

docker container stop your_container_id

Kubernetes Service

Now that we’ve got a REST API running within a single Docker container, let’s expose it as a managed, load-balanced service using Kubernetes. The approach we’ll use is to:

  • Upload / push our Docker container to the Docker Hub
  • Deploy the Docker container to our Kubernetes cluster by pulling it from Docker Hub
  • Create and expose the containerized API as a service

Upload Container to Docker Hub

First off, if you don’t already have one, create a free Docker Hub account. Once you’ve created your account, go back to your terminal and type docker login; you should receive a positive response.

We now “tag” our Docker container, basically giving it a repository name. Run docker images in your terminal and note the Image ID for the anomaly-cloud container, then tag the container by typing:

docker tag your_ImageID your_docker_hub_ID/anomaly-cloud

Now push the container to your Docker Hub account using:

docker push your_docker_hub_ID/anomaly-cloud

If you go to your Docker Hub account, you should now see a new repository for your AI API container named “your_docker_hub_ID/anomaly-cloud”.

Kubernetes Cluster

The Kubernetes bundled with Docker Desktop is a single-node cluster that provides Docker CLI integration. You can validate that Kubernetes is running on your desktop by typing kubectl cluster-info in your terminal; you will receive DNS and port information for your cluster. Next, let’s look at the cluster nodes using kubectl get nodes; note the single docker-for-desktop node.

Kubernetes Dashboard

We will now open the Kubernetes dashboard for managing our cluster. The dashboard is awesome, but unfortunately, you have to jump through a few hoops to access it the first time. Open a new terminal window, then copy and paste the following into your terminal and execute it.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

To access the dashboard from your local desktop, you’ll need to create a secure channel to your cluster. Type kubectl proxy in your terminal and the dashboard login screen should open in a browser window. If the window doesn’t open automatically, you can access it in your browser at the IP and port displayed.

To log into the Kubernetes dashboard, you’ll need a bearer token. The Kubernetes cluster comes with a number of default service accounts, each with its own set of permissions. In a separate terminal instance, type the following to see a list of the default service accounts.

kubectl -n kube-system get secret

Copy the service account name that starts with “admin-user” and then type:

kubectl -n kube-system describe secrets copied_service_account_name

Copy the provided bearer token and paste it into the dashboard login under the “Token” option, then hit the sign-in button. Store the token for the next time you log into the dashboard. Whew... that was painful!

Deploy Container to Kubernetes

Now we deploy our Docker container to our Kubernetes cluster by having Kubernetes pull it from our Docker Hub repository.

kubectl run anomaly-cloud --image=your_docker_hub_ID/anomaly-cloud --port 5000

In the dashboard, click on the Overview link. This provides a high-level overview of your cluster status. Your deployment is now up and running within a single pod.


You can click on the Pods or Replica Sets name links to get more detailed information on those items; the links are also available on the left-hand side of the dashboard menu under Workloads. Click your Replica Sets name to see details about your replica set; note that no services exist yet.

Expose Containerized API as a Service

Now let’s expose our REST API as a service. There are three Kubernetes service types to choose from:

  • ClusterIP: Exposes the service on the cluster’s internal IP; it is only reachable from within the cluster.
  • NodePort: Exposes the service on each node’s IP at a static port (the NodePort). The ClusterIP service to which the NodePort routes is created automatically.
  • LoadBalancer: Exposes the service externally using the provider’s load balancer. The NodePort and ClusterIP services to which the external load balancer routes are created automatically.

Here, we’ll use a LoadBalancer service.

kubectl expose deployment anomaly-cloud --type=LoadBalancer --port 80 --target-port 5000
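
For reference, the equivalent declarative manifest, assuming the run=anomaly-cloud label that kubectl run applied to our pods, could be saved as service.yaml and applied with kubectl apply -f service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: anomaly-cloud
spec:
  type: LoadBalancer
  selector:
    run: anomaly-cloud
  ports:
  - port: 80          # port the service exposes
    targetPort: 5000  # port the Flask app listens on inside the pod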

In your dashboard, go back to Replica Sets and click your Replica Sets name. Your replica set is now exposed as a service via port 80.


At this juncture, we are running our service on a single pod instance. We will now scale to 3 pod instances to enable true load balancing for our exposed service. Execute the command below in your terminal and then look at your dashboard: we now have 3 pod instances running our containerized REST API service.

kubectl scale --replicas=3 deployment/anomaly-cloud


After all of this hard work, let’s confirm that our anomaly detection neural network model is exposed as a containerized, Kubernetes-managed, RESTful service. Type the command below in your terminal, or use Postman, to see the API response when we submit sensor data that contains no anomalies.

curl -X POST -F data_file=@day3_data.csv 'http://localhost/submit'

Your JSON response should report that no anomalies were detected.

Now we test our AI REST API service when it receives sensor readings that contain anomalies.

curl -X POST -F data_file=@day4_data.csv 'http://localhost/submit'

It works! Your JSON response should be a long list of anomaly detections.


If you have made it this far, congratulations! You have now deployed a deep learning anomaly detection model as a containerized REST API within a Kubernetes-managed service.

