


Photo by Fatos Bytyqi on Unsplash

Making Deep Learning models ready for the worst-case scenario and cross-platform ready with OpenVINO toolkit.


Dec 31 · 8 min read

As 2020 arrives, the community of deep learning experts and enthusiasts is looking forward to a significant year of innovation in the field. With a mounting number of deep learning models being built every day around the world, humankind's dependence on the cloud and the network (especially TCP) is growing day by day. You might be wondering: what's wrong with cloud dependencies?

Worst-Case Scenarios:

Imagine you have a face-detection lock at your home that is poorly engineered: the developer deployed the model on the cloud, so the device has to call a cloud service for every inference. One day you face a remarkably bad network connection, and with no security-override method configured, you become the victim of your own security system.

Another real-world instance of such a scenario is the story of a renowned multispeciality hospital located in Bhubaneswar, Odisha, India. They had gone to great lengths to build a deep learning network, properly trained and tuned with domain expertise, but it was implemented in such a way that it had to stream the patient's heart rate to a web server over TCP every second in order to detect myocardial infarction. After a devastating cyclone hit coastal Odisha, the system was of no use because there was no cellular connectivity at all.

If proper steps are not taken when deploying deep learning models that must make critical decisions at any moment, the model can face the worst kind of ordeal. As deep learning makes rapid headway into critical decision-making operations, a system that is not designed with these edge cases in mind can run into identical buffeting circumstances. Immense problems can arise if a security-surveillance or health-care system fails all of a sudden.

To shield the models from these concerns, we need to deploy them in such a way that they can make real-time decisions without relying on any cloud service or the internet. This approach also proves to be more secure, because the deployed model is out of the reach of the internet, so workloads that require the highest level of security can be implemented directly on the device. Enthusiasts call these models Edge AI: the model is placed directly on the device and needs no network connection for inferencing. Let us now see how this is achieved.

Intermediate Representation:

The models we build and train using different frameworks such as TensorFlow, Caffe, PyTorch, ONNX, etc. can be substantially large, resource-hungry and architecture-dependent, i.e. constrained to a specific platform or to particular CPU/GPU kernels. To make these models able to serve inference from any device and from anywhere, we need to convert them into the Intermediate Representation (IR) format, which consists of the schema of the model in an .xml file and the weights and biases of the model in a .bin file.

Obtaining and transforming different models into IR format using OpenVINO toolkit:

The OpenVINO toolkit (Open Visual Inference and Neural network Optimization toolkit) is an open-source deep-learning toolkit, originally developed by the OpenCV team, that includes a model optimizer tool for converting deep learning models built in different frameworks into the IR format. During conversion, the model optimizer essentially works as a translator: it maps the frequently used deep learning operations (for TensorFlow, Conv2D, Conv3D, Dropout, Dense, BatchNormalization, etc.; for Caffe, convolution, dropout_layer, etc.) to their equivalent representations in the OpenVINO toolkit and tunes them with the associated weights and biases from the trained model.

The Intel Distribution of OpenVINO toolkit also ships with quite a large collection of pre-trained models, listed on its website, which you can deploy to different devices. These pre-trained models can be fetched directly with the model downloader tool, and they already come in the Intermediate Representation format at different precision levels. These precision levels are the precisions of the saved weights and biases of the model: FP32 (32-bit floating point), FP16 (16-bit floating point), INT16 (16-bit integer), INT8 (8-bit integer, available only for the pre-trained models) and more. Precision matters for ease of deployment onto different platforms: lower precision gives somewhat less accurate results, but the model needs far fewer resources to run, which makes full deployment onto edge devices possible without substantially hampering the performance of either the device or the model. Let us take a look at how we can use the model downloader to fetch pre-trained models from the Intel OpenVINO toolkit's website and how to use them to run inference on a given input.

The documentation page of the pre-trained model used below describes how to preprocess the inputs before feeding them into the model.

Provided that the OpenVINO toolkit is installed and properly configured on your local machine, let's jump right into downloading the above model. Head over to your OpenVINO installation directory and open a Terminal or Command Prompt with administrator privileges. Now, to download the model, issue the following command:

python C:/<OPENVINO_INSTALLATION_DIRECTORY>/openvino/deployment_tools/tools/model_downloader/downloader.py --name vehicle-attributes-recognition-barrier-0039 --progress_format=json --precisions FP16,INT8 -o \Users\<USER_ID>\Desktop

The above command uses the downloader.py Python program, which parses the following command-line arguments:

  1. --name: the name of the model to download (if --all is provided in place of --name, all the available pre-trained models are downloaded),
  2. --precisions: the precision levels to fetch (if nothing is provided, all available precision levels of the model are downloaded),
  3. --progress_format=json: prints the progress report in JSON format, which can be parsed by another program.


Downloading pre-trained models from OpenVINO toolkit already in Intermediate Representation format.
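As a quick sanity check on the precision trade-off discussed earlier, you can compare the size of the downloaded weight files: an FP16 .bin file is roughly half the size of its FP32 counterpart. The snippet below is only a sketch; the directory layout is the one the downloader typically creates under the -o path, so adjust it to your machine.

import os

# Assumed location: the -o path from the downloader command above plus the usual
# intel/<model_name>/<PRECISION>/ layout created by downloader.py
download_root = r"C:\Users\<USER_ID>\Desktop\intel\vehicle-attributes-recognition-barrier-0039"

# Print the size of the weights (.bin) file for every precision that was downloaded
for precision in sorted(os.listdir(download_root)):
    bin_file = os.path.join(download_root, precision,
                            "vehicle-attributes-recognition-barrier-0039.bin")
    if os.path.isfile(bin_file):
        print(precision, round(os.path.getsize(bin_file) / 1e6, 2), "MB")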

Take a look at the Intermediate Representation of the above model: the .xml file is the architecture schema of the model, and the .bin file contains the weights and biases. Inside the .xml file, the different layers and properties of the deep learning model are laid out between XML tags in the following format:

<layers>
    <layer ...> .......... </layer>
    <layer ...> .......... </layer>
</layers>


.xml file of the pre-trained model
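If you prefer not to scroll through the raw XML by hand, the schema can also be inspected programmatically. Below is a minimal sketch using Python's standard xml.etree.ElementTree module to list every layer's name and type; the file path is a placeholder for wherever your .xml file lives.

import xml.etree.ElementTree as ET

# Parse the IR schema and walk over every <layer> element inside <layers>
tree = ET.parse("vehicle-attributes-recognition-barrier-0039.xml")
for layer in tree.getroot().iter("layer"):
    print(layer.get("id"), layer.get("name"), layer.get("type"))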

Inferencing With the Intermediate Representation:

Inferencing with the IR format is straightforward. For the model above, we need to preprocess the image to the model's input size and reverse the colour channels. To load the inference network, we use the .load_model() function with the model's .xml file:

from inference import Network  # helper wrapper around the OpenVINO Inference Engine

# Load the IR (.xml plus the matching .bin) onto the CPU, with the CPU extension library
inference_network = Network()
inference_network.load_model("/<MODEL_DOWNLOAD_FOLDER>/vehicle-attributes-recognition-barrier-0039.xml",
                             "CPU",
                             "/<OPENVINO_INSTALL_DIRECTORY>/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so")
# Run a synchronous inference request on the preprocessed image and collect the result
inference_network.sync_inference(preprocessed_image)
output = inference_network.extract_output()
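The Network class used above is a small helper wrapper around the Inference Engine. For reference, here is a minimal sketch of the same steps written directly against the OpenVINO Python API; it assumes a 2020-era release (older releases construct IENetwork(model=..., weights=...) instead of calling ie.read_network) and uses placeholder file names.

import cv2
from openvino.inference_engine import IECore

# Read the IR pair (.xml topology plus .bin weights) and load it onto the CPU device
ie = IECore()
net = ie.read_network(model="vehicle-attributes-recognition-barrier-0039.xml",
                      weights="vehicle-attributes-recognition-barrier-0039.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

# Preprocess: resize to the network's expected input shape, move channels first, add a batch axis
input_blob = next(iter(net.inputs))
n, c, h, w = net.inputs[input_blob].shape
frame = cv2.imread("car.png")
blob = cv2.resize(frame, (w, h)).transpose((2, 0, 1)).reshape(n, c, h, w)

# Run a synchronous inference; the result is a dict keyed by the output blob names
output = exec_net.infer({input_blob: blob})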

Now the output from the inference network needs to be post-processed: the index of the maximum value is selected with the argmax function. We process the output in the following way to determine the type of the car and its colour, and then superimpose that text over the input image as the result of the inference.

import numpy as np

def handle_car(output, input_shape):
    # The network has two output blobs: one for colour and one for vehicle type.
    # argmax over each blob gives the index of the most likely class.
    color = output["color"]
    color_class = np.argmax(color)
    car_type = output["type"]
    type_class = np.argmax(car_type)
    return color_class, type_class
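To finish the pipeline, the class indices returned by handle_car() are mapped to human-readable labels and drawn over the frame with OpenCV. The label lists below are placeholders I have filled in for illustration; take the authoritative ordering from the model's documentation page.

import cv2

# Placeholder label lists for illustration; verify the order against the model docs
CAR_COLORS = ["white", "gray", "yellow", "red", "green", "blue", "black"]
CAR_TYPES = ["car", "bus", "truck", "van"]

def annotate(image, color_class, type_class):
    # Superimpose the predicted colour and vehicle type on top of the input image
    label = "{} {}".format(CAR_COLORS[color_class], CAR_TYPES[type_class])
    cv2.putText(image, label, (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    return image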


Input image on the left. After the inference, the output is printed on top of the image.

Converting TensorFlow Models to Intermediate Representation:

In order to transform TensorFlow models into IR format, we need the trained TensorFlow model saved in .pb format; the rest is very simple to implement. Before the OpenVINO model optimizer can convert the model, the TensorFlow graph needs to be frozen. Freezing a TensorFlow model means stripping out the preprocessing and training-related metadata and baking the variables into constants, which reduces the size of the model for easier deployment. TensorFlow provides built-in functions for freezing graphs, and the .ckpt checkpoint files hold the trained variables that get folded into the frozen graph.

from tensorflow.python.tools import freeze_graph
# Combine Model.pbtxt with the Model.ckpt weights, keep only the "output/softmax" subgraph, write Model.pb
freeze_graph.freeze_graph('Model.pbtxt', "", False, './Model.ckpt', "output/softmax",
                          "save/restore_all", "save/Const:0", 'Model.pb', True, "")
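As an aside, if you still have a live TensorFlow 1.x session around, the built-in graph_util helpers can produce the same frozen .pb without the separate freeze_graph script. A minimal sketch, assuming a TF 1.x session and the same output/softmax node name as above:

import tensorflow as tf  # TensorFlow 1.x style API

def freeze_session(sess, output_nodes=("output/softmax",), out_file="Model.pb"):
    # Bake the session's variables into constants, keeping only the listed output nodes
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), list(output_nodes))
    # Serialize the frozen graph so the model optimizer can consume it
    tf.train.write_graph(frozen, ".", out_file, as_text=False)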

As the model is now frozen, it can be directly converted to an Intermediate Representation. Head over to the Terminal or Command Prompt with administrator privileges and type in the following command:

python C:/<OPENVINO_INSTALL_DIRECTORY>/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model=/<MODEL_DOWNLOAD_DIRECTORY>.pb --tensorflow_use_custom_operations_config C:/<OPENVINO_INSTALL_DIRECTORY>/openvino/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json --tensorflow_object_detection_api_pipeline_config /<MODEL_DOWNLOAD_DIRECTORY>/pipeline.config --reverse_input_channels

We use the --reverse_input_channels flag to reverse the colour channel order, since OpenCV uses the BGR channel order instead of RGB. To hook into the Object Detection API pipeline, we pass a pipeline.config file through the --tensorflow_object_detection_api_pipeline_config flag so the IR of the model is configured properly. Because the model used in the examples above and below is a Single Shot Multibox Detector (SSD), we also need the --tensorflow_use_custom_operations_config argument with a configuration file in JSON format. Finally, the model's .pb file is specified with the --input_model argument. The conversion process can take a significant amount of time, depending on the depth of the network.

As an example, we download a pre-trained TensorFlow model using curl and extract the tarball using tar -xvf.
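One model that is commonly used with this SSD conversion recipe is ssd_mobilenet_v2_coco from the TensorFlow object detection model zoo; the model name and URL below are my illustrative choice rather than something specified above:

curl -O http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz
tar -xvf ssd_mobilenet_v2_coco_2018_03_29.tar.gz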


The details of the conversion procedure can be seen in the image above. On successful execution, the file location of the Intermediate Representation of the TensorFlow model is printed.

Converting Caffe Models to Intermediate Representation:

In order to convert Caffe models to IR format, we don't need any special preprocessing like the freezing we did for TensorFlow models. To convert to IR, we just need to specify the location of the model's .caffemodel file with the --input_model parameter, and if the accompanying protobuf text file does not share the model's name, we also need to point to it with the --input_proto parameter.

For the example below, we downloaded a pre-trained Caffe model hosted on GitHub onto our Linux machine and issued the following command:

python <OPENVINO_INSTALL_DIRECTORY>/openvino/deployment_tools/model_optimizer/mo.py --input_model <NAME_OF_MODEL>.caffemodel --input_proto <NAME_OF_DEPLOYMENT_PROTOBUF_TEXT>.prototxt


Conversion Procedure into IR from a model trained in Caffe.

So, in the above text, we discussed how we can quite easily turn large, resource-hungry deep learning models into small autonomous systems by deploying them directly onto devices with the help of the OpenVINO toolkit. Deployed this way, the model's data flow becomes far more secure, faster and lighter: we reduce the cost of handling sensitive information on cloud servers and can deliver a truly agile AI experience on every device.

Wishing you all a very happy new year!

“Truth can be stated in a thousand different ways, yet each one can be true.” ~ Swami Vivekananda.

