Victor Meunier

Engineering student

How to convert your Keras or TF model for the Neural Compute Stick 2

In this article, I'll show you how to convert your Keras or TensorFlow model to run on the Neural Compute Stick 2.

Prerequisites

I assume that you have a working development environment with the OpenVino toolkit installed and configured. If this is not the case, follow this guide for the Raspberry Pi 3 and this one for Ubuntu.

Starting with a Keras model

Let's say you start with a Keras model. It can be either a single .h5 file that describes the whole model and its weights, or separate files (model.json and weights.h5). You'll have to convert your Keras model to TensorFlow first; here's how to do it.

You can find the whole code, with the creation of a Keras model on my GitHub.

First, load your Keras model.

                    from keras.models import load_model
                    from keras import backend as K

                    # set the learning phase to 0 (inference) so that layers like
                    # dropout and batch norm behave correctly when frozen
                    K.set_learning_phase(0)

                    # load the trained Keras model from its .h5 file
                    model = load_model(keras_model)
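
If your model is saved as separate files instead (model.json for the architecture and weights.h5 for the weights), here's a minimal sketch of the equivalent loading step:

                    from keras.models import model_from_json
                    from keras import backend as K

                    K.set_learning_phase(0)

                    # rebuild the architecture from the JSON description...
                    with open("model.json", "r") as f:
                        model = model_from_json(f.read())

                    # ...then load the trained weights into it
                    model.load_weights("weights.h5")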
                  

Then, convert it to a TF model and save it as a .pb file.

                    import tensorflow as tf

                    # create a frozen graph of the Keras model
                    # (the freeze_session function is defined below)
                    frozen_graph = freeze_session(K.get_session(),
                                                  output_names=[out.op.name for out in model.outputs])

                    # save the frozen graph as a .pb file
                    tf.train.write_graph(frozen_graph, "TF_model/", "tf_model.pb", as_text=False)
                  

The magic really happens in the freeze_session function:

                    def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
                        from tensorflow.python.framework.graph_util import convert_variables_to_constants
                        graph = session.graph
                        with graph.as_default():
                            # freeze every variable except the ones we want to keep
                            freeze_var_names = list(set(v.op.name for v in tf.global_variables())
                                                    .difference(keep_var_names or []))
                            output_names = output_names or []
                            output_names += [v.op.name for v in tf.global_variables()]
                            input_graph_def = graph.as_graph_def()
                            # clear device placements so the graph is portable
                            if clear_devices:
                                for node in input_graph_def.node:
                                    node.device = ""
                            # replace the variables with constants holding their current values
                            frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                                          output_names, freeze_var_names)
                            return frozen_graph
                  

You now have a TensorFlow model named tf_model.pb in the TF_model/ directory.
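
If you want to sanity-check the frozen graph, or find the exact input/output node names you'll need for the Model Optimizer in the next step, a quick sketch (using the same TF 1.x API as above) is to load the .pb back and print its nodes:

                    import tensorflow as tf

                    # read the frozen graph back from disk
                    with tf.gfile.GFile("TF_model/tf_model.pb", "rb") as f:
                        graph_def = tf.GraphDef()
                        graph_def.ParseFromString(f.read())

                    # print every node name; look for your input and output layers
                    for node in graph_def.node:
                        print(node.name)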

Use Model Optimizer to convert the TF model

Now we can use the Model Optimizer to convert the TensorFlow model to an IR (Intermediate Representation) file that we can use on the Neural Compute Stick 2.

                      mo.py --data_type FP16 --framework tf --input_model TF_model/tf_model.pb --model_name IR_model --output_dir IR_model/ --input_shape [1,28,28,1] --input conv2d_1_input --output activation_6/Softmax
                  

A few things to note here:

  • --data_type is used to specify the precision you want. From what I know, FP32 will not work on the NCS2. Anything lower than FP16 might work, but you could get poor results due to the loss of precision.
  • --input_model is the path to your TF model.
  • --model_name is the name of the converted IR file you'll create.
  • --output_dir is the directory where your IR file will be saved.
  • --input_shape is the shape of your input tensor. In this model's case it's [1,28,28,1] because there's one image of size 28x28 with one channel (grayscale).
  • --input is used to specify the input layer of your model.
  • --output is used to specify the output layer of your model.
Note: If you don't know the input or output layer names, you can see them in the output of model.summary() when training the Keras model. You can also use Netron. For that, run
                      pip install netron
                    
and run:
                      netron -b TF_model/tf_model.pb
                    
then head to http://localhost:<port> to see your model and get the names of the input/output layers.

First layers of the TensorFlow model visualized with Netron
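
Alternatively, if you still have the Keras model loaded in a Python session, you can print the graph names directly; a minimal sketch, reusing the model object from earlier:

                    # tensor names from the underlying TF graph,
                    # e.g. the values to pass to --input and --output
                    print([inp.op.name for inp in model.inputs])
                    print([out.op.name for out in model.outputs])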

If everything goes according to plan, the Model Optimizer will print a summary of the arguments you've used, and you should see an output like the one below along with the generated IR files.

                    [ SUCCESS ] Generated IR model.
                    [ SUCCESS ] XML file: /home/mreliptik/Documents/dev/Keras_to_TF_NCS2/IR_model/IR_model.xml
                    [ SUCCESS ] BIN file: /home/mreliptik/Documents/dev/Keras_to_TF_NCS2/IR_model/IR_model.bin
                    [ SUCCESS ] Total execution time: 7.88 seconds.
                  

Make a prediction using the Neural Compute Stick 2

Now that we have an IR model, we can run it on the NCS2. In the GitHub repo, the file predict_mnist.py is used for that.

                    from openvino.inference_engine import IENetwork, IEPlugin

                    model_xml = "IR_model/IR_model.xml"
                    model_bin = "IR_model/IR_model.bin"

                    # Plugin initialization for specified device
                    plugin = IEPlugin(device="MYRIAD")

                    # read the network topology and weights from the IR files
                    net = IENetwork(model=model_xml, weights=model_bin)

                    # Loading model to the plugin
                    exec_net = plugin.load(network=net)
                  

With these few lines, you first initialize the device, then create a network from the IR files you read. The last line loads the network onto the device and returns an executable network, ready for inference.

Then simply use exec_net.infer() to run the inference.

                    res = exec_net.infer(inputs={input_blob: prepimg})
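
Here, input_blob is the name of the network's input layer and prepimg is your preprocessed input image. As a minimal end-to-end sketch (assuming a hypothetical 28x28 grayscale image digit.png and the IENetwork API used above):

                    import cv2
                    import numpy as np

                    # names of the input and output layers, read from the parsed network
                    input_blob = next(iter(net.inputs))
                    out_blob = next(iter(net.outputs))

                    # load a 28x28 grayscale digit; depending on how the model was
                    # trained, you may also need to normalize (e.g. divide by 255)
                    img = cv2.imread("digit.png", cv2.IMREAD_GRAYSCALE)
                    prepimg = img.reshape(net.inputs[input_blob].shape)

                    # run inference and take the class with the highest score
                    res = exec_net.infer(inputs={input_blob: prepimg})
                    print("Predicted digit:", np.argmax(res[out_blob]))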
                  

Voila! You successfully converted a model to IR and ran it on the Compute Stick!

Final thoughts

The provided Model Optimizer is pretty straightforward to use. You still have to know what you're doing, though. I thought Intel would have developed something a bit more user-friendly.

Next up

In the following weeks, I'll be working on my Handpose project to make it run with the NCS2. Stay tuned!

See you around for more!
