Articles in this Series

Preparatory Steps

Basic Classification (1/4) - Classifying Clothing on the Mac Mini M1

Basic Classification (2/4) - TensorFlow Lite Model Conversion (this article)

Basic Classification (3/4) - Machine Learning on TI's EdgeAI Cloud

Interlude: TensorFlow Models on the Edge

Basic Classification (4/4) - EdgeAI Cloud - Jacinto ML Hardware Acceleration


Classification on the AI Edge

(5) SK-TDA4VM: Remote Login, Jupyter Notebook and TensorFlow.js

(6) SK-TDA4VM Starter Kit: Fashion Classification DNN

(7) - Category List



In the previous article, part one of this series on basic machine learning on the SK-TDA4VM Starter Kit, we trained a machine learning model to classify clothing using the Fashion MNIST dataset. Although the classification was performed on a Mac Mini M1, that was simply to test our code; any PC would have sufficed for the task. In all subsequent parts of this series we will use the same code, or slight variations of it, to perform the same classification, leading up to an implementation on the SK-TDA4VM. A summary of the model used in the previous training and classification exercise is shown in the Figure, below.


Figure: Summary of the TensorFlow Fashion MNIST machine learning model.


The TDA4VM's C7x and C66x DSP Processors

Now, the Jupyter Notebook developed during the last exercise could be run on the starter kit as-is. However, it would only utilise the kit’s CPU, that is, the dual Arm Cortex-A72 processors. The purpose of this series is to use the starter kit on the AI Edge, executing inferencing tasks as quickly as possible. On the artificial intelligence Edge we want to utilise the kit’s 8 TOPS hardware accelerators.


TensorFlow Lite is an open-source deep learning framework for on-device inference. It provides a set of tools that enable on-device machine learning on embedded devices and systems, allowing them to run specialised trained models.


Hence, we want to run machine learning inferencing on the TDA4VM’s C66x and C7x DSP hardware accelerators. One way to delegate the workload to the hardware accelerators is to convert the machine learning model developed in the last part of the series into a TensorFlow Lite one. The process of converting our 32-bit floating-point TensorFlow model, which runs on the A72 processors, into an 8-bit unsigned integer TensorFlow Lite model that delegates inferencing to the C66x and C7x DSP accelerators is the topic of this article.
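The float-to-integer conversion can be sketched with TensorFlow's `TFLiteConverter`. The model below is only a stand-in with the same shape as our Fashion MNIST classifier, and the random representative dataset is a placeholder for real training images; both are illustrative assumptions, not the notebook's actual code.

```python
import numpy as np
import tensorflow as tf

# Stand-in model with the same shape as the Fashion MNIST classifier:
# 28x28 greyscale input, 10 clothing classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# A representative dataset lets the converter calibrate value ranges for
# 8-bit quantisation; real training images should be used in practice.
def representative_data_gen():
    for _ in range(100):
        yield [np.random.rand(1, 28, 28).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Restrict the model to int8 ops and make the input/output tensors uint8,
# so inferencing can be delegated to integer-only accelerators.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("fashion_mnist_int8.tflite", "wb") as f:
    f.write(tflite_model)
```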


Figure: The TDA4VM processor consists of a number of processing units. The C66x and C7x DSP processing units are required to achieve machine learning inferencing performance of up to 8 TOPS.


TensorFlow delegates enable hardware acceleration of TensorFlow Lite models by leveraging on-device accelerators such as Graphics Processing Units (GPUs), Digital Signal Processor (DSP) units and edge Tensor Processing Units (TPUs).
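As a rough sketch, loading an external delegate in TFLite looks like the following. The TIDL delegate library name and path here are assumptions that depend on TI's Processor SDK installation, not values taken from this article.

```python
import tensorflow as tf

# Hypothetical path to TI's TIDL delegate shared library; the actual name
# and location depend on the TI Processor SDK installation.
TIDL_DELEGATE_PATH = "/usr/lib/libtidl_tfl_delegate.so"

def make_interpreter(model_path):
    """Load a TFLite model, delegating to the accelerator if available."""
    try:
        delegate = tf.lite.experimental.load_delegate(TIDL_DELEGATE_PATH)
        return tf.lite.Interpreter(model_path=model_path,
                                   experimental_delegates=[delegate])
    except (ValueError, OSError):
        # Fall back to CPU execution when the delegate library is absent,
        # e.g. when running on a development machine instead of the kit.
        return tf.lite.Interpreter(model_path=model_path)
```

The fallback keeps the same notebook runnable on both the starter kit and a development PC.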



In this section we convert our machine learning model in the following manner. First, we perform the same model training exercise as before, although this time less elaborately, as we now presumably have some inkling as to where we are heading. Next, we convert the model into an 8-bit TFLite model, after configuring the TFLite model’s optimisation mode parameter; we can optimise our model for speed or for size, and in this exercise we choose the former. Finally, we test the newly converted 8-bit TFLite model in a TFLite interpreter, which is available as part of the TensorFlow tool suite. The TensorFlow to TFLite conversion workflow is summarised in the flowchart shown in the Figure, below.
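The training step, condensed, might look like this. It is the standard Keras Fashion MNIST recipe, with a single training epoch here for brevity; it is not necessarily the exact code from the previous article.

```python
import tensorflow as tf

# Load and scale the Fashion MNIST dataset, as in the previous article.
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()
train_images = train_images / 255.0   # scale pixel values to [0, 1]
test_images = test_images / 255.0

# The same three-layer classifier as before.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# One epoch for brevity; more epochs improve accuracy.
model.fit(train_images, train_labels, epochs=1, verbose=2)
```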


Figure: Flowchart of the TensorFlow to TensorFlow Lite model conversion.


Results & Observations

In this section we carry out a brief analysis of what has been achieved so far. A simple TFLite model has been created with three layers, as shown in the model summary figure above. The TensorFlow tools can be used to export the deep learning model to file; the directory structure of the saved TensorFlow model is shown in the Figure, below. As the Jupyter Lab notebook further below shows, the exported TFLite model is about 90% smaller than its TensorFlow model equivalent.
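The size comparison can be reproduced roughly as follows, using a stand-in model rather than the notebook's trained one. Note that `model.export` assumes a recent TensorFlow/Keras (2.13 or later); the exact saving reduction will vary with the model.

```python
import os
import tensorflow as tf

# Stand-in model; the article uses the trained Fashion MNIST classifier.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Export the full TensorFlow SavedModel directory.
model.export("saved_model_dir")

# Convert the SavedModel to a quantised TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)

def dir_size(path):
    """Total size in bytes of all files under a directory."""
    return sum(os.path.getsize(os.path.join(root, name))
               for root, _, files in os.walk(path) for name in files)

tf_size = dir_size("saved_model_dir")
tflite_size = os.path.getsize("model.tflite")
print(f"SavedModel: {tf_size} bytes, TFLite: {tflite_size} bytes "
      f"({100 * (1 - tflite_size / tf_size):.0f}% smaller)")
```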


Figure: Saved model file structure.


The classification accuracy of the TFLite model, in this instance, is only slightly lower than that of the full-blown TensorFlow one. Previously, the accuracy of the TensorFlow model was reported to be above 88%. Likewise, the accuracy of the TFLite model fluctuates between 84% and 88%. It remains to be seen how the accuracies compare when we develop and use more complicated models.
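Measuring the TFLite model's accuracy means driving the TFLite interpreter by hand, roughly as in this sketch. It assumes a float model; a uint8 model would additionally need the input/output quantisation parameters applied to the data.

```python
import numpy as np
import tensorflow as tf

def tflite_accuracy(tflite_model, images, labels):
    """Run a float TFLite model over a labelled test set, return accuracy."""
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    correct = 0
    for image, label in zip(images, labels):
        # The interpreter expects a batch dimension and the input dtype.
        interpreter.set_tensor(inp["index"],
                               image[np.newaxis, ...].astype(inp["dtype"]))
        interpreter.invoke()
        prediction = np.argmax(interpreter.get_tensor(out["index"]))
        correct += int(prediction == label)
    return correct / len(labels)
```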

Figure: The TensorFlow model accuracy is roughly the same as the TFLite model accuracy.


The steps outlined in the flowchart in the Figure above are implemented in the following Jupyter Notebook. Extensive metrics on all models and inference platforms used will be presented and compared in a future article.



In this article we have converted our TensorFlow model for classifying clothing, trained with the Fashion MNIST dataset, into a TensorFlow Lite one. In the next article in the series we will use our newly converted model on TI's EdgeAI Cloud server.


Jupyter Notebook Implementation

Below, we create our first TensorFlow Lite classification model.