An Approach to Edge Analytics through ONNX (Open Neural Network Exchange) | HCLTech

November 08, 2021

Data and its current scenario

Today, data plays an important role in everyday life and is generated constantly. The need to analyze this data has resulted in a huge number of machine learning models built with frameworks such as TensorFlow, Keras, PyTorch, and others. These models are the basic building blocks of any edge analytics product. However, the frameworks do not interoperate, and their models cannot easily be accommodated on a small edge device with limited storage and processing power. Each framework represents its models with its own operator sets and graphs (computational networks). To benefit from the best models developed so far, regardless of framework, a unified format like ONNX (Open Neural Network Exchange) can help.

The following figure illustrates the concept.

[Figure: ONNX as a common format between frameworks]

What ONNX provides

ONNX consists of two components: the ONNX model format and the ONNX runtime.

ONNX model: An ONNX model is an acyclic dataflow graph whose inputs and outputs are tensors. It has a rich set of operators (both core operators and custom operators) implemented as computation nodes. Many conversion utilities convert models created by different frameworks (for example, scikit-learn, Keras, TensorFlow, PyTorch) into ONNX models.

ONNX runtime: This is the execution suite that executes the model and performs prediction. Before executing the model, it carries out optimizations such as eliminating or fusing nodes wherever possible. It also partitions the graph to get the best performance in a heterogeneous environment, based on execution capability. The runtime uses a query interface to discover the hardware acceleration libraries on which the model can run, and different libraries are used depending on the hardware. For example, in a CPU environment, libraries such as BLAS (Basic Linear Algebra Subprograms), MLAS (Microsoft Linear Algebra Subprograms), MKL (Math Kernel Library), and DNNL (Deep Neural Network Library) are used, while in a GPU environment a library like cuDNN (NVIDIA CUDA Deep Neural Network library) can be used. Consistent APIs allow the same application to run on different hardware platforms, and similar APIs are available in different languages (such as C/C++, Python, and C#), so applications written in a specific environment can use the runtime easily. The memory footprint of the model and runtime is very low, which makes them well suited for deployment on edge devices.

Generic ONNX workflow:

  1. The model generated by any specific framework is passed to an intermediate data processing module.
  2. The number of features and their types must be determined in this data processing module. 
  3. The framework-generated model, along with the data types and the number of features, is fed to the ONNX conversion utility.
  4. The ONNX conversion utility produces the model in ONNX format.
  5. There is also a set of utility programs that can be used to retrieve the information in human-readable format from the ONNX model.
  6. Test data can be prepared from the information retrieved from this utility program.
  7. Test data along with the ONNX model are fed to the ONNX runtime.
  8. Finally, the ONNX runtime executes the ONNX model in the runtime engine for the test data and provides the prediction.



Benefits of ONNX on the edge device

An edge device with ONNX-enabled capability can be taken anywhere in the world, and inference can be made locally without any internet connectivity. This can help with medical diagnostics in a remote village, predictive maintenance of automobiles in inaccessible locations, weather forecasting in coastal areas, suggestions and guidance in agriculture, financial assistance for investment, and more.
