Arm NN
Arm NN is an inference engine for CPUs, GPUs, and NPUs. It bridges the gap between existing neural network frameworks and the underlying IP: it translates models from frameworks such as TensorFlow and Caffe so that they run efficiently, without modification, on Arm Cortex-A CPUs, GPUs (Arm Mali or any OpenCL 2.0 device), and Arm Ethos NPUs.
More information at https://developer.arm.com/ip-products/processors/machine-learning/arm-nn
Installation
Arm NN is packaged in the science:machinelearning OBS project, and developed at https://github.com/ARM-software/armnn.
You can install Arm NN from (see the example command after this list):
- the OSS repository for Tumbleweed and Leap 15.2+, and PackageHub for SLE15 SP2+
- https://software.opensuse.org/package/armnn
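For example, on openSUSE (a minimal sketch, assuming the package name armnn from the link above):
sudo zypper install armnn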
Current status
As of 2020-09-29, the following options are enabled in the aarch64 Arm NN package:
* OpenCL support (GPU) has been tested on openSUSE Tumbleweed with a HiKey960 board, which includes a Mali Bifrost G71 GPU with OpenCL 2.x support. According to [1] and [2], Arm NN requires a GPU with OpenCL 1.2 plus the cl_arm_non_uniform_work_group_size extension (OpenCL 2.0 for better performance). Upstream tests are done on Mali GPUs.
Known issues
OpenCL backend
- If you do not have a /usr/lib64/libOpenCL.so file, you must create it as a symlink to libOpenCL.so.1:
sudo ln -s /usr/lib64/libOpenCL.so.1 /usr/lib64/libOpenCL.so
- The OpenCL backend currently supports only GPU-type devices, so you cannot use CPU or ACCELERATOR OpenCL devices.
See: https://github.com/ARM-software/armnn/issues/286
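If you are unsure which device types your OpenCL platform exposes, you can check with the clinfo utility (a generic OpenCL tool, not part of Arm NN):
clinfo | grep -i 'device type'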
- If you get the following error:
An error occurred when preparing the network workloads: in create_kernel src/core/CL/CLKernelLibrary.cpp:1087: Non uniform workgroup size is not supported!!
then your GPU only provides OpenCL 1.x without the cl_arm_non_uniform_work_group_size extension, which is mandatory for the Arm NN OpenCL backend.
Tools
ExecuteNetwork
The ExecuteNetwork program, from Arm NN, takes any model and any input tensor, and simply prints out the output tensor. Run it with no arguments to see command-line help.
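For example, a hypothetical invocation for a TensorFlow Lite MobileNet model on the CpuAcc backend (the tensor names and the whitespace-separated input data file are assumptions; check ExecuteNetwork's built-in help for the exact options of your version):
ExecuteNetwork -f tflite-binary -m models/mobilenet_v1_1.0_224_quant.tflite -i input -o MobilenetV1/Predictions/Reshape_1 -c CpuAcc -d input_tensor_data.txt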
ArmnnConverter
The ArmnnConverter program takes a model in any supported input format and produces a serialized model in the native Arm NN format (*.armnn). This allows the model to be run without a framework-specific parser. Run it with no arguments to see command-line help. Note that this program can only convert models whose operations are all supported by the serialization tool src/armnnSerializer.
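For example, a hypothetical conversion of the frozen InceptionV3 TensorFlow model used below (the flag names and shape syntax are assumptions; verify with ArmnnConverter's built-in help):
ArmnnConverter -f tensorflow-binary -m models/inception_v3_2016_08_28_frozen.pb -i input -s "1,299,299,3" -o InceptionV3/Predictions/Reshape_1 -p models/inception_v3.armnn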
ArmnnQuantizer
The ArmnnQuantizer program takes a 32-bit float network and converts it into a quantized asymmetric 8-bit or quantized symmetric 16-bit network. Static quantization is supported by default, but dynamic quantization can be enabled if a CSV file of raw input tensors is specified. Run it with no arguments to see command-line help.
Tests/Examples
SimpleSample
Run SimpleSample and enter a number when prompted (here 458):
Please enter a number: 458
Your number was 458
Caffe backend
CaffeInception_BN-Armnn
The CaffeInception_BN-Armnn example uses a Caffe model on top of Arm NN for image classification. You need to get the data and the model, so please download (see the command sketch after this list):
- https://raw.githubusercontent.com/pertusa/InceptionBN-21K-for-Caffe/master/deploy.prototxt and http://www.dlsi.ua.es/~pertusa/deep/Inception21k.caffemodel to the models/ folder.
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
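For example (a sketch; any great white shark photo will do for shark.jpg):
mkdir -p models data
wget -P models https://raw.githubusercontent.com/pertusa/InceptionBN-21K-for-Caffe/master/deploy.prototxt
wget -P models http://www.dlsi.ua.es/~pertusa/deep/Inception21k.caffemodel
cp /path/to/a-shark-photo.jpg data/shark.jpg # placeholder path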
Arm NN cannot use this model as is; it must be converted first:
- the batch size must be set to 1 (instead of 10)
- Arm NN does not support all Caffe syntaxes, so some older neural network model files require updating to the latest Caffe syntax
So, you need to:
- Copy deploy.prototxt to deploy_armnn.prototxt and update the file to set the batch size to 1:
--- models/deploy.prototxt	2019-10-01 13:25:13.502886667 +0000
+++ models/deploy_armnn.prototxt	2019-10-01 13:38:55.860972787 +0000
@@ -3,7 +3,7 @@
 layer {
   name: "data"
   type: "Input"
   top: "data"
-  input_param { shape: { dim: 10 dim: 3 dim: 224 dim: 224 } }
+  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
 }
 layer {
- and run the following convert.py script from the 'models/' folder (requires python3-caffe):
#!/usr/bin/python3
import caffe

# Load the original network, upgrading the weights to the current Caffe format
net = caffe.Net('deploy.prototxt', 'Inception21k.caffemodel', caffe.TEST)
# Reload the weights with the batch-size-1 network definition and save them
new_net = caffe.Net('deploy_armnn.prototxt', 'Inception21k.caffemodel', caffe.TEST)
new_net.save('Inception-BN-batchsize1.caffemodel')
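Save the script as models/convert.py and run it from the models/ folder:
cd models && python3 convert.py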
Now, you can run CaffeInception_BN-Armnn --data-dir=data --model-dir=models:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 3694 with value: 0.255735
Top(2) prediction is 3197 with value: 0.0031263
Top(3) prediction is 1081 with value: 0.000757725
Top(4) prediction is 567 with value: 0.000526447
Top(5) prediction is 559 with value: 9.72124e-05
Total time for 1 test cases: 0.088 seconds
Average time per test case: 88.260 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
CaffeResNet-Armnn
The CaffeResNet-Armnn example uses a Caffe model on top of Arm NN for image classification. You need to get the data and the model, so please download:
- https://onedrive.live.com/?authkey=%21AAFW2-FVoxeVRck&cid=4006CBB8476FF777&id=4006CBB8476FF777%2117895&parId=4006CBB8476FF777%2117887&o=OneUp and rename it from RestNet-50-model.caffemodel to ResNet_50_ilsvrc15_model.caffemodel, then move it to the models/ folder.
- https://raw.githubusercontent.com/ameroyer/PIC/d136e9ceded0ceb700898725405d8eb7bd273bbe/val_samples/ILSVRC2012_val_00000018.JPEG to the data/ folder.
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
And run CaffeResNet-Armnn --data-dir=data --model-dir=models:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 21 with value: 0.466987
Top(2) prediction is 7 with value: 0.000633067
Top(3) prediction is 1 with value: 2.17822e-06
Top(4) prediction is 0 with value: 6.27832e-08
= Prediction values for test #1
Top(1) prediction is 2 with value: 0.511024
Top(2) prediction is 0 with value: 2.7405e-07
Total time for 2 test cases: 0.205 seconds
Average time per test case: 102.741 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
CaffeMnist-Armnn
The CaffeMnist-Armnn example uses a Caffe model on top of Arm NN for handwritten digit recognition. In this example, the first digit tested is 7.
You need to get the data and the model, so please install the arm-ml-example package:
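A minimal sketch, assuming the package is named arm-ml-example as in the science:machinelearning project:
sudo zypper install arm-ml-example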
As CaffeMnist-Armnn requires slightly different file names, you need to rename them:
cp -r /usr/share/armnn-mnist/* /tmp/
mv /tmp/data/t10k-labels-idx1-ubyte /tmp/data/t10k-labels.idx1-ubyte
mv /tmp/data/t10k-images-idx3-ubyte /tmp/data/t10k-images.idx3-ubyte
And run CaffeMnist-Armnn --data-dir=/tmp/data/ --model-dir=/tmp/model/:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 7 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #1
Top(1) prediction is 2 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #5
Top(1) prediction is 1 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #8
Top(1) prediction is 5 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #9
Top(1) prediction is 9 with value: 1
Top(2) prediction is 0 with value: 0
Total time for 5 test cases: 0.008 seconds
Average time per test case: 1.569 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
MNIST Caffe example
The MNIST Caffe example uses a Caffe model on top of Arm NN for handwritten digit recognition. In this example, the digit is 7.
You must install the ARM ML examples and associated data (the arm-ml-example package, as above).
Go to the data folder:
cd /usr/share/armnn-mnist/
and run mnist_caffe:
Predicted: 7
Actual: 7
ONNX backend
OnnxMnist-Armnn
The OnnxMnist-Armnn example uses an ONNX model on top of Arm NN for handwritten digit recognition. In this example, the digit is 7.
You need to get the data, so please install the arm-ml-example package (as above).
And download the model from https://onnxzoo.blob.core.windows.net/models/opset_8/mnist/mnist.tar.gz
As OnnxMnist-Armnn requires slightly different file names, you need to rename the files:
cp -r /usr/share/armnn-mnist/* /tmp/
mv /tmp/data/t10k-labels-idx1-ubyte /tmp/data/t10k-labels.idx1-ubyte
mv /tmp/data/t10k-images-idx3-ubyte /tmp/data/t10k-images.idx3-ubyte
For the model:
wget https://onnxzoo.blob.core.windows.net/models/opset_8/mnist/mnist.tar.gz
tar xzf mnist.tar.gz
cp mnist/model.onnx /tmp/model/mnist_onnx.onnx
And run OnnxMnist-Armnn --data-dir=/tmp/data/ --model-dir=/tmp/model/ -i 1:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 7 with value: 28.34
Top(2) prediction is 3 with value: 9.42895
Top(3) prediction is 2 with value: 8.64272
Top(4) prediction is 1 with value: 0.627583
Top(5) prediction is 0 with value: -1.25672
Total time for 1 test cases: 0.002 seconds
Average time per test case: 2.278 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
OnnxMobileNet-Armnn
The OnnxMobileNet-Armnn example uses an ONNX model on top of Arm NN for image classification. In this example, it classifies images of a shark, a dog, and a cat.
You need to get the MobileNet v2 model for ONNX, so (see the sketch after this list):
- download and extract https://s3.amazonaws.com/onnx-model-zoo/mobilenet/mobilenetv2-1.0/mobilenetv2-1.0.tar.gz
- copy mobilenetv2-1.0/mobilenetv2-1.0.onnx to the models/ folder
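For example (a sketch):
mkdir -p models
wget https://s3.amazonaws.com/onnx-model-zoo/mobilenet/mobilenetv2-1.0/mobilenetv2-1.0.tar.gz
tar xzf mobilenetv2-1.0.tar.gz
cp mobilenetv2-1.0/mobilenetv2-1.0.onnx models/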
For the data, you need to download:
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
- an image of a cat (tiger cat), rename it Cat.jpg, and place it in the data/ folder.
- an image of a dog (golden retriever), rename it Dog.jpg, and place it in the data/ folder.
And run OnnxMobileNet-Armnn --data-dir=data --model-dir=models -i 3:
ArmNN v20190800
Performance test running in DEBUG build - results may be inaccurate.
= Prediction values for test #0
Top(1) prediction is 273 with value: 16.4625
Top(2) prediction is 227 with value: 13.9884
Top(3) prediction is 225 with value: 11.6609
Top(4) prediction is 168 with value: 11.3706
Top(5) prediction is 159 with value: 9.35255
= Prediction values for test #1
Top(1) prediction is 281 with value: 16.7145
Top(2) prediction is 272 with value: 5.43621
Top(3) prediction is 271 with value: 5.3766
Top(4) prediction is 51 with value: 5.24998
Top(5) prediction is 24 with value: 2.50436
= Prediction values for test #2
Top(1) prediction is 2 with value: 21.4471
Top(2) prediction is 0 with value: 4.55977
Total time for 3 test cases: 0.164 seconds
Average time per test case: 54.651 ms
Overall accuracy: 0.667
Runtime::UnloadNetwork(): Unloaded network with ID: 0
TensorFlow backend
TfInceptionV3-Armnn
The TfInceptionV3-Armnn example uses a TensorFlow model on top of Arm NN for image classification. You need to get the data and the model, so please download (see the sketch after this list):
- https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz and extract it in the models/ folder.
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
- an image of a cat (tiger cat), rename it Cat.jpg, and place it in the data/ folder.
- an image of a dog (golden retriever), rename it Dog.jpg, and place it in the data/ folder.
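For example (a sketch):
mkdir -p models data
wget -P models https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz
tar xzf models/inception_v3_2016_08_28_frozen.pb.tar.gz -C models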
Now, you can run TfInceptionV3-Armnn --data-dir=data --model-dir=models:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 208 with value: 0.918417
Top(2) prediction is 206 with value: 0.000891919
Top(3) prediction is 176 with value: 0.000658453
Top(4) prediction is 155 with value: 0.000206609
Top(5) prediction is 92 with value: 0.000192534
= Prediction values for test #1
Top(1) prediction is 283 with value: 0.544097
Top(2) prediction is 282 with value: 0.321364
Top(3) prediction is 198 with value: 0.000288878
Top(4) prediction is 179 with value: 0.000153869
Top(5) prediction is 146 with value: 0.000141289
= Prediction values for test #2
Top(1) prediction is 3 with value: 0.826077
Top(2) prediction is 0 with value: 0.000125644
Total time for 3 test cases: 0.365 seconds
Average time per test case: 121.635 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
TfResNext-Armnn
The TfResNext-Armnn example uses a TensorFlow model on top of Arm NN for image classification. You need to get the data and the model, so please download:
- http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz and extract it in the models/ folder.
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
- an image of a cat (tiger cat), rename it Cat.jpg, and place it in the data/ folder.
- an image of a dog (Labrador retriever), rename it Dog.jpg, and place it in the data/ folder.
And run TfResNext-Armnn --data-dir=data --model-dir=models:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 209 with value: 0.856742
Top(2) prediction is 208 with value: 0.0588841
Top(3) prediction is 167 with value: 0.00553092
Top(4) prediction is 160 with value: 0.000479352
Top(5) prediction is 102 with value: 0.000265176
= Prediction values for test #1
Top(1) prediction is 283 with value: 0.344484
Top(2) prediction is 282 with value: 0.0748539
Top(3) prediction is 52 with value: 0.00447383
Top(4) prediction is 25 with value: 0.000883748
Top(5) prediction is 6 with value: 5.20586e-05
= Prediction values for test #2
Top(1) prediction is 3 with value: 0.588796
Top(2) prediction is 2 with value: 0.000818478
Top(3) prediction is 1 with value: 4.20274e-06
Top(4) prediction is 0 with value: 4.55538e-10
Total time for 3 test cases: 0.060 seconds
Average time per test case: 19.954 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
TfMnist-Armnn
The TfMnist-Armnn example uses a TensorFlow model on top of Arm NN for handwritten digit recognition. In this example, the first digit tested is 7.
You need to get the data and the model, so please install the arm-ml-example package (as above).
As TfMnist-Armnn requires slightly different file names, you need to rename the files:
cp -r /usr/share/armnn-mnist/* /tmp/
mv /tmp/data/t10k-labels-idx1-ubyte /tmp/data/t10k-labels.idx1-ubyte
mv /tmp/data/t10k-images-idx3-ubyte /tmp/data/t10k-images.idx3-ubyte
And run TfMnist-Armnn --data-dir=/tmp/data/ --model-dir=/tmp/model/:
ArmNN v20190800
= Prediction values for test #0
Top(1) prediction is 7 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #1
Top(1) prediction is 2 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #2
Top(1) prediction is 1 with value: 1
Top(2) prediction is 0 with value: 0
= Prediction values for test #3
Top(1) prediction is 0 with value: 1
= Prediction values for test #4
Top(1) prediction is 4 with value: 1
Top(2) prediction is 0 with value: 0
Total time for 5 test cases: 0.000 seconds
Average time per test case: 0.045 ms
Overall accuracy: 1.000
Runtime::UnloadNetwork(): Unloaded network with ID: 0
MNIST TensorFlow example
The MNIST TensorFlow example uses a TensorFlow model on top of Arm NN for handwritten digit recognition. In this example, the digit is 7.
You must install the ARM ML examples and associated data (the arm-ml-example package, as above).
Go to the data folder:
cd /usr/share/armnn-mnist/
and run mnist_tf:
Predicted: 7
Actual: 7
mnist-draw - Web app
MNIST Draw is a single-page website that lets users hand-draw digits between 0 and 9 and classify them using machine learning. A model trained on the MNIST dataset is used for classification.
The project is a modified version of mnist-draw, which uses the Arm NN SDK to perform inference on an Arm Cortex-A CPU. The application runs on any Arm system and can be accessed over the network with a browser.
There is no RPM package yet, so you need to build it and run it manually.
Install dependencies:
zypper in armnn-devel python3-Pillow python3-numpy gcc gcc-c++ make
Compile from source:
cd /tmp
git clone https://github.com/ARM-software/Tool-Solutions/
cd Tool-Solutions/ml-tool-examples/mnist-draw
make -C armnn-draw
chmod a+w . -R # Fix permissions
If you want to use GpuAcc instead of CpuAcc, update cgi-bin/mnist.py by changing the first argument of mnist_tf_convol from 1 (CpuAcc) to 2 (GpuAcc), i.e. from:
completed = subprocess.run(['./armnn-draw/mnist_tf_convol', '1', '1', 'image.txt'], stderr=subprocess.PIPE, check=True)
to:
completed = subprocess.run(['./armnn-draw/mnist_tf_convol', '2', '1', 'image.txt'], stderr=subprocess.PIPE, check=True)
Run it:
python3 -m http.server --cgi 8080
And access it from your web browser, e.g. http://192.168.0.4:8080 if your board's IP address is 192.168.0.4.
TensorFlow Lite backend
To run the TfLite*-Armnn examples, you need to download the models and extract them to the models/ folder:
# Only the *.tflite files are needed, but more files are in the archives
wget http://download.tensorflow.org/models/tflite/mnasnet_1.3_224_09_07_2018.tgz
tar xzf mnasnet_*.tgz
mv mnasnet_*/ models
pushd models
wget http://download.tensorflow.org/models/tflite_11_05_08/inception_v3_quant.tgz
tar xzf inception_v3_quant.tgz
wget http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz
tar xzf mobilenet_v1_1.0_224_quant.tgz
wget http://download.tensorflow.org/models/tflite_11_05_08/mobilenet_v2_1.0_224_quant.tgz
tar xzf mobilenet_v2_1.0_224_quant.tgz
popd
You may also get the labels from the MobileNet V1 archive: https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip
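For example (a sketch; the archive also contains the quantized model):
wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip
unzip mobilenet_v1_1.0_224_quant_and_labels.zip -d mobilenet_labels/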
For the data, you need to download:
- an image of a shark (great white shark), rename it shark.jpg, and place it in the data/ folder.
- an image of a cat (tiger cat), rename it Cat.jpg, and place it in the data/ folder.
- an image of a dog (golden retriever), rename it Dog.jpg, and place it in the data/ folder.
TfLiteInceptionV3Quantized-Armnn example
Once you have the models/ and data/ folders ready, you can run TfLiteInceptionV3Quantized-Armnn --data-dir=data --model-dir=models
TfLiteMnasNet-Armnn example
Once you have the models/ and data/ folders ready, you can run TfLiteMnasNet-Armnn --data-dir=data --model-dir=models
TfLiteMobilenetQuantized-Armnn example
Once you have the models/ and data/ folders ready, you can run TfLiteMobilenetQuantized-Armnn --data-dir=data --model-dir=models
TfLiteMobilenetV2Quantized-Armnn example
Once you have the models/ and data/ folders ready, you can run TfLiteMobilenetV2Quantized-Armnn --data-dir=data --model-dir=models
Additional (downstream) tests
Additional downstream tests are packaged separately, in the armnn-extratests package.
TI ArmnnExamples
TI provides an additional test, ArmnnExamples, which allows running models from all supported backends (Caffe, TensorFlow, TensorFlow Lite, ONNX) on images, videos (the tool filters on the .mp4, .mov and .avi extensions, but you can rename files, e.g. .ogv ones, to work around the filter), and live streams from a webcam. More information at http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components_ArmNN.html#arm-nn-mobilenet-demo
You need to download a number of files (test files from the tidl-api git repository, the MobileNet model, and the MobileNet labels):
git clone git://git.ti.com/tidl/tidl-api.git # Can be skipped if you want to use your own test files
wget http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz
tar xzf *.tgz
wget https://raw.githubusercontent.com/leferrad/tensorflow-mobilenet/master/imagenet/labels.txt
sudo mkdir -p /usr/share/arm/armnn/models/
sudo cp labels.txt /usr/share/arm/armnn/models/
sudo chmod 666 /usr/share/arm/armnn/models/labels.txt # TO BE FIXED
Test with the baseball.jpg image from tidl-api/:
ArmnnExamples -f tensorflow-binary -i input -s '1 224 224 3' -o MobilenetV1/Predictions/Reshape_1 -d ./tidl-api/examples/classification/images/baseball.jpg -m ./mobilenet_v1_1.0_224_frozen.pb -c CpuAcc --number_frame 10
Test with the test2.mp4 video clip from tidl-api/; it displays the video with the top match and FPS (requires an H.264 decoder to be installed):
ArmnnExamples -f tensorflow-binary -i input -s '1 224 224 3' -o MobilenetV1/Predictions/Reshape_1 -d ./tidl-api/examples/classification/clips/test2.mp4 -m ./mobilenet_v1_1.0_224_frozen.pb -c CpuAcc --number_frame 100
You may also use the warplane .ogv file (just change the extension to .mp4).
Test with a live stream from a camera; it displays the video with the top match and FPS (camera_live_input0 is /dev/video0, camera_live_input1 is /dev/video1, etc.):
ArmnnExamples -f tensorflow-binary -i input -s '1 224 224 3' -o MobilenetV1/Predictions/Reshape_1 -d camera_live_input0 -m ./mobilenet_v1_1.0_224_frozen.pb -c CpuAcc --number_frame 100