Machine Learning (ML) is a hot topic, finding many use cases and applications. Heterogeneous SoCs such as the Zynq and Zynq MPSoC provide a significant advantage, as they allow the inference network to be implemented within the programmable logic (PL).
Implementing the inference network in the PL provides a significant increase in performance. Of course, for those unfamiliar with machine learning it can be difficult to know where to start, especially if you want to accelerate performance using programmable logic.
This is where the Pynq framework comes in, allowing us to work with higher-level languages such as Python while accessing programmable logic overlays to perform the ML acceleration.
For this project we are going to use the Quantized / Binary Neural Network (BNN) overlays available for the Pynq Z2, Z1 and Ultra96.
In this project we are going to install the BNN overlays and run through one of the examples to demonstrate correct functionality.
The main thrust of this project however will be the training and application of new parameters. We can then apply these new parameters to the overlay.
For this project we will be using Pynq on the Arty Z7 / Pynq Z1.
Once the Pynq image has booted, connect to the Pynq using a web browser at the address http://pynq:9090
If a password is requested, enter "xilinx".
To install the BNN we need to use a terminal window. We can open a new terminal within the browser by selecting New -> Terminal.
Once the terminal is open we can install the BNN overlays and examples using the command below. For this example we are going to use a fork of the Xilinx BNN repository from NTNU, as this fork shows nicely how a new network can be trained.
sudo pip3.6 install git+https://github.com/maltanar/BNN-PYNQ.git
This will take a few seconds to download and install.
Once completed, you will see a new BNN folder on the Pynq home page, containing several new notebooks.
These notebooks use one of two overlays:
- LFC - Fully connected network designed for black and white operations on a 28 by 28 input
- CNV - Convolutional network designed for RGB operations on a 32 by 32 input
The structures of both can be seen below.
With the BNN installed, the next step is to run one (or more if you desire) of the examples to ensure the installation is functional.
For this example I decided to run the Road-Signs-Batch notebook. This notebook uses the convolutional network to classify road signs.
Initially this notebook works on small images composed of only one sign. Later tests use a large image which contains one sign along with other imagery, and the algorithm in this case detects the sign and classifies it.
The first pass of the algorithm results in several potential candidates for the sign as indicated below.
Applying a threshold to this initial image results in the final stop sign being correctly identified.
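The thresholding step can be sketched in a few lines of NumPy. This is only an illustration and not the notebook's own code: the score values, the threshold of 0.9 and the function name select_detections are all assumptions made up for the example.

```python
import numpy as np

def select_detections(scores, threshold):
    """Keep candidate windows whose best class score exceeds the threshold.

    scores: one row per candidate window, one column per class.
    Returns a list of (window_index, class_index, score) tuples.
    """
    scores = np.asarray(scores, dtype=float)
    best_class = scores.argmax(axis=1)   # most likely class per window
    best_score = scores.max(axis=1)      # score of that class
    keep = np.flatnonzero(best_score > threshold)
    return [(int(i), int(best_class[i]), float(best_score[i])) for i in keep]

# Hypothetical scores for 4 candidate windows over 3 classes.
candidate_scores = [
    [0.20, 0.35, 0.10],
    [0.05, 0.92, 0.01],
    [0.40, 0.30, 0.45],
    [0.10, 0.55, 0.20],
]

print(select_detections(candidate_scores, threshold=0.5))  # two candidates survive
print(select_detections(candidate_scores, threshold=0.9))  # only the strongest one
```

A low threshold leaves several candidates in play, while a higher one leaves only the most confident window.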
Of course we can use the provided networks in our application; however, we may need to use a different network. As such, we need the ability to train our own network and apply it to the overlay on the Pynq.
To create our own network we need several things. One of the first is a set of correctly labeled training data. For this example we are going to train the neural network to identify articles of clothing using the Fashion-MNIST data set.
The most important thing when we build a new network for an overlay is to ensure the network we train is identical in structure to the one on the overlay we wish to use.
To support this, the Xilinx BNN GitHub repository provides a training directory with a number of Python scripts that can be used to create new networks, many of which can act as templates.
Within the BNN GitHub repository, under the directory BNN->SRC->Training, you will find a number of scripts which can help train new networks:
- lfc.py - This describes the LFC network structure
- cnv.py - This describes the CNV network structure
- binary_net.py - Contains a number of functions to help in training
- finnthesizer.py - Performs the conversion into binary format
- mnist.py - Trains an LFC network for MNIST character recognition - good template for LFC networks
- cifar10.py - Trains a CNV network for CIFAR-10 image classification - good template for CNV networks
For this example we will use fashion-mnist.py, which is an adaptation of mnist.py, to train an LFC network to detect and classify items of clothing.
To perform this training we need the following:
- Either an AWS instance or a high-end GPU
Once we have decided on a training environment we need to ensure we set it up correctly. The first thing we need to do is set up the software environment, ensuring the following are installed:
- Python - Including both NumPy and SciPy
- Theano - Python library for working with multi-dimensional arrays
- PyLearn2 - Python library for machine learning
- Lasagne - Python library for building and training neural networks
I used the commands below on my GPU.
sudo apt-get install git python-dev libopenblas-dev liblapack-dev gfortran -y
wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py --user
pip install --user git+https://github.com/Theano/Theano.firstname.lastname@example.org
pip install --user https://github.com/Lasagne/Lasagne/archive/master.zip
pip install --user numpy==1.11.0
git clone https://github.com/lisa-lab/pylearn2
cd pylearn2
python setup.py develop --user
Once we have the software environment installed, the next stage is to download the training images and labels.
We can download these using the commands below.
wget -nc http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz; gunzip -f train-images-idx3-ubyte.gz
wget -nc http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz; gunzip -f train-labels-idx1-ubyte.gz
wget -nc http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz; gunzip -f t10k-images-idx3-ubyte.gz
wget -nc http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz; gunzip -f t10k-labels-idx1-ubyte.gz
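The downloaded files use the same IDX layout as the original MNIST data: a four-byte magic number (two zero bytes, a data type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension, followed by the raw bytes. If you want to sanity check the download before training, a minimal reader could look like the sketch below. The function names parse_idx and load_idx are chosen for this example and are not part of the training scripts.

```python
import gzip
import struct

import numpy as np

def parse_idx(raw):
    """Decode the bytes of an unsigned-byte IDX file into a NumPy array."""
    zeros, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zeros != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    header_size = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, raw[4:header_size])
    data = np.frombuffer(raw, dtype=np.uint8, offset=header_size)
    return data.reshape(dims)

def load_idx(path):
    """Read an IDX file from disk, transparently handling .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return parse_idx(f.read())

# Tiny in-memory example: a 2 x 3 "image" holding the bytes 0..5.
example = struct.pack(">HBBII", 0, 0x08, 2, 2, 3) + bytes(bytearray(range(6)))
print(parse_idx(example))
```

Calling load_idx("train-images-idx3-ubyte") on the unpacked training set should give an array of shape (60000, 28, 28), and the matching labels file an array of shape (60000,).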
To perform the training I also uploaded the contents of the training directory to the GPU instance I was using.
With that we are ready to start the training.
This, as they say, will take some time.
Once the training has completed, we will be in possession of the trained network within an .npz file.
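It can be useful to look inside the .npz file before converting it, since the array shapes show whether the trained network matches the structure the overlay expects. The sketch below assumes nothing about the file beyond it being a standard NumPy archive; the demo archive and its tiny shapes are placeholders created only so the example runs on its own.

```python
import os
import tempfile

import numpy as np

def describe_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}

# Placeholder archive standing in for the trained network parameters.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(demo_path, np.zeros((4, 3)), np.zeros(3))

for name, shape in sorted(describe_npz(demo_path).items()):
    print(name, shape)
```

Arrays saved positionally with np.savez are named arr_0, arr_1 and so on, so for a real trained network the listing shows one entry per stored parameter array in the order they were saved.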
To use this file we need to convert the values into the correct format for the overlay. To do this we need to transfer the .npz file onto the Pynq; for this I used WinSCP.
I also uploaded the binary_net, fashion-mnist-gen-binary-weights and finnthesizer Python scripts.
On the Pynq, before we can run our new network, we need to convert the weights to a binary format.
We do this by running the Python script below.
This will create a new directory which contains all of the weights for the overlay.
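To give a feel for what "binary format" means here, the sketch below binarizes a small weight matrix by sign and packs eight weights into each byte. It is a simplified illustration only. The real finnthesizer script does considerably more, and the bit order, file layout and function names used here are assumptions, not the overlay's actual format.

```python
import numpy as np

def binarize(weights):
    """Map real-valued weights to +1 / -1 by sign (zero is treated as +1)."""
    return np.where(np.asarray(weights) >= 0, 1, -1).astype(np.int8)

def pack_bits(binary_weights):
    """Pack +1 / -1 weights into bytes, storing +1 as bit 1 and -1 as bit 0.

    Bits are packed most-significant first along the last axis.
    """
    bits = (np.asarray(binary_weights) > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

# Hypothetical row of eight trained weights.
w = np.array([[0.7, -0.2, 0.0, -1.3, 0.4, 0.9, -0.1, -0.5]])
b = binarize(w)
print(b)             # signs of the weights as +1 / -1
print(pack_bits(b))  # the same row squeezed into a single byte
```

Storing one bit per weight is what lets the whole network fit in on-chip memory in the programmable logic.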
Once this is completed we can start creating our own notebook.
I created this under the BNN area. This notebook does the following:
Sets the root and parameter directories. Within the parameter directory you will find all of the network parameters for the different trained networks.
The first thing we need to do is transfer the generated weights for Fashion-MNIST into the params directory. We only need to do this the first time we run the notebook, however.
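A copy step that is safe to re-run could be sketched as follows. The helper name install_params and the directory names in the demonstration are hypothetical; on a real board the source would be the directory produced by the weight conversion and the destination the overlay's params directory.

```python
import os
import shutil
import tempfile

def install_params(src_dir, params_root, name):
    """Copy a directory of generated weights into params_root/name.

    Returns (destination, copied). If the destination already exists the
    copy is skipped, so the notebook cell can be re-run without harm.
    """
    dest = os.path.join(params_root, name)
    if os.path.isdir(dest):
        return dest, False
    shutil.copytree(src_dir, dest)
    return dest, True

# Demonstration in a scratch directory so the sketch runs anywhere.
scratch = tempfile.mkdtemp()
generated = os.path.join(scratch, "generated-weights")
os.makedirs(generated)
open(os.path.join(generated, "placeholder.bin"), "wb").close()
params_root = os.path.join(scratch, "params")
os.makedirs(params_root)

print(install_params(generated, params_root, "fashion-mnist-lfc")[1])  # first run copies
print(install_params(generated, params_root, "fashion-mnist-lfc")[1])  # second run skips
```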
Once the parameters are loaded into the params directory we want to check they are accessible by using the available_params function for the LFC network.
If correctly installed, this should show fashion-mnist-lfc along with the initial two networks.
The final stage is to run the inference with the parameters. For this we need to load an image, convert it into the MNIST image format and apply it to the network.
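The conversion can be sketched with NumPy alone, assuming the picture has already been loaded as a 2-D grayscale array of bytes. The function names, the nearest-neighbour resize and the choice to invert the image (so a dark object on a light background becomes bright on dark, as in MNIST) are assumptions for this example and may differ from what the notebook does.

```python
import os
import struct
import tempfile

import numpy as np

def to_mnist_array(gray, size=28, invert=True):
    """Resize a 2-D uint8 grayscale image to size x size (nearest neighbour)."""
    gray = np.asarray(gray, dtype=np.uint8)
    rows = np.arange(size) * gray.shape[0] // size   # source row per output row
    cols = np.arange(size) * gray.shape[1] // size   # source column per output column
    small = gray[np.ix_(rows, cols)]
    return 255 - small if invert else small

def write_idx3(path, images):
    """Write a (count, height, width) uint8 array as an IDX3 image file."""
    images = np.asarray(images, dtype=np.uint8)
    count, height, width = images.shape
    with open(path, "wb") as f:
        f.write(struct.pack(">HBBIII", 0, 0x08, 3, count, height, width))
        f.write(images.tobytes())

# Hypothetical input: a white 56 x 84 picture with a dark rectangle in it.
photo = np.full((56, 84), 255, dtype=np.uint8)
photo[10:40, 30:50] = 0

mnist_image = to_mnist_array(photo)
out_path = os.path.join(tempfile.mkdtemp(), "single-image-idx3-ubyte")
write_idx3(out_path, mnist_image[np.newaxis, :, :])
print(mnist_image.shape, os.path.getsize(out_path))
```

The resulting file holds a 16-byte header followed by 784 pixel bytes, the same layout as the downloaded data set files.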
We can then output the result of the inference and show the image to see if the prediction was correct.
As you can see it correctly identified a pair of trousers from the image input.
To get an idea of the overall accuracy of the network we can download and run through 10K tagged images.
We can then batch process the images and calculate the overall accuracy of the network.
This equates to an accuracy of 84.87%, just under 85%. A search online comparing other Fashion-MNIST results shows this is slightly lower than other implementations, which are in the range of 88-92% depending upon the network.
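The accuracy figure is simply the fraction of images whose predicted class matches the label. A minimal sketch of that calculation is shown below; the eight predictions and labels are invented for the example and are not results from the overlay.

```python
import numpy as np

def accuracy(predicted, labels):
    """Fraction of predictions that match the true labels."""
    predicted = np.asarray(predicted)
    labels = np.asarray(labels)
    if predicted.shape != labels.shape:
        raise ValueError("predicted and labels must have the same shape")
    return float(np.mean(predicted == labels))

# Hypothetical batch of eight results with one misclassification.
true_labels = [9, 2, 1, 1, 6, 1, 4, 6]
predictions = [9, 2, 1, 1, 0, 1, 4, 6]

print("Accuracy: {:.2%}".format(accuracy(predictions, true_labels)))
```

On the full test set the same calculation applies: 8,487 correct classifications out of 10,000 images gives 84.87%.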
You can find the files associated with this project here:
See previous projects here.
More on Xilinx FPGA development weekly at MicroZed Chronicles.