bfloat16nn - an FPGA project add-on
This project, dubbed 'bfloat16nn', demonstrates the use of 16-bit half-precision floats in bfloat16 format on an FPGA. It requires a colorlight-5a-75b board, incorporates a RISC-V CPU (RV32I) and makes use of the LiteDRAM DMA capabilities. It is intended to be used in conjunction with the Risq5 project (repository available on this server as well) on a somewhat larger FPGA at some later time.
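For readers new to the format: bfloat16 keeps the sign bit and the 8 exponent bits of an IEEE 754 float32 but only the top 7 mantissa bits, so a bfloat16 value is essentially the upper 16 bits of a float32. The following is a minimal host-side Python sketch of that relationship, not code from this project. The function names are made up for illustration, and the round-to-nearest-even step is an assumption (the processing units in libmodules may round differently or simply truncate).

```python
import struct


def float_to_bfloat16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for value (hypothetical helper).

    Assumes round-to-nearest-even; hardware may behave differently.
    """
    # Reinterpret the float32 encoding of value as an unsigned 32-bit integer.
    bits32 = struct.unpack("<I", struct.pack("<f", value))[0]
    # NaN: force a mantissa bit so the result stays a NaN after shortening.
    if (bits32 & 0x7F800000) == 0x7F800000 and (bits32 & 0x007FFFFF):
        return ((bits32 >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even, then keep the upper 16 bits.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    # The lower 16 bits of the float32 encoding are simply zero.
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


print(hex(float_to_bfloat16_bits(1.0)))   # 0x3f80
print(bfloat16_bits_to_float(0x4049))     # 3.140625
```

The second line of output shows the price of the short mantissa: the nearest bfloat16 to 3.14159 is 3.140625, i.e. only two to three significant decimal digits survive.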
(Hint: project has been tested on Linux Mint 20 only, but should run on other Linux versions as well ...)
To use this project effectively, you will have to install LiteX, see https://github.com/enjoy-digital/litex for details (and Project Trellis, nextpnr & Yosys requirements). Also, it is recommended to install the board support, see https://github.com/litex-hub/litex-boards, as well as the RISC-V toolchain (see https://github.com/sifive/freedom-tools/releases). To communicate with your board via network, install the wishbone tools, see https://github.com/litex-hub/wishbone-utils.
To use the automatic documentation feature, you will have to install sphinx, see https://www.sphinx-doc.org/en/master. Also its wavedrom extension has to be installed, see https://pypi.org/project/wavedrom. Some helpful links for RST docstring formats: http://daouzli.com/blog/docstring.html & https://thomas-cokelaer.info/tutorials/sphinx/rest_syntax.html
The project assumes a local 'fpga' path within the user's home directory, where all the above-mentioned software packages are installed. Furthermore, the project assumes a virtual environment named 'fpga' where all project-relevant Python libs are registered (this is not strictly necessary, but software/ramcreate.sh as well as the Python interpreter settings within VSC may then have to be adjusted!). The actual project may be installed anywhere, but local paths will have to be adjusted (firmware/main.c, software/ramcreate.sh ... worx for me ;).
A JTAG programmer is required for programming the device. Thanx to Wolfgang, I'm using the Versaloon (s/w for blue-pill STM32), see https://github.com/zoobab/versaloon. To use this device, you will also have to install openocd via 'apt install openocd'. See https://git.hacknology.de/wolfgang/colorlight#user-content-class-hub75sender for details on how to connect the JTAG adapter.
For board-specific details see https://github.com/enjoy-digital/colorlite/blob/master. Other helpful links to board data:
The main project files are:
- bfloat16nn.py - this is the main FPGA building source
- start_terminal_service.sh - once the FPGA has been loaded, this will prepare the terminal service
- libmodules subdir - contains the actual bfloat16 processing units, DRAM DMA helpers & system time support
- helpers subdir - contains python helpers for load & flash etc. (not used here)
- firmware subdir - contains some modified BIOS files (relative to the original version)
- software subdir - contains separate build, load & flash logic for separate (RV32I) application code
(the rest is of minor importance ...)
After installation of the relevant toolchains:
- Open the project in VSC (or use your favourite IDE & maybe adjust some settings ;), adjust local paths if necessary ...
- Connect your JTAG adapter as described in Wolfgang's documentation @ https://git.hacknology.de/wolfgang/colorlight
- Run bfloat16nn.py with the options --build --load --revision=7.0 --uart-name=crossover --with-etherbone --ip-address=192.168.1.20 --csr-csv=build/csr.csv --doc to create the project & load it to on-board SRAM via the USB/JTAG adapter (this takes its time ...). You may omit the --doc option if Sphinx is not installed.
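If you prefer to start the build from a script rather than from the IDE, the option list above can be assembled as sketched below. This is only an illustration: the helper name build_command and the interpreter name python3 are assumptions (inside the 'fpga' virtual environment your interpreter may be called differently), while the options themselves are copied verbatim from the step above.

```python
import shlex

# Options exactly as listed in the build step; --doc is handled separately.
BASE_OPTIONS = [
    "--build",
    "--load",
    "--revision=7.0",
    "--uart-name=crossover",
    "--with-etherbone",
    "--ip-address=192.168.1.20",
    "--csr-csv=build/csr.csv",
]


def build_command(with_doc=True, interpreter="python3"):
    """Return the argument list for building & loading (hypothetical helper).

    Set with_doc=False if Sphinx is not installed.
    """
    cmd = [interpreter, "bfloat16nn.py", *BASE_OPTIONS]
    if with_doc:
        cmd.append("--doc")
    return cmd


# Print the command line instead of running it; to actually run it from the
# project directory, pass the list to subprocess.run(..., check=True).
print(shlex.join(build_command(with_doc=False)))
```

Keeping the options in a list (rather than one long string) makes it easy to change the IP address or revision in one place.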
Individual (separate) applications
- This time, open up a terminal & cd to the project local 'software' subdirectory
- You can load an application to RAM bank 1:
./ramcreate.sh main bfloat16nnlib 1
- To run the (now) RAM based application, type 'cd ..' within terminal
- Connect the Litex-Terminal to the board via:
- Type 'ramboot' into the terminal; the RAM based application should come up now
- You can load an application to RAM bank 2:
./ramcreate.sh main bfloat16nnlib 2
- Now, use 'ramboot' again! The system should swap to RAM bank #2 and boot the application right away
- This is the testing loop; once you're happy with your application, it needs to be flashed
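The testing loop above alternates between the two RAM banks. A small Python sketch of that bookkeeping follows. The helper names are hypothetical, and reading the first two arguments of ramcreate.sh as application name and library name is an assumption. Only the third argument (the bank number) is described above.

```python
def ramcreate_command(bank, app="main", lib="bfloat16nnlib"):
    """Return the ramcreate.sh call for the given RAM bank (1 or 2)."""
    if bank not in (1, 2):
        raise ValueError("bank must be 1 or 2")
    return ["./ramcreate.sh", app, lib, str(bank)]


def other_bank(bank):
    """Return the bank to use for the next iteration of the testing loop."""
    return 2 if bank == 1 else 1


# Show the commands for three rounds of the loop, starting with bank 1.
# Each command is meant to be run from the 'software' subdirectory,
# followed by 'ramboot' in the terminal connected to the board.
bank = 1
for _ in range(3):
    print(" ".join(ramcreate_command(bank)))
    bank = other_bank(bank)
```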