Research with Tiny ML: Quantization-aware training for custom hardware on embedded FPGAs

Dear Community, :grinning:

I am working on a project to accelerate tensor operations on embedded FPGAs using approximate computing techniques. As always, the entire work is available as an open-source project to facilitate research in this field. :hugs:

For this project, TensorFlow Lite Micro is deployed on a Zybo Z7 (Zynq-7020). A hardware tensor processor is implemented in the programmable logic, and the Conv2D and DepthwiseConv2D operations are delegated to it. This design accelerates computation and reduces hardware resource usage and energy consumption.

Here you can find a research poster on this project: Accelerating Tiny ML on embedded FPGA (Google Drive)

This approach uses custom floating-point and logarithmic number representations for the filter and bias tensors.
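
To give a rough idea of what a logarithmic representation can look like, here is a minimal NumPy sketch that rounds each filter value to the nearest signed power of two. It is only an illustration, not the exact format used in the hardware: the function name `log2_quantize`, the `exp_bits` parameter, and the exponent range are all assumptions.

```python
import numpy as np

def log2_quantize(w, exp_bits=4):
    """Round each value to the nearest signed power of two.

    Hypothetical format (an assumption, not the project's actual one):
    1 sign bit plus `exp_bits` exponent bits, covering exponents
    from -(2**exp_bits - 1) up to 0.
    """
    w = np.asarray(w, dtype=np.float64)
    sign = np.sign(w)                    # 0 for exact zeros, so zeros stay zero
    mag = np.abs(w)
    e_min, e_max = -(2 ** exp_bits - 1), 0
    safe = np.where(mag > 0, mag, 1.0)   # avoid log2(0); masked later by sign == 0
    exponent = np.clip(np.round(np.log2(safe)), e_min, e_max)
    return sign * np.exp2(exponent)

# Example filter values (made up for illustration)
filters = np.array([0.3, -0.7, 0.0, 0.06, 3.0])
print(log2_quantize(filters))
```

With these inputs the values map to 0.25, -0.5, 0, 0.0625 and 1.0 (the last one is clipped to the largest exponent). The appeal of such a format for hardware is that multiplying by a power of two reduces to a shift.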

Hence, a central point that can still be improved in this work is accuracy, using quantization-aware training methods in TensorFlow. :thinking: (Any help?)
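
One common way to do quantization-aware training is the straight-through estimator (STE): the forward pass uses the quantized weights, while the gradient is applied to a float "shadow" copy as if the quantizer were the identity. Below is a minimal, framework-agnostic NumPy sketch of that idea on a toy linear model. The power-of-two quantizer, the names `quantize_pow2` and `train_ste`, and the toy data are assumptions for illustration, not the project's training code.

```python
import numpy as np

def quantize_pow2(w, e_min=-8, e_max=0):
    """Hypothetical quantizer: nearest signed power of two, zeros kept as zero."""
    sign = np.sign(w)
    mag = np.abs(w)
    safe = np.where(mag > 0, mag, 1.0)   # avoid log2(0)
    exponent = np.clip(np.round(np.log2(safe)), e_min, e_max)
    return sign * np.exp2(exponent)

def train_ste(X, y, steps=200, lr=0.1):
    """Fit y ~ X @ q(w) with a straight-through estimator."""
    w = np.zeros(X.shape[1])             # float shadow weights
    for _ in range(steps):
        w_q = quantize_pow2(w)           # forward pass uses quantized weights
        err = X @ w_q - y
        grad = X.T @ err / len(y)        # gradient w.r.t. the quantized weights
        w -= lr * grad                   # STE: apply it directly to the float weights
    return w, quantize_pow2(w)

# Toy data (made up): targets generated from power-of-two weights
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
y = X @ np.array([0.5, -0.25])
w_float, w_quant = train_ste(X, y)
print(w_quant)
```

In TensorFlow the same pattern is often written as `w + tf.stop_gradient(quantize(w) - w)`, so the forward value is the quantized weight while the gradient flows to `w` unchanged.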

If you are interested in contributing to this project in any way, please let me know :D. Progress will be published in design conferences and journals.


-Yarib Nevarez