FPGA Architectures for Low Precision Machine Learning
Field | Value | Language |
dc.contributor.author | Moss, Duncan J.M. | |
dc.date.accessioned | 2018-05-15 | |
dc.date.available | 2018-05-15 | |
dc.date.issued | 2017-12-31 | |
dc.identifier.uri | http://hdl.handle.net/2123/18182 | |
dc.description.abstract | Machine learning is fast becoming a cornerstone of many data analytics, image processing and scientific computing applications. Depending on the deployment scale, these tasks can be performed either on embedded devices or on larger cloud computing platforms. However, one key trend is an exponential increase in the required compute power as data is collected and processed at an unprecedented scale. In an effort to reduce the computational complexity, there has been significant work on reduced precision representations. Unlike Central Processing Units, Graphics Processing Units and Application Specific Integrated Circuits, which have fixed datapaths, Field Programmable Gate Arrays (FPGAs) are flexible and uniquely positioned to take advantage of reduced precision representations. This thesis presents FPGA architectures for low precision machine learning algorithms, considering three distinct levels: the application, the framework and the operator. Firstly, a spectral anomaly detection application is presented, designed for low latency and real-time processing of radio signals. Two types of detector are explored: a neural network autoencoder and a least squares bitmap detector. Secondly, a generalised matrix multiplication framework for the Intel HARPv2 is outlined. The framework was designed specifically for machine learning applications and contains runtime configurable optimisations for reduced precision deep learning. Finally, a new machine learning specific operator is presented: a bit-dependent multiplication algorithm designed to conditionally add only the relevant parts of the operands and arbitrarily skip over redundant computation. Demonstrating optimisations on all three levels (the application, the framework and the operator) illustrates that FPGAs can achieve state-of-the-art performance in important machine learning workloads where high performance is critical, while simultaneously reducing implementation complexity. | en_AU |
dc.rights | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
dc.subject | FPGA | en_AU |
dc.subject | Machine Learning | en_AU |
dc.subject | Low Precision | en_AU |
dc.subject | Reconfigurable Hardware | en_AU |
dc.title | FPGA Architectures for Low Precision Machine Learning | en_AU |
dc.type | Thesis | en_AU |
dc.type.thesis | Doctor of Philosophy | en_AU |
usyd.faculty | Faculty of Engineering and Information Technologies, School of Electrical and Information Engineering | en_AU |
usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |