Specialised Architectures and Arithmetic for Machine Learning
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Fox, Sean
Abstract
Machine learning has risen to prominence in recent years thanks to advancements in computer technology, the abundance of data, and numerous breakthroughs in a broad range of applications. Unfortunately, as the demand for machine learning has grown, so too has the amount of computation required for training. Combined with the observed decline in performance scaling of standard computer architectures, this trend has made it increasingly difficult to support machine learning training at increased speed and scale, especially in embedded devices, which are smaller and have stricter constraints. Research points towards the development of purpose-built hardware accelerators to overcome the computing challenge, and this thesis explains how specialised hardware architectures and specialised computer arithmetic can achieve performance not possible with standard technology, e.g. Graphics Processing Units (GPUs) and floating-point arithmetic. Based on the implementation of kernel methods and deep neural network (DNN) algorithms using Field Programmable Gate Arrays (FPGAs), this thesis shows how specialised arithmetic is crucial for accurately training large models with less memory, while specialised architectures are needed to increase computational parallelism and reduce off-chip memory transfers. These outcomes are an important step towards moving more machine intelligence into devices such as mobile phones, video cameras, radios, and satellites.
Date
2021
Licence
The author retains copyright of this thesis
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Electrical and Information Engineering
Awarding institution
The University of Sydney