Design Efficient Deep Neural Networks with System Optimization
Access status:
Open Access
Type:
Thesis
Thesis type:
Doctor of Philosophy
Author/s:
Zhang, Zao
Abstract:
The pursuit of enhanced accuracy in Deep Neural Networks (DNNs) has led to increasingly complex model structures, notably in Convolutional Neural Networks (CNNs) and Transformers. While these advancements have propelled the capabilities of intelligent applications, they also introduce significant challenges, primarily an increase in inference latency. This issue is particularly critical in time-sensitive applications, such as self-driving vehicles, where delays can have severe consequences. To address these challenges, this dissertation focuses on optimizing DNNs for efficiency while maintaining, or only minimally impacting, their accuracy. The study is structured into five chapters, each targeting a specific aspect of DNN optimization in the context of CNNs and Transformers: (1) Efficient Model Design for CNNs; (2) System Optimization for CNNs; (3) Efficient Model Design for Transformers; (4) System Optimization for Transformers; (5) Model Compression Methods. Throughout these studies, the emphasis is placed not only on the technical advancements in DNN efficiency but also on the broader implications of these improvements. The research highlights how optimizing DNNs can lead to significant benefits in real-world applications, particularly those requiring real-time processing and operating under resource constraints. By advancing the field of DNN efficiency, this work contributes to the development of more sustainable, accessible, and powerful AI technologies, reinforcing the role of DNNs in the future of intelligent systems.
Date:
2024
Rights statement:
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School:
Faculty of Engineering, School of Electrical and Information Engineering
Awarding institution:
The University of Sydney