Design Efficient Deep Neural Networks with System Optimization

Zhang, Zao

Access status: Open Access

Field	Value	Language
dc.contributor.author	Zhang, Zao
dc.date.accessioned	2024-06-07T06:08:43Z
dc.date.available	2024-06-07T06:08:43Z
dc.date.issued	2024	en
dc.identifier.uri	https://hdl.handle.net/2123/32642
dc.description	Includes publication
dc.description.abstract	The pursuit of higher accuracy in Deep Neural Networks (DNNs) has led to increasingly complex model structures, notably in Convolutional Neural Networks (CNNs) and Transformers. While these advancements have propelled the capabilities of intelligent applications, they also introduce significant challenges, primarily an increase in inference latency. This issue is particularly critical in time-sensitive applications, such as self-driving vehicles, where delays could have severe consequences. To address these challenges, this dissertation focuses on optimizing DNNs for efficiency while maintaining their accuracy or reducing it only minimally. The study is structured into five chapters, each targeting a specific aspect of DNN optimization in the context of CNNs and Transformers: (1) Efficient Model Design for CNNs; (2) System Optimization for CNNs; (3) Efficient Model Design for Transformers; (4) System Optimization for Transformers; and (5) Model Compression Methods. Throughout these studies, the emphasis is placed not only on the technical advancements in DNN efficiency but also on the broader implications of these improvements. The research highlights how optimizing DNNs can lead to significant benefits in real-world applications, particularly those that require real-time processing and operate under resource constraints. By advancing the field of DNN efficiency, this work contributes to the development of more sustainable, accessible, and powerful AI technologies, reinforcing the role of DNNs in the future of intelligent systems.	en
dc.language.iso	en	en
dc.rights	The author retains copyright of this thesis
dc.subject	Efficient Deep Neural Networks	en
dc.subject	AI inference acceleration	en
dc.subject	model compression	en
dc.subject	system optimization	en
dc.title	Design Efficient Deep Neural Networks with System Optimization	en
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Electrical and Information Engineering	en
usyd.degree	Doctor of Philosophy Ph.D.	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Yuan, Dong
usyd.include.pub	Yes	en