Towards controllable neural audio synthesis: An exploration of sound effects creation with generative models

Liu, Yunyi

Access status:

Open Access

Field	Value	Language
dc.contributor.author	Liu, Yunyi
dc.date.accessioned	2025-10-21T01:07:28Z
dc.date.available	2025-10-21T01:07:28Z
dc.date.issued	2025	en
dc.identifier.uri	https://hdl.handle.net/2123/34420
dc.description.abstract	Sound effects are crucial components in enhancing the auditory experience in media such as film, television, and video games. Traditionally, these effects are created using two primary methods: Foley recording, where sound is physically performed to match onscreen actions, and digital signal processing (DSP), which manipulates audio signals with algorithms to achieve desired sounds. While effective, these methods are limited in scalability and in their ability to support rapid prototyping of sounds. Recent advancements in deep learning have introduced a new paradigm for sound effects generation through neural audio synthesis. This approach uses generative models to produce sound effects automatically from learned audio features. However, a significant drawback of this technology is the lack of fine-grained control over the sound output, as generative models typically expose fewer adjustable parameters than traditional DSP methods. This limitation makes it difficult to achieve the specific auditory outcomes that creative sound design requires. This thesis explores neural audio synthesis for sound effects generation. Data scarcity in the sound effects domain is a well-known issue that makes learning and synthesis challenging for data-driven approaches. The thesis aims to study controllable and diverse sound effects generation with a limited audio dataset and without requiring large computational resources.	en
dc.language.iso	en	en
dc.subject	Neural audio synthesis	en
dc.subject	sound effects	en
dc.subject	generative models	en
dc.subject	audio processing	en
dc.title	Towards controllable neural audio synthesis: An exploration of sound effects creation with generative models	en
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering	en
usyd.degree	Doctor of Philosophy Ph.D.	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Jin, Craig
usyd.include.pub	No	en