Towards controllable neural audio synthesis: An exploration of sound effects creation with generative models
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Liu, Yunyi | |
| dc.date.accessioned | 2025-10-21T01:07:28Z | |
| dc.date.available | 2025-10-21T01:07:28Z | |
| dc.date.issued | 2025 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/34420 | |
| dc.description.abstract | Sound effects are crucial to the auditory experience in media such as film, television, and video games. Traditionally, these effects are created using two primary methods: Foley recording, where sound is physically performed to match onscreen actions, and digital sound processing (DSP), which manipulates audio signals with algorithms to achieve desired sounds. While effective, these methods are limited in scalability and in their ability to support rapid sound prototyping. Recent advances in deep learning have introduced a new paradigm for sound effects generation through neural audio synthesis. This approach uses generative models to produce sound effects automatically from learned audio features. However, a significant drawback of this technology is the lack of fine-grained control over the sound output, as generative models typically expose fewer adjustable parameters than traditional DSP methods. This limitation makes it difficult to achieve the specific auditory outcomes that creative sound design requires. This thesis explores neural audio synthesis for sound effects generation. Data scarcity in the sound effects domain is a well-known issue that makes learning and synthesis challenging for data-driven approaches. This thesis aims to study controllable and diverse sound effects generation using a limited audio dataset and without requiring large computational resources. | en |
| dc.language.iso | en | en |
| dc.subject | Neural audio synthesis | en |
| dc.subject | sound effects | en |
| dc.subject | generative models | en |
| dc.subject | audio processing | en |
| dc.title | Towards controllable neural audio synthesis: An exploration of sound effects creation with generative models | en |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering | en |
| usyd.degree | Doctor of Philosophy Ph.D. | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Jin, Craig | |
| usyd.include.pub | No | en |