Towards Better Expected Generation of Diffusion Models
Access status:
Open Access
Type
Thesis
Thesis type
Masters by Research
Author/s
Ren, Zhiyao
Abstract
Diffusion Models have achieved remarkable success in generative tasks; however, their generated results can still deviate from people's expectations, impacting the user experience. These issues may stem from two unresolved problems encountered during the training and application stages. During the training stage, there is a discrepancy between the training and inference generation processes, known as the exposure bias issue, which may affect the quality of the expected generation. During the application stage, the current use of Diffusion Models typically relies on textual prompt guidance; however, manually designing these prompts is complex and time-consuming. Users often struggle to describe their ideas accurately with prompts, leading to results that do not meet expectations. In this thesis, we primarily focus on the task of image generation, aiming to enable Diffusion Models to generate results closer to what users expect by 1) alleviating the exposure bias problem and 2) decoding textual prompts from existing reference images to help people design better prompts. Our research significantly improves the quality of expected image generation in Diffusion Models and provides new insights for future research.
Date
2024
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney