Towards Generalizable Deep Image Matting: Decomposition, Interaction, and Merging
Field | Value | Language |
dc.contributor.author | Li, Jizhizi | |
dc.date.accessioned | 2023-08-29T06:05:47Z | |
dc.date.available | 2023-08-29T06:05:47Z | |
dc.date.issued | 2023 | en_AU |
dc.identifier.uri | https://hdl.handle.net/2123/31619 | |
dc.description.abstract | Image matting refers to extracting the precise alpha mattes from images, playing a critical role in many downstream applications. Despite extensive attention, key challenges persist and motivate the research presented in this thesis. One major challenge is the reliance of auxiliary inputs in previous methods, hindering real-time practicality. To address this, we introduce fully automatic image matting by decomposing the task into high-level semantic segmentation and low-level details matting. We then incorporate plug-in modules to enhance the interaction between the sub-tasks through feature integration. Furthermore, we propose an attention-based mechanism to guide the matting process through collaboration merging. Another challenge lies in limited matting datasets, resulting in reliance on composite images and inferior performance on images in the wild. In response, our research proposes a composition route to mitigate the discrepancies and result in remarkable generalization ability. Additionally, we construct numerous large datasets of high-quality real-world images with manually labeled alpha mattes, providing a solid foundation for training and evaluation. Moreover, our research uncovers new observations that warrant further investigation. Firstly, we systematically analyze and address privacy issues that have been neglected in previous portrait matting research. Secondly, we explore the adaptation of automatic matting methods to non-salient or transparent categories beyond salient ones. Furthermore, we collaborate with language modality to achieve a more controllable matting process, enabling specific target selection at a low cost. To validate our studies, we conduct extensive experiments and provide all codes and datasets through the link (https://github.com/JizhiziLi/). We believe that the analyses, methods, and datasets presented in this thesis will offer valuable insights for future research endeavors in the field of image matting. | en_AU |
dc.language.iso | en | en_AU |
dc.subject | image matting | en_AU |
dc.subject | semantic segmentation | en_AU |
dc.subject | cross-modal referring | en_AU |
dc.subject | deep learning | en_AU |
dc.subject | composition | en_AU |
dc.title | Towards Generalizable Deep Image Matting: Decomposition, Interaction, and Merging | en_AU |
dc.type | Thesis | |
dc.type.thesis | Doctor of Philosophy | en_AU |
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en_AU |
usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |
usyd.advisor | Tao, Dacheng |
Associated file/s
Associated collections