3D computer vision and visual data analysis
Field | Value | Language |
dc.contributor.author | Yu, Jianhui | |
dc.date.accessioned | 2024-03-05T02:53:19Z | |
dc.date.available | 2024-03-05T02:53:19Z | |
dc.date.issued | 2024 | en_AU |
dc.identifier.uri | https://hdl.handle.net/2123/32300 | |
dc.description | Includes publication | |
dc.description.abstract | This thesis delves into the domain of 3D data analysis, an area of immense significance in fields including computer graphics, virtual reality, and medical imaging. While 2D data has been extensively studied in computer vision, 3D data introduces an additional layer of complexity, arising either from an added spatial dimension or from a temporal aspect in video data. This research focuses on three forms of 3D data: point clouds, human meshes, and face videos. In point cloud analysis, we address key tasks including classification, segmentation, and semantic segmentation. We first investigate medical point clouds, proposing a transformer-based model with a novel attention mechanism and a graph reasoning module for classification and segmentation tasks. We also introduce a method for rotation-invariant feature learning, improving both robustness and computational efficiency. Moving to 3D human modeling, our work explores text-guided human texture generation. Traditional 3D modeling techniques often fall short in capturing the nuanced textural details of human models. We propose a deep learning framework that combines diffusion generative models with physically based rendering and a 3D coordinate network; this method generates high-quality textures and ensures they align semantically with the input texts. In the realm of face video data, we begin by proposing a generative adversarial network pipeline for synthesizing faces and predicting micro-expression labels. We also introduce a large-scale face video dataset, complete with textual descriptions, and present a novel text-to-face generation model using bidirectional transformers and an innovative video token technique. Our experiments demonstrate both the superiority of our method and the high quality of the face dataset.
Overall, this thesis contributes significantly to 3D data processing, demonstrating strong potential in point cloud analysis, 3D human modeling, and face video processing, and promising both research and practical advancements. | en_AU |
dc.language.iso | en | en_AU |
dc.subject | 3D data analysis | en_AU |
dc.subject | generative modeling | en_AU |
dc.subject | point clouds | en_AU |
dc.title | 3D computer vision and visual data analysis | en_AU |
dc.type | Thesis | |
dc.type.thesis | Doctor of Philosophy | en_AU |
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en_AU |
usyd.degree | Doctor of Philosophy (Ph.D.) | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |
usyd.advisor | Cai, Weidong | |
usyd.include.pub | Yes | en_AU |