Permutation-invariant Representation Learning for 3D Point Cloud Processing
Access status: Open Access
Type: Thesis
Thesis type: Doctor of Philosophy
Author/s: Wen, Cheng
Abstract:
Point clouds are becoming increasingly popular as first-hand data captured by RGB-D cameras or LiDARs, and they are widely used in applications such as scene reconstruction, autonomous driving, and virtual reality. This data format preserves abundant information, which is invaluable for the AI-driven applications that have revolutionized our perception and interpretation of the world in the era of deep learning. However, point clouds are not organized on a regular grid, and their irregularity and disorder pose a substantial challenge to processing. A central objective of 3D computer vision is therefore to apply neural networks to point clouds efficiently. Because a point cloud is a collection of unordered points, any network consuming 3D point sets should be invariant to permutations of the input order. This thesis accordingly focuses on learning robust representations for point clouds that are resilient to such permutations. In addition, since no point in the set is isolated from the others, it is essential for neural networks to capture the detailed relationships between neighboring points. This thesis introduces both spatial-based and spectral-based methods for embedding point clouds in a high-dimensional feature space. Owing to their remarkable feature-learning ability and permutation-equivariant operations, novel transformer-based network architectures in both domains are proposed to fully exploit feature dependencies. From another perspective, this thesis proposes diverse task-specific strategies to handle the permutable nature of point clouds and advance performance, such as skeleton-aware guidance for point cloud sampling, spectral learning for classification and segmentation, and progressive embedding for point cloud generation.
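The permutation-invariance requirement described in the abstract can be illustrated with a minimal sketch in the spirit of PointNet-style architectures (not the thesis's actual networks): a shared per-point transform followed by a symmetric aggregation such as max-pooling, so the resulting feature does not depend on the order of the input points. The weights and sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared per-point weights (illustrative only).
W = rng.standard_normal((3, 8))

def embed(points: np.ndarray) -> np.ndarray:
    """Embed an (N, 3) point set into a single 8-dim feature vector.

    The same transform is applied to every point, then the results are
    max-pooled over the point axis. Max is a symmetric function, so
    reordering the input rows cannot change the output.
    """
    per_point = np.maximum(points @ W, 0.0)  # shared ReLU transform per point
    return per_point.max(axis=0)             # symmetric (order-free) pooling

points = rng.standard_normal((128, 3))
shuffled = rng.permutation(points, axis=0)   # reorder the points

assert np.allclose(embed(points), embed(shuffled))
```

Any symmetric aggregator (sum, mean, max) would give the same invariance; the thesis's transformer-based architectures instead rely on attention, whose operations are permutation-equivariant before a final order-free aggregation.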
Date: 2024
Rights statement: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School: Faculty of Engineering, School of Computer Science
Awarding institution: The University of Sydney