Robust, Efficient, and Synergetic 3D Perception for Autonomous Driving
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Zhao, Haimei
Abstract
3D perception is crucial for autonomous systems, particularly in autonomous driving, where understanding complex environments is essential. Despite advancements, significant challenges remain in model robustness, efficiency, and multi-task coordination. This thesis addresses these challenges to enhance 3D perception capabilities for autonomous driving. It begins with an innovative self-supervised depth estimation framework characterized by two proposed robust cross-view consistency losses. By advancing the previous cross-view alignment paradigm, this framework bolsters the system’s resilience to dynamic scenes and occlusions, laying a robust foundation for subsequent tasks. Following the acquisition of point-level geometry through the depth estimation task, the thesis delves deeper into the robust perception of point-level semantics within the LiDAR semantic segmentation task, especially under adverse weather conditions. UniMix, a universal method designed for LiDAR semantic segmentation models, fosters adaptability and robustness across diverse weather conditions via unsupervised domain adaptation and domain generalization techniques. Moving forward, the thesis addresses challenges in object-level geometry and semantics perception within the 3D object detection task while alleviating model efficiency issues. SimDistill, an efficient simulated multi-modal distillation methodology for 3D object detection, harnesses knowledge distillation to reduce computational complexity and ensure cost-effective deployment while maintaining high performance. Lastly, the thesis proposes a holistic approach that leverages the complementary information among diverse 3D perception tasks to amplify robustness, efficiency, and versatility. JPerceiver, a synergetic joint perception framework tailored for scale-aware depth estimation, visual odometry, and BEV layout estimation, achieves promising performance in both accuracy and efficiency.
Date
2024
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney