Enabling Plug-and-Play Cameras: Generalisable Methods for Self-Calibration and Multi-Modal Vision Systems
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Griffiths, Ryan Ben | |
| dc.date.accessioned | 2026-03-30T03:49:23Z | |
| dc.date.available | 2026-03-30T03:49:23Z | |
| dc.date.issued | 2026 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/35053 | |
| dc.description.abstract | Vision systems are foundational to a wide range of real-world applications, including autonomous vehicles navigating complex urban environments, drones performing infrastructure inspection, and robots operating in hazardous or remote settings. These applications increasingly depend on diverse and specialised camera hardware, such as fisheye, thermal, and multimodal systems, which challenge the assumptions of conventional computer vision pipelines. Existing approaches typically require labour-intensive calibration, handcrafted adaptation for each camera type, and large labelled datasets. This thesis addresses the central question: how can we build plug-and-play vision systems? We present three key contributions in the areas of self-calibration, network adaptation, and multimodal fusion. First, we introduce NOCaL, a semi-supervised framework that jointly estimates camera intrinsics, distortion, and odometry using a rendering-based self-supervision signal. Second, we propose RectConv, a deformable convolutional layer that enables pretrained convolutional neural networks to operate on previously unseen camera geometries such as fisheye lenses. Third, we develop a transformer-based architecture for multi-camera, multi-modal integration. The model introduces a ray-based rotary positional embedding that enables effective integration of RGB and thermal imagery into a shared, geometrically consistent scene representation. Together, these contributions demonstrate that ray-based, self-supervised representations can support flexible and generalisable vision systems that adapt to new hardware and sensing configurations. The contributions of this thesis have potential impact in domains where robust perception is critical, such as autonomous navigation, environmental monitoring, planetary exploration, and field robotics. This work helps pave the way towards more adaptable, accessible, and intelligent machine perception. | en |
| dc.language.iso | en | en |
| dc.subject | Cameras | en |
| dc.subject | Calibration | en |
| dc.subject | Computer Vision | en |
| dc.subject | Multi-Modal | en |
| dc.subject | Rays | en |
| dc.title | Enabling Plug-and-Play Cameras: Generalisable Methods for Self-Calibration and Multi-Modal Vision Systems | en |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Aerospace Mechanical and Mechatronic Engineering | en |
| usyd.degree | Doctor of Philosophy Ph.D. | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Dansereau, Donald | |
| usyd.include.pub | No | en |