Deep Learning Based Novel View Synthesis

Xie, Ke

Access status: USyd Access

Field	Value	Language
dc.contributor.author	Xie, Ke
dc.date.accessioned	2026-01-15T04:34:46Z
dc.date.available	2026-01-15T04:34:46Z
dc.date.issued	2025	en
dc.identifier.uri	https://hdl.handle.net/2123/34714
dc.description	Includes publication
dc.description.abstract	The synthesis of high-fidelity, large-scale urban scenes is central to autonomous driving simulation and validation. Explicit radiance fields, especially 3D Gaussian Splatting (3DGS), now dominate this area thanks to their realism and real-time rendering, yet they struggle in dynamic street environments: far-field backgrounds degrade into structural blur, color noise, and temporal flickering because sparse photometric cues under-constrain distant geometry. We propose SeeDepthGaussian, a 3DGS-based framework that injects monocular depth estimation, a readily available signal in autonomous driving, as a geometric prior to regularize the spatial layout of 3D Gaussian primitives. All depth-related constraints are applied only during training, improving geometry and visual fidelity without adding inference-time cost, thus preserving the real-time nature of 3DGS. SeeDepthGaussian makes three key contributions. First, a dual depth regularization scheme for dynamic scenes: a rigid term enforces sharp, occlusion-aware surfaces, while a flexible term refines semi-transparent and thin structures via opacity-weighted depth. Second, a Global–Local Depth Normalization strategy combines local patch normalization, which is sensitive to fine details, with global normalization that maintains scene-wide metric consistency. Third, we replace spherical harmonics with a neural background renderer, which better captures complex textures and view-dependent lighting while reducing overfitting. Experiments on the challenging Waymo Open Dataset show consistent gains over strong baselines on all major metrics, including a 0.21 dB improvement in PSNR. Qualitative results confirm clearer and more stable distant views with suppressed blur and color noise. Overall, SeeDepthGaussian demonstrates that depth-aware training is an effective route to high-fidelity reconstruction of large-scale, dynamic urban scenes.	en
dc.language.iso	en	en
dc.subject	Computer Vision	en
dc.subject	3D Gaussian Splatting	en
dc.subject	Novel View Synthesis	en
dc.subject	Generative AI	en
dc.title	Deep Learning Based Novel View Synthesis	en
dc.type	Thesis
dc.type.thesis	Masters by Research	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Computer Science	en
usyd.degree	Master of Philosophy M.Phil	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Cai, Tom
usyd.include.pub	Yes	en
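
Illustrative sketch (not taken from the thesis): the abstract names an opacity-weighted depth term and a Global–Local Depth Normalization strategy but gives no formulas. The NumPy code below shows one plausible reading under stated assumptions. The function names (expected_depth, normalize, global_local_depth_loss), the zero-mean/unit-standard-deviation normalization, the L1 comparison, the non-overlapping patch layout, and the parameters patch and w_local are all hypothetical choices made for illustration. They should not be read as the SeeDepthGaussian implementation.

```python
import numpy as np


def expected_depth(depths, alphas):
    """Opacity-weighted depth along one ray (front-to-back alpha compositing).

    depths, alphas: 1-D sequences sorted from near to far.
    """
    depths = np.asarray(depths, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    # Transmittance reaching each sample: product of (1 - alpha) over nearer samples.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = transmittance * alphas
    return float(np.sum(weights * depths))


def normalize(d, eps=1e-6):
    """Assumed normalization: subtract the mean, divide by the standard deviation."""
    d = np.asarray(d, dtype=np.float64)
    return (d - d.mean()) / (d.std() + eps)


def global_local_depth_loss(rendered, prior, patch=8, w_local=0.5):
    """L1 gap between rendered depth and a monocular depth prior.

    The global term normalizes over the whole image; the local term
    normalizes each non-overlapping patch separately, then averages.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    prior = np.asarray(prior, dtype=np.float64)

    global_term = float(np.mean(np.abs(normalize(rendered) - normalize(prior))))

    height, width = rendered.shape
    local_terms = []
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            r = rendered[y:y + patch, x:x + patch]
            p = prior[y:y + patch, x:x + patch]
            local_terms.append(np.mean(np.abs(normalize(r) - normalize(p))))
    local_term = float(np.mean(local_terms)) if local_terms else 0.0

    return global_term + w_local * local_term
```

In this sketch the loss would be evaluated only during training, consistent with the abstract's statement that depth constraints add no inference-time cost. Because of the assumed normalization, a prior that differs from the rendered depth only by a scale and shift yields a near-zero loss, which suits relative (non-metric) monocular depth estimates.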