Light Field Image Processing and Applications with Deep Learning
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hu, Zeke Zexi | |
| dc.date.accessioned | 2025-07-24T05:14:35Z | |
| dc.date.available | 2025-07-24T05:14:35Z | |
| dc.date.issued | 2025 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/34144 | |
| dc.description | Includes publication | |
| dc.description.abstract | Light field (LF) imaging captures both spatial and angular information of light rays, offering a four-dimensional representation of a scene that enables applications such as refocusing, depth estimation, and immersive media. However, practical deployment of LF imaging faces two critical challenges: low spatial resolution due to inherent hardware constraints, and high data volume that impedes efficient transmission. This thesis addresses these challenges through novel contributions in light field super-resolution (LFSR) and light field transmission. To enhance spatial resolution, we first propose the Many-to-Many Transformer (M2MT), which mitigates the subspace isolation problem common in existing deep learning approaches. M2MT encodes unexposed LF dimensions as channel embeddings, enabling global spatial-angular modelling and achieving state-of-the-art LFSR performance. We then introduce SkimLFSR, a lightweight yet effective network that alleviates disparity entanglement by selectively incorporating structurally informative subviews, following a “less is more” strategy that improves both accuracy and efficiency. In the context of LF transmission, we develop a user-centric framework that integrates angular attention modelling with video compression. To support this, we construct LF-EMT12, the first eye-tracking dataset for LF viewing, and design LF3A-Net to estimate user attention over subviews. Our approach enables selective subview transmission based on predicted user focus, significantly reducing bandwidth requirements while maintaining perceptual quality. Collectively, these contributions advance the state of the art in LF image processing and transmission. They provide new insights into structured representation learning and user-adaptive LF systems, laying a foundation for future research and practical applications in computational photography, immersive media, and beyond. | en |
| dc.language.iso | en | en |
| dc.subject | light field | en |
| dc.subject | super-resolution | en |
| dc.subject | image processing | en |
| dc.subject | self-attention | en |
| dc.subject | Transformer | en |
| dc.title | Light Field Image Processing and Applications with Deep Learning | en |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en |
| usyd.degree | Doctor of Philosophy Ph.D. | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Chung, Vera | |
| usyd.include.pub | Yes | en |