Deep Learning for Visual Data Compression
Field | Value | Language |
dc.contributor.author | Chen, Zhenghao | |
dc.date.accessioned | 2022-11-21T00:57:22Z | |
dc.date.available | 2022-11-21T00:57:22Z | |
dc.date.issued | 2022 | en_AU |
dc.identifier.uri | https://hdl.handle.net/2123/29729 | |
dc.description | Includes publication | |
dc.description.abstract | Following the tremendous success of neural networks, several learning-based image codecs have been proposed that outperform traditional image codecs. However, learning-based compression for other categories of visual data remains much less explored. This thesis investigates the effectiveness of deep learning for visual data compression and proposes three end-to-end learning-based compression methods for standard videos, 3D volumetric images and stereo videos, respectively. First, we improve existing learning-based video codecs with a newly proposed adaptive coding method, Resolution-adaptive Motion Coding (RaMC), which effectively compresses the motion information to reduce the bit-rate cost. Then, we investigate the effectiveness of deep learning for lossless 3D volumetric image compression and propose the first end-to-end optimized learning framework for losslessly compressing 3D volumetric images. We introduce an Intra-slice and Inter-slice Conditional Entropy Coding (ICEC) module that fuses multi-scale intra-slice and inter-slice features as context information for better entropy coding. Beyond the aforementioned single-view visual data, we further employ neural networks to compress multi-view visual data and propose the first end-to-end Learning-based Stereo Video Compression (LSVC) framework. It compresses both the left and right views of a stereo video using a deep motion and disparity compensation strategy built from fully differentiable modules, and can be optimized in an end-to-end manner. We conduct extensive experiments on multiple publicly available datasets to demonstrate the effectiveness of the proposed RaMC, ICEC, and LSVC methods. The results indicate that these three methods achieve state-of-the-art compression performance in their respective visual data compression tasks and outperform traditional visual data compression frameworks. | en_AU |
dc.language.iso | en | en_AU |
dc.subject | deep learning | en_AU |
dc.subject | computer vision | en_AU |
dc.subject | data compression | en_AU |
dc.title | Deep Learning for Visual Data Compression | en_AU |
dc.type | Thesis | |
dc.type.thesis | Doctor of Philosophy | en_AU |
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Electrical and Information Engineering | en_AU |
usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |
usyd.advisor | Xu, Dong | |
usyd.include.pub | Yes | en_AU |