Deep Learning for Visual Data Compression

Chen, Zhenghao

Access status: USyd Access

Field	Value	Language
dc.contributor.author	Chen, Zhenghao
dc.date.accessioned	2022-11-21T00:57:22Z
dc.date.available	2022-11-21T00:57:22Z
dc.date.issued	2022	en_AU
dc.identifier.uri	https://hdl.handle.net/2123/29729
dc.description	Includes publication
dc.description.abstract	With the tremendous success of neural networks, a few learning-based image codecs have been proposed and have outperformed traditional image codecs. However, learning-based compression for other categories of visual data remains much less explored. This thesis investigates the effectiveness of deep learning for visual data compression and proposes three end-to-end learning-based compression methods, for standard videos, 3D volumetric images and stereo videos respectively. First, we improve existing learning-based video codecs with a newly proposed adaptive coding method called Resolution-adaptive Motion Coding (RaMC), which compresses the motion information more effectively and thereby reduces the bit-rate cost. Then, we investigate the effectiveness of deep learning for lossless 3D volumetric image compression and propose the first end-to-end optimized learning framework for losslessly compressing 3D volumetric images. We introduce an Intra-slice and Inter-slice Conditional Entropy Coding (ICEC) module that fuses multi-scale intra-slice and inter-slice features as context information for better entropy coding. Beyond the aforementioned single-view visual data, we further employ neural networks to compress multi-view visual data and propose the first end-to-end Learning-based Stereo Video Compression (LSVC) framework. It compresses both the left and right views of a stereo video using a deep motion and disparity compensation strategy built from fully differentiable modules, and can be optimized in an end-to-end manner. We conduct extensive experiments on multiple publicly available datasets to demonstrate the effectiveness of the proposed RaMC, ICEC, and LSVC methods. The results indicate that these three methods achieve state-of-the-art compression performance in their corresponding visual data compression tasks and outperform traditional visual data compression frameworks.	en_AU
dc.language.iso	en	en_AU
dc.subject	deep learning	en_AU
dc.subject	computer vision	en_AU
dc.subject	data compression	en_AU
dc.title	Deep Learning for Visual Data Compression	en_AU
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en_AU
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en_AU
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Electrical and Information Engineering	en_AU
usyd.degree	Doctor of Philosophy Ph.D.	en_AU
usyd.awardinginst	The University of Sydney	en_AU
usyd.advisor	Xu, Dong
usyd.include.pub	Yes	en_AU