Scalable Content-Based Image and Video Retrieval
Access status:
USyd Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Zhang, Lelin
Abstract
The popularity of the Internet and portable image-capturing devices has produced an unprecedented volume of images and videos. Content-based visual search is an important tool for users to consume these ever-growing digital media repositories, and it is becoming an increasingly demanding task. In this thesis, we focus on improving the scalability, efficiency and usability of content-based image and video retrieval systems, particularly in dynamic and open environments. Towards this goal, we make four contributions. First, we propose a scalable approach for adapting the bag-of-visual-words (BoVW) model to content-based image retrieval (CBIR) in peer-to-peer (P2P) networks. To cope with the dynamic P2P environment, we propose a distributed codebook updating algorithm based on splitting and merging individual codewords, which maintains workload balance under network churn. This approach offers a scalable framework for content-based visual search in P2P environments. Second, we improve the retrieval performance of CBIR with relevance feedback (RF). We formulate the RF process as an energy minimization problem and use the graph cuts algorithm to solve it, obtaining relevant/irrelevant labels for the images. Our method enables flexible partitioning of the feature space and is capable of handling challenging scenarios. Third, we improve the retrieval performance of trajectory-based action video retrieval using spatial-temporal context. We exploit the spatial-temporal correlations among trajectories for descriptor coding, and address the trajectory segment misalignment issue with an offset-aware distance for trajectory matching. Finally, we develop a toolset that improves the efficiency of the BoVW pipeline and provides better insight into it. The toolset provides robust integration of different methods, automatic parallel execution and result reuse, and visualization of the retrieval process.
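For readers unfamiliar with the bag-of-visual-words model mentioned in the abstract, the following is a minimal, generic sketch of its core step: each local descriptor is assigned to its nearest codeword, and an image is then represented by the normalized histogram of those assignments. It is not the thesis's implementation. The function names, the toy two-dimensional descriptors, the three-entry codebook and the use of histogram intersection as the similarity measure are all assumptions made for illustration.

```python
from math import dist


def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook (hypothetical helper).

    Returns an L1-normalized histogram with one bin per codeword.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to its nearest codeword (Euclidean distance).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity between two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy data, assumed for illustration only: 2-D descriptors, 3 codewords.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
query = bovw_histogram([(0.1, 0.0), (0.9, 1.1), (1.0, 0.8)], codebook)
candidate = bovw_histogram([(0.0, 0.2), (4.8, 5.1), (5.2, 5.0)], codebook)
similarity = histogram_intersection(query, candidate)
```

In a retrieval setting, database images would be ranked by such a similarity score against the query histogram. The abstract's distributed codebook updating by splitting and merging codewords concerns how the `codebook` itself is maintained across peers, which this sketch does not attempt to model.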
Date
2016-01-01
Licence
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering and Information Technologies, School of Information Technologies
Awarding institution
The University of Sydney