Cognition-Aware Deep Learning Models for Saliency Detection

Yan, Ke

Access status:

USyd Access

Type

Thesis

Thesis type

Doctor of Philosophy

Author/s

Yan, Ke

Abstract

Saliency in an image is defined as the meaningful and attractive region corresponding to human visual perception and cognition systems. Saliency detection is important in computer vision and is a fundamental step for many vision applications, such as image resizing, content-aware image cropping, action recognition and visual tracking. The mainstream deep learning models for saliency detection include the end-to-end framework that uses deep neural networks (DNNs) to automatically learn image features and saliency characteristics for the localization and segmentation of saliency. However, such methods face the following three challenges.
Challenge 1: As DNNs tend to learn abstract and intrinsic knowledge, detailed information (such as object boundary and shape) is inevitably lost in the deeper layers of DNNs. Therefore, DNNs have limited capability to deal with the irregular shapes and complicated boundaries of objects such as tumors and anatomic structures in medical images.
Challenge 2: In complex scenarios, mixed texture distributions between foreground and background impede DNNs in learning intrinsic and discriminative saliency characteristics. Therefore, DNNs are not sufficiently capable of dealing with such challenging images.
Challenge 3: DNNs, which rely on human visual perception, cannot differentiate primary saliency from secondary saliency, because doing so requires human cognitive learning and thinking.
The lack of human cognition is a common current challenge in saliency detection and other computer vision tasks, and is a major hurdle in evolving machine intelligence toward human intelligence. Some investigators have conducted pioneering research on cognition and proposed attentive modules to reflect human attention, which is a big step in the study of cognition. However, because such attention is only the first step of the human cognitive learning process, research on cognition remains in its infancy.
In this thesis, we aim to explore and propose cognition-aware deep learning models that incorporate human cognition into machine intelligence step by step, to tackle the three challenges in saliency detection. The contributions of this thesis are summarized as follows.
1. To address Challenge 1, we use superpixels as prior knowledge that encodes the boundary information of objects and transfer this prior knowledge to the working memory of DNNs. Such additional prior knowledge complements the previously learned knowledge of DNNs, guiding them in the segmentation of saliency.
2. To address Challenge 2, consistent with how humans acquire knowledge from multiple sources for appropriate decision making, our DNNs also learn saliency knowledge from different sources (such as sparse and dense labeling schemes). As a result, our DNNs are not limited to a single source of knowledge and can retrieve saliency from more complicated scenarios using multiple-source knowledge.
3. To address Challenge 3, we propose to mimic and embody the process of human cognitive thinking about images. Consistent with the natural process of saliency detection by humans, our DNNs progressively learn and encode saliency knowledge as working memory in three phases (‘Seeing’ - ‘Perceiving’ - ‘Cogitating’), which is a higher level of learning built on top of existing attentive learning.

Date

2020

Publisher

University of Sydney

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

saliency detection
deep neural networks
cognition