Show simple item record

dc.contributor.author: Fu, Huan
dc.date.accessioned: 2019-03-08T00:35:12Z
dc.date.available: 2019-03-08T00:35:12Z
dc.date.issued: 2019-03-08
dc.identifier.uri: http://hdl.handle.net/2123/20123
dc.description.abstract: Dense prediction, or pixel-level labeling, aims to predict a label of interest (e.g., a category, depth value, flow vector, or edge probability) for each pixel of an input image. This mid-level computer vision problem plays a crucial role in building visual perception systems for the future intelligent world. Tremendous effort has therefore been devoted over the past decades to robust dense prediction, and recent studies have made continuous and significant progress by relying on deep Fully Convolutional Networks (FCNs). Depending on the expected label, dense prediction comprises a set of subtasks. Building a robust model for each task requires examining its particular properties, but the main intuitions and motivations behind network architecture design are shared across tasks. In this thesis, we take the well-known problems of scene parsing, monocular depth estimation, and edge detection as examples, and devise advanced and highly extensible techniques that address both the individual and the collective issues of robust dense prediction.

For scene parsing, exploiting hierarchical convolutional features is essential for obtaining high-resolution, fine-grained predictions. Previous algorithms typically aggregate them via concatenation or linear combination, which cannot sufficiently exploit the diversity of contextual information or the spatial inhomogeneity of a scene. We propose novel attention mechanisms, namely adaptive hierarchical feature aggregation (AHFA) and mixture-of-experts (MoE), which re-weight the different levels of features at each spatial location, according to the local structure and surrounding contextual information, before aggregation.

Existing works on depth estimation often overlook the strong inherent ordinal correlation among depth values, resulting in inferior performance. Motivated by this observation, we introduce a ranking mechanism for depth estimation by proposing an effective ordinal regression constraint.

For edge detection, common approaches simply predict the boundary probability of each pixel individually from the receptive field centered at that pixel. In contrast, we propose modeling boundary structures or position-sensitive scores, which is more flexible because of the implied feature competition in the prediction at each spatial position.

We also study unsupervised domain mapping, which is of general applicability and enables a consolidated solution for dense prediction. Advanced unsupervised domain mapping approaches mainly rely on Generative Adversarial Networks (GANs) to make the prediction indistinguishable from reality (e.g., generated pseudo parsing vs. ground-truth parsing), and reduce the solution space with high-level constraints and assumptions to guarantee that, in the absence of matched training samples, an input and its corresponding output are paired up in a meaningful way. However, they overlook a special property of images: simple geometric transformations do not change the semantics of an image. With this motivation, we propose enforcing geometry consistency as a constraint, and demonstrate that it can largely eliminate unreasonable mappings and produce more reliable solutions. (en_AU)
dc.publisher: University of Sydney (en_AU)
dc.publisher: Faculty of Engineering and Information Technologies (en_AU)
dc.rights: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. (en_AU)
dc.subject: dense prediction (en_AU)
dc.subject: domain mapping (en_AU)
dc.subject: scene parsing (en_AU)
dc.subject: GANs (en_AU)
dc.subject: depth estimation (en_AU)
dc.subject: boundary detection (en_AU)
dc.title: Robust Dense Prediction for Visual Perception (en_AU)
dc.type: PhD Doctorate (en_AU)
dc.type.pubtype: Doctor of Philosophy Ph.D. (en_AU)
dc.description.disclaimer: Access is restricted to staff and students of the University of Sydney. UniKey credentials are required. Non-university access may be obtained by visiting the University of Sydney Library. (en_AU)
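The per-pixel re-weighting of hierarchical features described in the abstract can be illustrated with a minimal NumPy sketch. All shapes, the gating input, and the softmax gate below are illustrative assumptions, not the thesis' AHFA/MoE architecture: a per-location softmax over L feature levels replaces plain concatenation or a fixed linear combination.

```python
import numpy as np

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_aggregate(features, gate_logits):
    """
    features:    (L, H, W, C) hierarchical feature maps, resized to a
                 common resolution (assumed already done here)
    gate_logits: (L, H, W)    one gating score per level per location
    returns:     (H, W, C)    per-pixel convex combination of the levels
    """
    weights = softmax(gate_logits, axis=0)        # (L, H, W), sums to 1 over L
    return (weights[..., None] * features).sum(axis=0)

# toy example: 3 feature levels on a 4x4 map with 16 channels
rng = np.random.default_rng(0)
feats = rng.random((3, 4, 4, 16))
logits = rng.random((3, 4, 4))
out = attentive_aggregate(feats, logits)
```

Because the gate is a softmax, the output at every location is a convex combination of the level features there, so spatially inhomogeneous scenes can favor different levels at different pixels.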
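The ordinal regression constraint for depth can likewise be sketched. The discretisation (uniform in log-depth), the number of thresholds, and the decoding rule below are illustrative assumptions: the network predicts, per pixel, K probabilities p_k = P(depth > t_k) over ordered thresholds, and the loss sums the K binary cross-entropies.

```python
import numpy as np

def si_thresholds(d_min, d_max, K):
    """Spacing-increasing discretisation: uniform in log-depth space."""
    return np.exp(np.linspace(np.log(d_min), np.log(d_max), K))

def ordinal_targets(depth, thresholds):
    """Binary targets y_k = 1[depth > t_k]; (H, W) -> (H, W, K)."""
    return (depth[..., None] > thresholds).astype(np.float64)

def ordinal_loss(probs, depth, thresholds, eps=1e-7):
    """Sum of per-threshold binary cross-entropies, averaged over pixels."""
    y = ordinal_targets(depth, thresholds)
    p = np.clip(probs, eps, 1.0 - eps)
    bce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return bce.sum(axis=-1).mean()

def decode_depth(probs, thresholds):
    """Approximate decode: count thresholds passed, map back to a depth."""
    k = (probs > 0.5).sum(axis=-1)                # ordinal rank per pixel
    return thresholds[np.clip(k, 0, len(thresholds) - 1)]

# toy example: 2x2 depth map, 8 thresholds between 1 m and 10 m
thr = si_thresholds(1.0, 10.0, 8)
depth = np.array([[1.5, 3.0], [6.0, 9.0]])
perfect = ordinal_targets(depth, thr)             # idealised predictions
loss = ordinal_loss(perfect * 0.98 + 0.01, depth, thr)
d_hat = decode_depth(perfect, thr)
```

Unlike plain per-pixel regression, the K ordered binary decisions expose the inherent ordinal correlation of depth values to the loss.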
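Finally, the geometry-consistency constraint for unsupervised domain mapping can be sketched as requiring the mapping G to commute with a semantics-preserving geometric transform f, i.e. G(f(x)) ≈ f(G(x)). The "generator" below is a deliberate stand-in (a pointwise scaling, which commutes with rotation), purely to illustrate the constraint; a real G would be a trained GAN generator.

```python
import numpy as np

def rot90(x):
    """A semantics-preserving geometric transform: 90-degree rotation."""
    return np.rot90(x)

def rot90_inv(x):
    return np.rot90(x, -1)

def generator(x):
    # placeholder mapping (pointwise scaling); commutes with rotation,
    # so its geometry-consistency loss is exactly zero
    return 0.5 * x

def geometry_consistency_loss(G, f, f_inv, x):
    """Mean L1 distance between f_inv(G(f(x))) and G(x)."""
    return np.abs(f_inv(G(f(x))) - G(x)).mean()

x = np.random.default_rng(0).random((8, 8))
loss = geometry_consistency_loss(generator, rot90, rot90_inv, x)

# a mapping that is NOT rotation-equivariant (masks the left half)
# is penalised by the same loss
bad_mask = np.zeros((8, 8))
bad_mask[:, :4] = 1.0
loss_bad = geometry_consistency_loss(lambda z: z * bad_mask,
                                     rot90, rot90_inv, x)
```

Minimising this term over training data rules out mappings whose output changes arbitrarily under transforms that leave the input's semantics intact, shrinking the solution space without paired samples.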

