Multimodal learning and inference from visual and remotely sensed data
Type
ArticleAbstract
Autonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the ...
See moreAutonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the environment and plan future autonomous missions. In this paper, we introduce two multimodal learning algorithms to model the relationship between visual images taken by an autonomous underwater vehicle during a survey and remotely sensed acoustic bathymetry (ocean depth) data that is available prior to the survey. We present a multi-layer architecture to capture the joint distribution between the bathymetry and visual modalities. We then propose an extension based on gated feature learning models, which allows the model to cluster the input data in an unsupervised fashion and predict visual image features using just the ocean depth information. Our experiments demonstrate that multimodal learning improves semantic classification accuracy regardless of which modalities are available at classification time, allows for unsupervised clustering of either or both modalities, and can facilitate mission planning by enabling class-based or image-based queries.
See less
See moreAutonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the environment and plan future autonomous missions. In this paper, we introduce two multimodal learning algorithms to model the relationship between visual images taken by an autonomous underwater vehicle during a survey and remotely sensed acoustic bathymetry (ocean depth) data that is available prior to the survey. We present a multi-layer architecture to capture the joint distribution between the bathymetry and visual modalities. We then propose an extension based on gated feature learning models, which allows the model to cluster the input data in an unsupervised fashion and predict visual image features using just the ocean depth information. Our experiments demonstrate that multimodal learning improves semantic classification accuracy regardless of which modalities are available at classification time, allows for unsupervised clustering of either or both modalities, and can facilitate mission planning by enabling class-based or image-based queries.
See less
Date
2016-01-01Source title
The International Journal of Robotics ResearchVolume
36Issue
1Publisher
SageFunding information
ARC LP150101135Licence
Copyright All Rights ReservedFaculty/School
Faculty of Engineering, School of Aerospace Mechanical and Mechatronic EngineeringDepartment, Discipline or Centre
Australian Centre for Field RoboticsShare