Show simple item record

Field | Value | Language
dc.contributor.author | Wang, Yifan |
dc.date.accessioned | 2024-03-05T21:31:09Z |
dc.date.available | 2024-03-05T21:31:09Z |
dc.date.issued | 2024 | en_AU
dc.identifier.uri | https://hdl.handle.net/2123/32309 |
dc.description | Includes publication |
dc.description.abstract | With the development of deep learning technologies and large-scale 3D point cloud datasets, 3D visual grounding tasks have become increasingly attractive. Although many recent studies have achieved satisfactory results, most recent 3D visual grounding datasets use human-written descriptions, which can be hard to modify or extend. Additionally, the quality of these descriptions can vary widely. In this paper, we introduce the 3DSSG-Cap dataset, which contains 383,438 descriptions of 27K objects from 1,465 indoor scenes. The descriptions in this dataset are generated using templates, offering flexibility and ease of extension. We also propose a novel method called 3DETRefer to localize the described objects in the 3DSSG-Cap dataset. Our approach incorporates a transformer-based detector and a visual grounding fusion module, enabling accurate object localization and identification. | en_AU
dc.language.iso | en | en_AU
dc.subject | 3D visual grounding | en_AU
dc.subject | transformer | en_AU
dc.subject | detection | en_AU
dc.subject | point clouds | en_AU
dc.title | Transformer-based 3D visual grounding with point clouds for object detection | en_AU
dc.type | Thesis |
dc.type.thesis | Masters by Research | en_AU
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU
usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en_AU
usyd.degree | Master of Philosophy M.Phil | en_AU
usyd.awardinginst | The University of Sydney | en_AU
usyd.advisor | Cai, Weidong |
usyd.include.pub | Yes | en_AU


