Field: Value [Language]
dc.contributor.author: Yue, Wenxi
dc.date.accessioned: 2025-01-07T03:16:21Z
dc.date.available: 2025-01-07T03:16:21Z
dc.date.issued: 2024 [en_AU]
dc.identifier.uri: https://hdl.handle.net/2123/33511
dc.description.abstract: Advancements in medicine and information technology have revolutionised surgery, with computer-assisted procedures integrating advanced computer technology to aid interventions. This thesis presents interdisciplinary studies that develop state-of-the-art deep learning methods for surgical video analysis, focusing on two key aspects of surgery, temporal dynamics and spatial comprehension, through two pivotal tasks: surgical workflow analysis and instrument segmentation. First, we focus on surgical workflow analysis and observe that existing methods extract temporal context solely at the frame level and aggregate homogeneous contextual information for all frames. To address this problem, we propose a Cascaded Multi-Level Transformer Network that extracts both frame-level and phase-level temporal context and fuses them with spatial features in a frame-adaptive manner (sketched below), thereby improving performance. Next, with the emergence of foundation models, we explore the adaptation of the Segment Anything Model (SAM) to the surgical domain for instrument segmentation. We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach (also sketched below) that effectively integrates surgical-specific information with SAM’s pre-trained knowledge for improved generalisation. Moreover, we explore the domain gap between natural objects and surgical instruments, recognising that the critical distinction lies in the complex structures and fine-grained details of surgical instruments. To address this challenge, we propose SurgicalPart-SAM, which explicitly integrates knowledge of surgical instrument structure to improve the understanding and differentiation of instrument categories. Through these contributions, we advance state-of-the-art deep learning methods for surgical video analysis, enhancing performance while reducing development costs. These advancements lead to improved accuracy in computer-assisted surgical systems and greater accessibility of surgical technology for healthcare institutions. [en_AU]
dc.language.iso: en [en_AU]
dc.subject: Deep learning [en_AU]
dc.subject: Foundation model [en_AU]
dc.subject: Computer-assisted surgery [en_AU]
dc.subject: Surgical video analysis [en_AU]
dc.subject: Surgical workflow analysis [en_AU]
dc.subject: Surgical instrument segmentation [en_AU]
dc.title: Towards Surgical Intelligence with Deep Learning-Based Surgical Video Analysis [en_AU]
dc.type: Thesis
dc.type.thesis: Doctor of Philosophy [en_AU]
dc.rights.other: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. [en_AU]
usyd.faculty: SeS faculties schools::Faculty of Engineering::School of Computer Science [en_AU]
usyd.degree: Doctor of Philosophy Ph.D. [en_AU]
usyd.awardinginst: The University of Sydney [en_AU]
usyd.advisor: Wang, Zhiyong
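
Below is a minimal toy sketch of the frame-adaptive fusion idea described in the abstract: per-frame spatial features are enriched with fine (frame-level) and coarse (phase-level) temporal context, and a learned per-frame gate mixes the two streams. It is written in PyTorch; the module layout, pooling factor, and gating scheme are illustrative assumptions, not the thesis's actual Cascaded Multi-Level Transformer Network.

```python
# Sketch only: hypothetical stand-in for multi-level temporal context fusion.
import torch
import torch.nn as nn


class FrameAdaptiveFusion(nn.Module):
    """Fuse per-frame spatial features with two levels of temporal context.

    A learned per-frame gate decides how much frame-level versus
    phase-level context each frame receives (the "frame-adaptive" part).
    """

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # Frame-level context: self-attention over the full frame sequence.
        self.frame_ctx = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        # Phase-level context: a second encoder over a coarser, pooled sequence.
        self.phase_ctx = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        # Per-frame gate producing mixing weights for the two context streams.
        self.gate = nn.Sequential(nn.Linear(dim, 2), nn.Softmax(dim=-1))

    def forward(self, spatial: torch.Tensor) -> torch.Tensor:
        # spatial: (batch, num_frames, dim) features from a 2D CNN backbone.
        f_ctx = self.frame_ctx(spatial)  # fine-grained temporal context
        # Pool to a coarser "phase" resolution, encode, then upsample back.
        pooled = nn.functional.avg_pool1d(
            spatial.transpose(1, 2), kernel_size=8, stride=8
        ).transpose(1, 2)
        p_ctx = self.phase_ctx(pooled)
        p_ctx = nn.functional.interpolate(
            p_ctx.transpose(1, 2), size=spatial.shape[1], mode="linear"
        ).transpose(1, 2)
        # Frame-adaptive mixing: each frame weights the two context levels.
        w = self.gate(spatial)  # (batch, num_frames, 2)
        return spatial + w[..., :1] * f_ctx + w[..., 1:] * p_ctx


# Example: a 64-frame clip with 256-dim backbone features per frame.
feats = torch.randn(2, 64, 256)
out = FrameAdaptiveFusion()(feats)
print(out.shape)  # torch.Size([2, 64, 256])
```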
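
And a similarly hypothetical sketch of the efficient-tuning idea behind SurgicalSAM: the large pre-trained segmentation backbone stays frozen, and only a small trainable prompting head maps an instrument class id to prompt embeddings for the mask decoder. The stand-in encoder and decoder, the class ids, and all names here are assumptions for illustration, not SAM's real API or the thesis's actual method.

```python
# Sketch only: prototype-based prompting of a frozen segmentation model.
import torch
import torch.nn as nn


class PrototypePrompter(nn.Module):
    """Trainable class prototypes that yield prompt tokens for a frozen decoder."""

    def __init__(self, num_classes: int = 7, dim: int = 256, tokens_per_class: int = 4):
        super().__init__()
        # One learnable prototype bank per instrument category.
        self.prototypes = nn.Embedding(num_classes, tokens_per_class * dim)
        self.tokens_per_class = tokens_per_class
        self.dim = dim

    def forward(self, class_ids: torch.Tensor) -> torch.Tensor:
        # class_ids: (batch,) -> prompt tokens: (batch, tokens_per_class, dim)
        return self.prototypes(class_ids).view(-1, self.tokens_per_class, self.dim)


# Toy stand-ins for SAM's frozen image encoder and mask decoder.
image_encoder = nn.Conv2d(3, 256, kernel_size=16, stride=16)  # frozen
mask_decoder = nn.Conv2d(256, 1, kernel_size=1)               # frozen
for p in list(image_encoder.parameters()) + list(mask_decoder.parameters()):
    p.requires_grad = False

prompter = PrototypePrompter()  # the only trainable component

images = torch.randn(2, 3, 256, 256)
class_ids = torch.tensor([0, 3])  # hypothetical instrument category ids

feats = image_encoder(images)  # (2, 256, 16, 16) frozen image features
prompts = prompter(class_ids)  # (2, 4, 256) class-conditioned prompt tokens
# Toy conditioning: add the mean prompt token at every spatial location.
cond = feats + prompts.mean(dim=1)[:, :, None, None]
masks = mask_decoder(cond)  # (2, 1, 16, 16) low-resolution mask logits

# Only the prompter's parameters would receive gradients during training.
trainable = sum(p.numel() for p in prompter.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```

The design point this illustrates is that only the small prototype bank is updated, so tuning cost stays far below retraining or fully fine-tuning the foundation model.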

