Field: Value [Language]
dc.contributor.author: Yue, Wenxi
dc.date.accessioned: 2025-01-07T03:16:21Z
dc.date.available: 2025-01-07T03:16:21Z
dc.date.issued: 2024 [en_AU]
dc.identifier.uri: https://hdl.handle.net/2123/33511
dc.description.abstract: Advancements in medicine and information technology have revolutionised surgery, with computer-assisted procedures integrating advanced computer technology to aid interventions. This thesis presents interdisciplinary studies that develop state-of-the-art deep learning methods for surgical video analysis, focusing on two key aspects of surgery, temporal dynamics and spatial comprehension, through two pivotal tasks: surgical workflow analysis and instrument segmentation. First, we focus on surgical workflow analysis and observe that existing methods extract temporal context solely at the frame level and aggregate homogeneous contextual information for all frames. To address this problem, we propose a Cascaded Multi-Level Transformer Network that extracts both frame-level and phase-level temporal context and fuses them with spatial features in a frame-adaptive manner (sketched below), thereby improving performance. Next, with the emergence of foundation models, we explore the adaptation of the Segment Anything Model (SAM) to the surgical domain for instrument segmentation. We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach (also sketched below) that effectively integrates surgical-specific information with SAM’s pre-trained knowledge for improved generalisation. Moreover, we explore the domain gap between natural objects and surgical instruments, recognising that the critical distinction lies in the complex structures and fine-grained details of surgical instruments. To address this challenge, we propose SurgicalPart-SAM, which explicitly integrates knowledge of surgical instrument structure to improve the understanding and differentiation of instrument categories. Through these contributions, we advance state-of-the-art deep learning methods for surgical video analysis, enhancing performance while reducing development costs. These advancements lead to improved accuracy in computer-assisted surgical systems and greater accessibility of surgical technology for healthcare institutions. [en_AU]
dc.language.iso: en [en_AU]
dc.subject: Deep learning [en_AU]
dc.subject: Foundation model [en_AU]
dc.subject: Computer-assisted surgery [en_AU]
dc.subject: Surgical video analysis [en_AU]
dc.subject: Surgical workflow analysis [en_AU]
dc.subject: Surgical instrument segmentation [en_AU]
dc.title: Towards Surgical Intelligence with Deep Learning-Based Surgical Video Analysis [en_AU]
dc.type: Thesis
dc.type.thesis: Doctor of Philosophy [en_AU]
dc.rights.other: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. [en_AU]
usyd.faculty: SeS faculties schools::Faculty of Engineering::School of Computer Science [en_AU]
usyd.degree: Doctor of Philosophy Ph.D. [en_AU]
usyd.awardinginst: The University of Sydney [en_AU]
usyd.advisor: Wang, Zhiyong
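
Below is a minimal toy sketch of the frame-adaptive fusion idea described in the abstract: per-frame spatial features are enriched with fine (frame-level) and coarse (phase-level) temporal context, and a learned per-frame gate mixes the two streams. It is written in PyTorch; the module layout, pooling factor, and gating scheme are illustrative assumptions, not the thesis's actual Cascaded Multi-Level Transformer Network.

```python
# Sketch only: hypothetical stand-in for multi-level temporal context fusion.
import torch
import torch.nn as nn


class FrameAdaptiveFusion(nn.Module):
    """Fuse per-frame spatial features with two levels of temporal context.

    A learned per-frame gate decides how much frame-level versus
    phase-level context each frame receives (the "frame-adaptive" part).
    """

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        # Frame-level context: self-attention over the full frame sequence.
        self.frame_ctx = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        # Phase-level context: a second encoder over a coarser, pooled sequence.
        self.phase_ctx = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True
        )
        # Per-frame gate producing mixing weights for the two context streams.
        self.gate = nn.Sequential(nn.Linear(dim, 2), nn.Softmax(dim=-1))

    def forward(self, spatial: torch.Tensor) -> torch.Tensor:
        # spatial: (batch, num_frames, dim) features from a 2D CNN backbone.
        f_ctx = self.frame_ctx(spatial)  # fine-grained temporal context
        # Pool to a coarser "phase" resolution, encode, then upsample back.
        pooled = nn.functional.avg_pool1d(
            spatial.transpose(1, 2), kernel_size=8, stride=8
        ).transpose(1, 2)
        p_ctx = self.phase_ctx(pooled)
        p_ctx = nn.functional.interpolate(
            p_ctx.transpose(1, 2), size=spatial.shape[1], mode="linear"
        ).transpose(1, 2)
        # Frame-adaptive mixing: each frame weights the two context levels.
        w = self.gate(spatial)  # (batch, num_frames, 2)
        return spatial + w[..., :1] * f_ctx + w[..., 1:] * p_ctx


# Example: a 64-frame clip with 256-dim backbone features per frame.
feats = torch.randn(2, 64, 256)
out = FrameAdaptiveFusion()(feats)
print(out.shape)  # torch.Size([2, 64, 256])
```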
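
And a similarly hypothetical sketch of the efficient-tuning idea behind SurgicalSAM: the large pre-trained segmentation backbone stays frozen, and only a small trainable prompting head maps an instrument class id to prompt embeddings for the mask decoder. The stand-in encoder and decoder, the class ids, and all names here are assumptions for illustration, not SAM's real API or the thesis's actual method.

```python
# Sketch only: prototype-based prompting of a frozen segmentation model.
import torch
import torch.nn as nn


class PrototypePrompter(nn.Module):
    """Trainable class prototypes that yield prompt tokens for a frozen decoder."""

    def __init__(self, num_classes: int = 7, dim: int = 256, tokens_per_class: int = 4):
        super().__init__()
        # One learnable prototype bank per instrument category.
        self.prototypes = nn.Embedding(num_classes, tokens_per_class * dim)
        self.tokens_per_class = tokens_per_class
        self.dim = dim

    def forward(self, class_ids: torch.Tensor) -> torch.Tensor:
        # class_ids: (batch,) -> prompt tokens: (batch, tokens_per_class, dim)
        return self.prototypes(class_ids).view(-1, self.tokens_per_class, self.dim)


# Toy stand-ins for SAM's frozen image encoder and mask decoder.
image_encoder = nn.Conv2d(3, 256, kernel_size=16, stride=16)  # frozen
mask_decoder = nn.Conv2d(256, 1, kernel_size=1)               # frozen
for p in list(image_encoder.parameters()) + list(mask_decoder.parameters()):
    p.requires_grad = False

prompter = PrototypePrompter()  # the only trainable component

images = torch.randn(2, 3, 256, 256)
class_ids = torch.tensor([0, 3])  # hypothetical instrument category ids

feats = image_encoder(images)  # (2, 256, 16, 16) frozen image features
prompts = prompter(class_ids)  # (2, 4, 256) class-conditioned prompt tokens
# Toy conditioning: add the mean prompt token at every spatial location.
cond = feats + prompts.mean(dim=1)[:, :, None, None]
masks = mask_decoder(cond)  # (2, 1, 16, 16) low-resolution mask logits

# Only the prompter's parameters would receive gradients during training.
trainable = sum(p.numel() for p in prompter.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```

The design point this illustrates is that only the small prototype bank is updated, so tuning cost stays far below retraining or fully fine-tuning the foundation model.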

