Towards Surgical Intelligence with Deep Learning-Based Surgical Video Analysis
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Yue, Wenxi
Abstract
Advancements in medicine and information technology have revolutionised surgery, with computer-assisted procedures integrating advanced computer technology to aid interventions. This thesis presents interdisciplinary studies developing state-of-the-art deep learning methods for surgical video analysis, focusing on two key aspects of surgery, temporal dynamics and spatial comprehension, through two pivotal tasks: surgical workflow analysis and instrument segmentation. First, we focus on surgical workflow analysis and observe that existing methods extract temporal context solely at the frame level and aggregate homogeneous contextual information for all frames. To address this problem, we propose a Cascaded Multi-Level Transformer Network that extracts both frame-level and phase-level temporal context and fuses them with spatial features in a frame-adaptive manner, thereby improving performance. Next, with the emergence of foundation models, we explore the adaptation of the Segment Anything Model (SAM) to the surgical domain for instrument segmentation. We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM that effectively integrates surgical-specific information with SAM's pre-trained knowledge for improved generalisation. Moreover, we explore the domain gap between natural objects and surgical instruments, recognising that the critical distinction lies in the complex structures and fine-grained details of surgical instruments. To address this challenge, we propose SurgicalPart-SAM, which explicitly integrates surgical instrument structure knowledge to improve the understanding and differentiation of instrument categories. Through these contributions, we advance state-of-the-art deep learning methods for surgical video analysis, enhancing performance while reducing development costs. These advancements lead to improved accuracy in computer-assisted surgical systems and greater accessibility of surgical technology for healthcare institutions.
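To make the notion of "efficient tuning" of a frozen foundation model concrete, the sketch below shows a generic parameter-efficient setup in PyTorch: a large pre-trained encoder is kept frozen and only a small class-prompt head is trained. All names (FrozenEncoder, ClassPromptHead), dimensions, and the class-prompt design are illustrative assumptions for this general technique; they are not the thesis's actual SurgicalSAM implementation.

```python
# Minimal sketch of parameter-efficient tuning (illustrative only; not SurgicalSAM's code).
import torch
import torch.nn as nn

class FrozenEncoder(nn.Module):
    """Stand-in for a large pre-trained image encoder (e.g. a SAM-style ViT backbone)."""
    def __init__(self, dim: int = 256):
        super().__init__()
        # A single patch-embedding layer serves purely as a placeholder backbone.
        self.backbone = nn.Sequential(nn.Conv2d(3, dim, kernel_size=16, stride=16), nn.GELU())

    def forward(self, x):
        return self.backbone(x)  # (B, dim, H/16, W/16)

class ClassPromptHead(nn.Module):
    """Small trainable module: one learned prompt embedding per instrument class,
    used to modulate frozen image features and predict a coarse mask (hypothetical design)."""
    def __init__(self, num_classes: int = 7, dim: int = 256):
        super().__init__()
        self.prompts = nn.Embedding(num_classes, dim)
        self.decoder = nn.Conv2d(dim, 1, kernel_size=1)

    def forward(self, feats, class_ids):
        prompt = self.prompts(class_ids)[:, :, None, None]  # (B, dim, 1, 1)
        return self.decoder(feats * prompt)                  # (B, 1, H/16, W/16)

encoder, head = FrozenEncoder(), ClassPromptHead()
for p in encoder.parameters():  # freeze all pre-trained weights
    p.requires_grad = False

optimiser = torch.optim.AdamW(head.parameters(), lr=1e-4)  # only the small head is updated

images = torch.randn(2, 3, 224, 224)   # dummy surgical frames
masks = torch.rand(2, 1, 14, 14)       # dummy low-resolution target masks
class_ids = torch.tensor([3, 5])       # dummy instrument-category labels

with torch.no_grad():
    feats = encoder(images)            # frozen forward pass
logits = head(feats, class_ids)
loss = nn.functional.binary_cross_entropy_with_logits(logits, masks)
loss.backward()
optimiser.step()
print(f"toy loss: {loss.item():.4f}")
```

The point of the sketch is the division of labour: the expensive pre-trained knowledge stays fixed, while a lightweight, domain-specific component carries the surgical information, which is what keeps development cost low.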
Date
2024
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney