MULTIMODAL GRAPH REPRESENTATION LEARNING FOR ROBUST SURGICAL WORKFLOW RECOGNITION WITH ADVERSARIAL FEATURE DISENTANGLEMENT
Keywords:
Surgical Workflow Recognition, Multimodal Data Fusion, Graph Convolutional Networks (GCN), Robotic-Assisted Surgery, MDGNet
Abstract
Surgical workflow recognition is essential for automating surgical tasks and ensuring patient safety, but its reliability degrades when the input data is corrupted. This paper presents a graph-based approach that fuses visual and motion data to improve recognition accuracy under challenging conditions. The Multimodal Disentanglement Graph Network (MDGNet) models how the visual and motion modalities interact, using a feature disentanglement framework to keep the features of the two modalities aligned. The Contextual Calibrated Decoder exploits temporal and contextual information to make the system more resilient to changes in the input and to data corruption. The model achieved accuracies of 86.87% and 92.38% on two datasets, demonstrating its effectiveness in addressing data corruption and advancing automated surgical workflow recognition.













