We are happy to share that we have three papers accepted to the 16th International Conference on Information Processing in Computer-Assisted Interventions (IPCAI 2025). Congrats to all authors!
- Robotic CBCT meets Robotic Ultrasound
Feng Li, Yuan Bi, Dianye Huang, Zhongliang Jiang, Nassir Navab
In this paper, we present a novel clinical setup where robotic cone beam computed tomography (CBCT) and robotic ultrasound (US) are pre-calibrated and dynamically co-registered, enabling new clinical applications. This setup allows registration-free rigid registration, facilitating multi-modal guided procedures in the absence of tissue deformation.
- Text-driven Adaptation of Foundation Models for Few-shot Surgical Workflow Analysis
Tingxuan Chen, Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy
Surg-FTDA introduces a few-shot text-driven adaptation framework for surgical workflow analysis, aligning image and text embeddings through modality transformation and leveraging text-only training for generative and discriminative tasks. Validated on CLIP and SurgVLP, it minimizes annotation needs while ensuring robust performance across recognition and captioning tasks.
- FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos
Florian Stilz*, Mert Karaoglu*, Felix Tristram*, Nassir Navab, Benjamin Busam, Alexander Ladikos
In this work, we tackle the difficult task of simultaneous reconstruction and camera pose estimation in highly deformable endoscopic environments. We do so by combining multiple 4D NeRF models in a progressive optimisation scheme that is supervised with depth and optical flow losses. As a byproduct of combining multiple 4D models, our method can scale to very long video recordings without running into computational constraints. Our method achieves state-of-the-art performance in novel view synthesis while estimating camera poses that are comparable to those of fully supervised, dedicated pose estimation methods.