SurgLaVi: Large-scale hierarchical dataset for surgical vision-language representation learning

Vision-language pre-training (VLP) offers unique advantages for surgery by aligning language with surgical videos, enabling workflow understanding and transfer across tasks without relying on expert-labeled datasets. However, progress in surgical VLP remains c... ...