SurgLaVi: Large-scale hierarchical dataset for surgical vision-language representation learning

Vision-language pre-training (VLP) offers unique advantages for surgery by aligning language with surgical videos, enabling workflow understanding and transfer across tasks without relying on expert-labeled datasets. However, progress in surgical VLP remains c... ...
