Multi-view Chest X-Ray Vision-Language Pre-training via Semantic-Aware Masked Language Modeling and High-order Alignment

Chest X-Ray Vision-Language Pre-training (VLP) leverages large-scale radiograph-report pairs to develop joint image-text representations, demonstrating significant potential for medical image diagnosis. However, existing VLP approaches often overlook the multi-...
