Evaluating the Effectiveness of Advanced Large Language Models in Medical Knowledge: A Comparative Study Using the Japanese National Medical Examination

Study aims and objectives. This study evaluates the accuracy of the medical knowledge of the most advanced LLMs available as of 2024 (GPT-4o, GPT-4, Gemini 1.5 Pro, and Claude 3 Opus). It is the first to evaluate these LLMs using a non-English medical licensing exam. ...
