
Review. Journal of Cardiothoracic and Vascular Anesthesia. 2024 May;38(5):1251-1259. doi: 10.1053/j.jvca.2024.01.032

Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models


Adnan A Khan  1, Rayaan Yunus  1, Mahad Sohail  1, Taha A Rehman  1, Shirin Saeed  1, Yifan Bu  1, Cullen D Jackson  1, Aidan Sharkey  1, Feroze Mahmood  1, Robina Matyal  2

Author affiliations

  • 1 Department of Anesthesia, Critical Care, and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA.
  • 2 Department of Anesthesia, Critical Care, and Pain Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA. Electronic address: rmatyal1@bidmc.harvard.edu.
  • DOI: 10.1053/j.jvca.2024.01.032 PMID: 38423884

    Abstract

    New artificial intelligence tools have been developed that have implications for medical usage. Large language models (LLMs), such as the widely used ChatGPT developed by OpenAI, have not been explored in the context of anesthesiology education. Understanding the reliability of various publicly available LLMs for medical specialties could offer insight into their understanding of the physiology, pharmacology, and practical applications of anesthesiology. An exploratory prospective review was conducted using 3 commercially available LLMs--OpenAI's ChatGPT GPT-3.5 version (GPT-3.5), OpenAI's ChatGPT GPT-4 (GPT-4), and Google's Bard--on questions from a widely used anesthesia board examination review book. Of the 884 eligible questions, the overall correct answer rates were 47.9% for GPT-3.5, 69.4% for GPT-4, and 45.2% for Bard. GPT-4 exhibited significantly higher performance than both GPT-3.5 and Bard (p = 0.001 and p < 0.001, respectively). None of the LLMs met the criteria required to secure American Board of Anesthesiology certification, according to the 70% passing score approximation. GPT-4 significantly outperformed GPT-3.5 and Bard in terms of overall performance, but lacked consistency in providing explanations that aligned with scientific and medical consensus. Although GPT-4 shows promise, current LLMs are not sufficiently advanced to answer anesthesiology board examination questions with passing success. Further iterations and domain-specific training may enhance their utility in medical education.
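
As a rough illustration of the arithmetic behind the pass/fail statement, the sketch below back-calculates approximate correct-answer counts from the rounded percentages reported in the abstract and compares them with the 70% passing-score approximation. The counts are an assumption derived from rounded figures, not data taken from the paper, and the variable names are illustrative only.

```python
import math

TOTAL_QUESTIONS = 884   # eligible questions, as stated in the abstract
PASS_FRACTION = 0.70    # passing-score approximation, as stated in the abstract

# Overall correct-answer rates reported in the abstract (percent).
reported_rates = {"GPT-3.5": 47.9, "GPT-4": 69.4, "Bard": 45.2}

# Assumption: counts are reconstructed from rounded percentages,
# so each may differ slightly from the true count in the study.
approx_correct = {
    model: round(TOTAL_QUESTIONS * rate / 100)
    for model, rate in reported_rates.items()
}

# Smallest whole number of correct answers that reaches the threshold.
needed = math.ceil(PASS_FRACTION * TOTAL_QUESTIONS)

# True only if the reconstructed count meets or exceeds the threshold.
passes = {model: count >= needed for model, count in approx_correct.items()}

for model, count in approx_correct.items():
    shortfall = needed - count
    print(f"{model}: ~{count}/{TOTAL_QUESTIONS} correct, "
          f"{shortfall} short of the {needed} needed")
```

Under these assumptions GPT-4 comes closest to the threshold but still falls a few questions short, which is consistent with the abstract's statement that none of the models reached the passing approximation.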

    Keywords: ABA; LLMs; anesthesia; artificial intelligence; education; machine learning; residency; training.


    Copyright © Journal of cardiothoracic and vascular anesthesia.

    Related content

    Journal: Journal of cardiothoracic and vascular anesthesia

    Abbreviation: J CARDIOTHOR VASC AN

    ISSN: 1053-0770

    e-ISSN: 1532-8422

    IF/Quartile: 2.1/Q2
