
Comparative Study. Journal of orthopaedic surgery and research. 2024 Sep 18;19(1):574. doi: 10.1186/s13018-024-04996-2

Comparative performance analysis of large language models: ChatGPT-3.5, ChatGPT-4 and Google Gemini in glucocorticoid-induced osteoporosis


Linjian Tong  1, Chaoyang Zhang  2, Rui Liu  1, Jia Yang  1, Zhiming Sun  3

Author affiliations

  • 1 Clinical College of Neurology, Neurosurgery and Neurorehabilitation, Tianjin Medical University, Tianjin, 300070, China.
  • 2 Department of Orthopedics, Tianjin Medical University Baodi Hospital, Tianjin, 301800, China.
  • 3 Clinical College of Neurology, Neurosurgery and Neurorehabilitation, Tianjin Medical University, Tianjin, 300070, China. szhm0618@163.com.
  • DOI: 10.1186/s13018-024-04996-2 PMID: 39289734

    Abstract

    Backgrounds: The use of large language models (LLMs) in medicine can help physicians improve the quality and effectiveness of health care by increasing the efficiency of medical information management, patient care, medical research, and clinical decision-making.

    Methods: We collected 34 frequently asked questions about glucocorticoid-induced osteoporosis (GIOP), covering the disease's clinical manifestations, pathogenesis, diagnosis, treatment, prevention, and risk factors. We also generated 25 questions based on the 2022 American College of Rheumatology Guideline for the Prevention and Treatment of Glucocorticoid-Induced Osteoporosis (2022 ACR-GIOP Guideline). Each question was posed to each LLM (ChatGPT-3.5, ChatGPT-4, and Google Gemini), and three senior orthopedic surgeons independently rated each response on a scale of 1 to 4 points. A total score (TS) > 9 indicated a 'good' response, 6 ≤ TS ≤ 9 a 'moderate' response, and TS < 6 a 'poor' response.

    Results: For both the general GIOP questions and the 2022 ACR-GIOP Guideline questions, Google Gemini provided more concise answers than the other LLMs. On pathogenesis questions, ChatGPT-4 had significantly higher total scores (TSs) than ChatGPT-3.5. On the 2022 ACR-GIOP Guideline questions, ChatGPT-4's TSs were significantly higher than Google Gemini's. For ChatGPT-3.5 and ChatGPT-4, TSs after self-correction were significantly higher than before self-correction, whereas Google Gemini's self-corrected responses did not differ significantly from its original ones.

    Conclusions: Our study showed that Google Gemini provides more concise and intuitive responses than ChatGPT-3.5 and ChatGPT-4. ChatGPT-4 performed significantly better than ChatGPT-3.5 and Google Gemini in answering general questions about GIOP and questions on the 2022 ACR-GIOP Guideline. ChatGPT-3.5 and ChatGPT-4 self-corrected better than Google Gemini.

    Keywords: AI; ChatGPT; Glucocorticoid-Induced osteoporosis; Google Gemini; Large language models.
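
The grading rule in the Methods can be illustrated with a short sketch. It assumes the total score (TS) is the sum of the three surgeons' ratings, which the abstract does not state explicitly but which is consistent with the 1-to-4 scale and the thresholds given. The function names `total_score` and `classify` are hypothetical and are not taken from the study.

```python
from typing import Sequence


def total_score(ratings: Sequence[int]) -> int:
    """Combine three rater scores (each 1-4) into a total score.

    Assumption: TS is the plain sum of the three ratings (range 3-12).
    """
    if len(ratings) != 3:
        raise ValueError("expected exactly three ratings")
    for r in ratings:
        if not 1 <= r <= 4:
            raise ValueError(f"rating out of range 1-4: {r}")
    return sum(ratings)


def classify(ts: int) -> str:
    """Map a total score to the category used in the abstract."""
    if ts > 9:
        return "good"
    if ts >= 6:  # 6 <= TS <= 9
        return "moderate"
    return "poor"  # TS < 6


if __name__ == "__main__":
    ts = total_score([4, 3, 3])
    print(ts, classify(ts))
```

Under this assumption, a response needs at least two ratings of 4 and one of 3, or better, to fall into the 'good' category.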

    Copyright © Journal of orthopaedic surgery and research.

    Related content

    Journal: Journal of orthopaedic surgery and research

    Abbreviation: J ORTHOP SURG RES

    ISSN: 1749-799X

    Impact factor/quartile: 2.8/Q2
