
Healthcare (Basel, Switzerland). 2025 May 27;13(11):1271. doi: 10.3390/healthcare13111271

AI Chatbots in Pediatric Orthopedics: How Accurate Are Their Answers to Parents' Questions on Bowlegs and Knock Knees?


Ahmed Hassan Kamal  1

Author affiliation

  • 1 Division of Orthopedics, Department of Surgery, College of Medicine, King Faisal University, Al-Ahsa 31982, Saudi Arabia.
  • DOI: 10.3390/healthcare13111271 PMID: 40508883

    Abstract

    Background/objectives: Large language models facilitate instantaneous access to health information. However, they do not all provide the same level of accuracy or detail. In pediatric orthopedics, where parents have urgent concerns regarding knee deformities (bowlegs and knock knees), the accuracy and dependability of these chatbots can affect parents' decisions to seek treatment. The goal of this study was to analyze how AI chatbots addressed parental concerns regarding pediatric knee deformities.

    Methods: A set of twenty standardized questions, ten each on bowlegs and knock knees, was designed through literature reviews, analysis of parental discussion forums, and expert consultations. Each of the three chatbots (ChatGPT, Gemini, and Copilot) was asked the same set of questions. Five pediatric orthopedic surgeons were then asked to rate each response for accuracy, clarity, and comprehensiveness, along with the degree of misleading information provided, on a scale of 1-5. The reliability among raters was calculated using intraclass correlation coefficients (ICCs), while differences among the chatbots were assessed using a Kruskal-Wallis test with post hoc pairwise comparisons.
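The two statistics named in the Methods can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not say which ICC form or software was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), and the function names `icc2_1`, `kruskal_wallis_3`, and `average_ranks` are hypothetical. The post hoc pairwise comparisons are not shown.

```python
import math
from collections import Counter


def icc2_1(ratings):
    """ICC(2,1) for a table with one row per rated response and one column per rater.

    Assumption: the study's ICC form is not stated; ICC(2,1) is one common choice.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1  # mean of 1-based ranks i+1..j+1
        i = j + 1
    return ranks


def kruskal_wallis_3(groups):
    """Tie-corrected Kruskal-Wallis H and chi-square p-value for exactly three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (df = 2)")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)  # chi-square survival function for df = 2
```

With three chatbots the test has two degrees of freedom, which is why the p-value reduces to `exp(-H / 2)`; a general implementation would need the full chi-square distribution, for example from a statistics library.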

    Results: Inter-rater reliability was moderate to good for all three chatbots. ChatGPT and Gemini scored higher than Copilot for accuracy and comprehensiveness (p < 0.05). However, no notable differences were found in clarity or in the likelihood of giving incorrect answers. Overall, ChatGPT and Gemini gave more detailed and precise responses, while Copilot was comparably clear but less thorough.

    Conclusions: The AI chatbots differed notably in how well they provided pediatric orthopedic information, while showing signs of evolving potential. In comparison to Copilot, ChatGPT and Gemini were relatively more accurate and comprehensive. These results highlight the continuing need for real-time supervision and stringent validation when chatbots are used in pediatric healthcare.

    Keywords: AI chatbots; health information accuracy; knee deformities; parental concerns.

    Keywords: AI chatbots; pediatric orthopedics; parents' questions; bowlegs; knock knees

    Copyright © Healthcare (Basel, Switzerland).

    Journal information

    Journal: Healthcare

    ISSN: N/A

    e-ISSN: 2227-9032

    IF/Quartile: 2.7/Q2
