
BMC Medical Informatics and Decision Making. 2025 Jul 1;25(1):234. doi: 10.1186/s12911-025-03059-8. Quartile Q2; IF 3.8 (2025)

Development of an ensemble prediction model for acute graft-versus-host disease in allogeneic transplantation based on machine learning


Lin Song  1, Xingwei Wu  2, Mengjia Xu  1  3, Ling Xue  1, Xun Yu  1, Zongqi Cheng  1, Chenrong Huang  4, Liyan Miao  5  6  7

Author affiliations

  • 1 Department of Pharmacy, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China.
  • 2 Department of Pharmacy, Personalized Drug Therapy Key Laboratory of Sichuan Province, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, China.
  • 3 College of Pharmaceutical Sciences, Soochow University, Suzhou, 215006, China.
  • 4 Department of Pharmacy, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China. huangchenrong@suda.edu.cn.
  • 5 Department of Pharmacy, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China. miaolysuzhou@163.com.
  • 6 Institute for Interdisciplinary Drug Research and Translational Sciences, Soochow University, Suzhou, 215006, China. miaolysuzhou@163.com.
  • 7 College of Pharmaceutical Sciences, Soochow University, Suzhou, 215006, China. miaolysuzhou@163.com.
  • DOI: 10.1186/s12911-025-03059-8 PMID: 40597996

    Abstract

    Background: Acute graft-versus-host disease (aGVHD) is a major post-transplantation complication and one of the most significant causes of non-relapse-related death. However, the volume and complexity of clinical data make aGVHD difficult to predict. Machine learning (ML), a branch of artificial intelligence, has been introduced in medicine because it can process complex, high-dimensional variables quickly and capture nonlinear relationships. However, previous ML models did not consider the effect of immunosuppressant exposure. The purpose of this study was therefore to develop and optimize models, using Cox regression and machine learning algorithms, that predict the risk of aGVHD with cyclosporin A exposure and common clinical factors included as variables.

    Methods: The data were first preprocessed and then randomly split into a training set and a validation set at an 8:2 ratio. A Cox regression model was constructed on the training set. In parallel, correlation analysis and recursive feature elimination were used for feature screening before machine learning model development. Fifteen algorithms were then used to establish models, and an ensemble model was built by soft voting over the five best-performing algorithms. Area under the curve (AUC) was the main metric used to evaluate model performance in the validation set, while a nomogram and SHAP were applied to interpret the variables.
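The soft-voting and AUC steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names `soft_vote` and `auc`, the three toy models, and all numbers are invented for illustration, and the abstract does not say how the five base models were weighted, so equal weights are assumed here.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Soft voting: average each patient's predicted probability across models.

    Equal model weights are an assumption; the abstract does not specify them.
    """
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    if any(len(p) != n_samples for p in prob_lists):
        raise ValueError("all models must score the same patients")
    return [sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation data, invented for illustration (1 = grade II-IV aGVHD).
labels = [1, 0, 1, 0, 0]
model_probs = [
    [0.9, 0.2, 0.4, 0.5, 0.1],  # hypothetical model A
    [0.7, 0.4, 0.6, 0.3, 0.2],  # hypothetical model B
    [0.8, 0.3, 0.5, 0.6, 0.3],  # hypothetical model C
]

ensemble = soft_vote(model_probs)
print("model A AUC:", auc(labels, model_probs[0]))
print("ensemble AUC:", auc(labels, ensemble))
```

In this toy case, averaging the probabilities corrects a ranking error that model A makes on its own. That is the general motivation for soft voting, although it says nothing about the size of the gain on the study's real data.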

    Results: A total of 479 patients and 47 variables were included in the study. The incidence of grade II-IV aGVHD was 33.61%. The AUC of the Cox regression model in the validation set was 0.625. In contrast, the new ensemble model had better predictive ability (AUC = 0.776, accuracy = 0.729, precision = 0.667, recall = 0.375, F1-score = 0.480). In addition to variables identified in previous studies, some rarely reported risk factors were found, such as quinolone, blood urea nitrogen and alkaline phosphatase.
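As a quick consistency check on the reported metrics, the F1-score can be recomputed from the stated precision and recall, since F1 is their harmonic mean. The helper name `f1_score` below is chosen for this sketch; only the two input values come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the ensemble model.
f1 = f1_score(0.667, 0.375)
print(f"F1 = {f1:.3f}")
```

The result rounds to 0.480, which matches the reported F1-score. The low recall (0.375) relative to precision (0.667) is what pulls F1 well below the accuracy figure.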

    Conclusions: A new ensemble model with promising accuracy was established to predict grade II-IV classic aGVHD in allo-HSCT patients. It will help identify high-risk patients at an early stage and thus reduce the incidence of aGVHD.

    Clinical trial number: Not applicable.

    Keywords: Acute graft-versus-host disease; Allogeneic haematopoietic stem cell transplantation; Ensemble model; Machine learning; Prediction model.


    Copyright © BMC Medical Informatics and Decision Making.

    Related content

    Journal: BMC Medical Informatics and Decision Making

    Abbreviation: BMC MED INFORM DECIS

    ISSN: N/A

    e-ISSN: 1472-6947

    IF/Quartile: 3.8/Q2
