Application of machine learning in tacrolimus dose prediction for kidney transplant recipients

闵建亮; 陈国栋

doi:10.12464/j.issn.1674-7445.2025215

Abstract:

Objective To evaluate the predictive value of two machine learning models for the initial and subsequent doses of tacrolimus in kidney transplant recipients.

Methods The medical records of 1,013 Chinese kidney transplant recipients treated at the First Affiliated Hospital of Sun Yat-sen University between January 2015 and April 2019 were retrospectively analyzed, focusing on the initial and subsequent doses in adult recipients. Thirty-three variables were collected per recipient for the initial dose and 26 for the subsequent dose. A genetic algorithm combined with a random-restart hill-climbing algorithm was used to identify a small set of key clinical variables by majority voting, and variables whose Lasso regression coefficients fell below the optimal coefficient threshold were then removed. The selected variables, in the form of structured tabular data, were fed into a cascaded deep forest (CDF) and a TabNet deep neural network for analysis and comparison, and the models were validated with the leave-one-subject-out method.

Results A total of 613 recipients were included in the training set and 116 in the external validation set. The clinical variables retained in the initial-dose algorithm were target concentration, time from surgery to target concentration, body weight, gender, type of surgery, time from surgery to first dose, WuZhi capsule, calcium channel blocker, creatinine, hemoglobin and CYP3A5. Those retained in the subsequent-dose algorithm were target concentration, time from surgery to target concentration, WuZhi capsule, creatinine, alanine aminotransferase, aspartate aminotransferase, previous dose, the concentration corresponding to the previous dose and time from surgery to the previous concentration. With these variables, the TabNet model outperformed the CDF model: for the initial dose, the accuracy (the proportion of predicted doses within ±20% of the actual dose) was 0.801 and the goodness-of-fit index R² was 0.436; for the subsequent dose, the corresponding accuracy and R² were 0.939 and 0.902, respectively. Feature-contribution analysis showed that CYP3A5 and target concentration contributed most to the prediction of the initial dose, whereas the previous dose and its corresponding concentration contributed most to the prediction of the subsequent dose. Independent external validation also showed good performance.

Conclusions The tuned TabNet predictive model may provide an important reference for machine learning-based drug dose prediction in clinical practice.
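
Two steps named in the Methods, majority voting over candidate variable sets and leave-one-subject-out validation, can be illustrated with a minimal Python sketch. The abstract does not state the actual voting rule or how records were grouped, so everything here is an assumption for illustration: each search run is taken to return a list of variable names, a variable is kept when more than half of the runs selected it, and each record carries a recipient ID so that all records of one recipient are held out together. The function names, variable names and recipient IDs are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(selections):
    """Keep variables chosen in more than half of the selection runs.

    Assumption: `selections` is a list of runs, each run a list of
    variable names proposed by one search (e.g. one GA or hill-climbing run).
    """
    counts = Counter(var for run in selections for var in set(run))
    return sorted(var for var, n in counts.items() if n > len(selections) / 2)


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_rows, test_rows), once per subject.

    Assumption: `subject_ids[i]` is the recipient ID of record i, so every
    record of the held-out recipient lands in the test rows.
    """
    rows_by_subject = defaultdict(list)
    for row, subject in enumerate(subject_ids):
        rows_by_subject[subject].append(row)
    for subject, test_rows in rows_by_subject.items():
        held_out = set(test_rows)
        train_rows = [r for r in range(len(subject_ids)) if r not in held_out]
        yield subject, train_rows, test_rows


# Hypothetical selection runs and recipient IDs, for illustration only.
runs = [
    ["CYP3A5", "body_weight", "creatinine"],
    ["CYP3A5", "body_weight"],
    ["CYP3A5", "hemoglobin"],
]
print(majority_vote(runs))

record_subjects = ["R01", "R01", "R02", "R03", "R03", "R03"]
for subject, train_rows, test_rows in leave_one_subject_out(record_subjects):
    print(subject, train_rows, test_rows)
```

Holding out whole recipients rather than single records keeps repeated measurements from the same person out of both the training and test rows at once, which matters for the subsequent-dose setting where one recipient can contribute several records.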

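The two performance measures reported in the Results can be written out explicitly. The sketch below assumes that "accuracy" means the fraction of predicted doses whose error is within ±20% of the actual dose (bounds inclusive, error taken relative to the actual dose) and that R² is the usual coefficient of determination, R² = 1 − SS_res / SS_tot. The dose values are made up for illustration and are not study data.

```python
def within_tolerance_accuracy(actual, predicted, tolerance=0.20):
    """Fraction of predictions whose absolute error is at most
    `tolerance` times the actual dose (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(abs(p - a) <= tolerance * abs(a) for a, p in zip(actual, predicted))
    return hits / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot


# Made-up daily doses in mg, for illustration only.
actual_doses = [4.0, 5.0, 6.0, 3.0, 8.0]
predicted_doses = [4.5, 5.5, 4.0, 3.2, 8.5]

# 4 of the 5 predictions fall within +/-20% of the actual dose.
print(within_tolerance_accuracy(actual_doses, predicted_doses))
print(r_squared(actual_doses, predicted_doses))
```

The two measures answer different questions: the ±20% accuracy counts clinically acceptable predictions one by one, while R² summarizes how much of the dose variance the model explains, so a model can score well on one and modestly on the other.
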
Application of machine learning in tacrolimus dose prediction for kidney transplant recipients