Chinese Journal of Stroke, 2023, Vol. 18, Issue 05: 547-555. DOI: 10.3969/j.issn.1673-5765.2023.05.009

Study on Predictive Model of Cerebral Hemorrhage during Hospitalization in Patients with Acute Ischemic Stroke Treated with rt-PA Intravenous Thrombolysis

  

  • Received: 2023-02-27 Online: 2023-05-20 Published: 2023-05-20

急性缺血性卒中患者行阿替普酶静脉溶栓治疗住院期间脑出血预测模型研究

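Chen Hui (陈慧), Chen Siding (陈思玎), Zhu Zhikai (朱之恺), Yu Weiran (俞蔚然), Jiang Yong (姜勇), Wang Yongjun (王拥军)

  1. Beijing Tiantan Hospital, Capital Medical University; China National Clinical Research Center for Neurological Diseases, Beijing 100070, China
  2. Beijing Advanced Innovation Center for Big Data-Based Precision Medicine (Beihang University & Capital Medical University)
  • Corresponding authors: Jiang Yong (姜勇) jiangyong@ncrcnd.org.cn; Wang Yongjun (王拥军) yongjunwang111@aliyun.com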

Abstract: Objective  To use machine learning algorithms to predict cerebral hemorrhage during hospitalization in patients with acute ischemic stroke (AIS) or transient ischemic attack (TIA) treated with intravenous rt-PA thrombolysis, and to explore the risk factors for cerebral hemorrhage after rt-PA thrombolysis.
Methods  The study included 74 654 patients registered in the Chinese Stroke Center Alliance (CSCA) from January 2016 to December 2020 who were initially diagnosed with AIS or TIA and received intravenous rt-PA thrombolysis. Their mean age was (65.55±12.14) years; 48 493 (64.96%) were male, and 2038 (2.73%) had a cerebral hemorrhage during hospitalization. The data were split by registration year: patients registered from 2016 to 2019 formed the training set, and patients registered in 2020 formed the test set. Positive and negative samples in the training set were balanced to a ratio of 77∶100 using a prototype-selection down-sampling technique. Five models were built to predict cerebral hemorrhage: logistic regression, extreme gradient boosting (XGBoost), random forest, gradient boosting decision tree (GBDT), and categorical boosting (CatBoost). Model performance was evaluated and compared using AUC, sensitivity, specificity, Brier score, and other indicators, and SHAP plots were used to interpret the features selected by the machine learning models.
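The year-based split and the 77∶100 down-sampling described in the Methods can be sketched as follows. This is a minimal illustration on synthetic records, not the authors' pipeline: the field names `year` and `hemorrhage`, the helper function names, and the 3% event rate are assumptions made for the example, and plain random under-sampling stands in for the prototype-selection method, which the abstract does not specify further.

```python
import random
from collections import Counter


def temporal_split(records, test_year=2020):
    """Records registered before test_year go to training; test_year goes to test."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def downsample_negatives(train, pos_per_100_neg=77, seed=0):
    """Randomly drop negatives so that positives:negatives is about 77:100.

    Assumption: simple random under-sampling, used here as a stand-in for
    the prototype-selection technique named in the abstract.
    """
    rng = random.Random(seed)
    pos = [r for r in train if r["hemorrhage"] == 1]
    neg = [r for r in train if r["hemorrhage"] == 0]
    n_neg_keep = min(len(neg), round(len(pos) * 100 / pos_per_100_neg))
    balanced = pos + rng.sample(neg, n_neg_keep)
    rng.shuffle(balanced)
    return balanced


# Synthetic demo data (hypothetical): about 3% of records are positive.
rng = random.Random(42)
records = [
    {"year": rng.choice([2016, 2017, 2018, 2019, 2020]),
     "hemorrhage": 1 if rng.random() < 0.03 else 0}
    for _ in range(20000)
]

train, test = temporal_split(records)
balanced = downsample_negatives(train)
counts = Counter(r["hemorrhage"] for r in balanced)
print("positives:", counts[1], "negatives:", counts[0])
print("positive/negative ratio:", round(counts[1] / counts[0], 3))
```

In this sketch only the training set is down-sampled, as the abstract describes; the test set keeps its natural class imbalance, so metrics computed on it reflect the real event rate.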
Results  For XGBoost, GBDT, CatBoost, logistic regression, and random forest, respectively, the AUC values were 0.770 (95% CI 0.745-0.774), 0.766 (95% CI 0.753-0.786), 0.765 (95% CI 0.752-0.766), 0.758 (95% CI 0.747-0.761), and 0.757 (95% CI 0.739-0.759); the sensitivities were 0.624 (95% CI 0.574-0.672), 0.606 (95% CI 0.555-0.655), 0.570 (95% CI 0.519-0.620), 0.557 (95% CI 0.506-0.607), and 0.585 (95% CI 0.534-0.635); the specificities were 0.780 (95% CI 0.773-0.786), 0.785 (95% CI 0.778-0.791), 0.790 (95% CI 0.783-0.796), 0.805 (95% CI 0.799-0.811), and 0.769 (95% CI 0.762-0.776); and the Brier scores were 0.157, 0.154, 0.156, 0.160, and 0.161. The SHAP plots indicated that a high in-hospital NIHSS score, older age, a high fasting blood glucose level, a history of atrial fibrillation, a low platelet count, a longer time from onset to thrombolysis, a low BMI, and a high NIHSS score at presentation were risk factors for cerebral hemorrhage during hospitalization after rt-PA thrombolysis.
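For readers less familiar with the four metrics reported in the Results, the sketch below computes them from scratch on a small toy example. The labels, predicted probabilities, and the 0.5 classification threshold are assumptions chosen for illustration; the abstract does not state the threshold the authors used, and the numbers here are unrelated to the study's results.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half. This pairwise loop is quadratic and only meant
    for small examples.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 patients with hemorrhage, 5 without.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

tp, fp, tn, fn = confusion_counts(y_true, y_prob)
sens = tp / (tp + fn)   # share of hemorrhage cases flagged by the model
spec = tn / (tn + fp)   # share of non-hemorrhage cases correctly cleared
print("sensitivity:", sens)
print("specificity:", spec)
print("Brier score:", brier_score(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC and the Brier score are computed from the predicted probabilities directly, which is one reason the two groups of metrics can rank models differently.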
Conclusions  The machine learning-based prediction models showed some ability to predict cerebral hemorrhage during hospitalization in AIS patients treated with intravenous rt-PA thrombolysis. The study offers an exploratory reference for future applications of machine learning to cerebral hemorrhage prediction.

Key words: Acute ischemic stroke; rt-PA; Thrombolysis; Cerebral hemorrhage; Prediction model; Machine learning
