Journal of Shandong University (Health Sciences) ›› 2020, Vol. 58 ›› Issue (1): 20-25. doi: 10.6040/j.issn.1671-7554.0.2019.1009

• Clinical Medicine •

Value of blood routine data in distinguishing myelodysplastic syndrome and acute myeloid leukemia

DAI Xiaoyu1,2, LU Yuan1,2, WANG Zhiheng1,2, LI Mingzhuo1,2, SI Shucheng1,2, LI Jiqing1,2, JING Ming2, XUE Fuzhong1,2

  1. Department of Biostatistics, School of Public Health, Shandong University, Jinan 250012, Shandong, China;
  2. Institute for Medical Dataology, Shandong University, Jinan 250002, Shandong, China
  • Published: 2022-09-27
  • Corresponding author: XUE Fuzhong. E-mail: xuefzh@sdu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (81170352)

Abstract: Objective To build a model that discriminates between myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML) using routine blood test data. Methods Data on 1,681 patients were obtained from the Shandong Multi-Center Health Medical Big Data Platform. Patients were randomly divided into a training set (70%) and a test set (30%). A random forest model was used to discriminate MDS from AML. Discriminatory ability was measured by the area under the receiver operating characteristic curve (AUC), and model stability was assessed with ten-fold cross-validation. Results Both the random forest model and the support vector machine model could discriminate MDS from AML, but the random forest model performed better. For males, the AUC was 0.874 (95%CI: 0.815-0.932), with a sensitivity of 81.1% and a specificity of 81.9%; for females, the AUC was 0.831 (95%CI: 0.752-0.911), with a sensitivity of 77.8% and a specificity of 74.3%. In ten-fold cross-validation, the AUC was 0.884 (95%CI: 0.854-0.913) for males and 0.842 (95%CI: 0.802-0.883) for females. Conclusion The random forest model built on routine blood test data shows good discriminatory ability between patients with MDS and patients with AML.

Key words: Blood routine, Random forest, Myelodysplastic syndrome, Acute myeloid leukemia, ROC curve
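
The evaluation quantities named in the abstract (a 70%/30% split, ten-fold partitioning, AUC, sensitivity and specificity) can be illustrated with a minimal Python sketch. The abstract does not state which software the authors used, so this is an illustration and not their implementation. All function names are hypothetical, the labels and scores are toy values rather than study data, and the classifier itself (random forest or support vector machine) is left out. The pooled count of 1,681 is used only to show the split arithmetic; the study reports separate male and female models.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case scores above a negative one.

    Equivalent to the Mann-Whitney U statistic; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def train_test_split(n, train_frac=0.7, seed=0):
    """Shuffle the indices 0..n-1 and cut them into training and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * train_frac))
    return idx[:cut], idx[cut:]


def k_folds(indices, k=10):
    """Partition indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


# Toy example: 1 = one diagnosis, 0 = the other (assumed coding, not from the paper).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, 0.5)

# Split arithmetic for 1,681 records, then ten folds over the training part.
train_idx, test_idx = train_test_split(1681)
folds = k_folds(train_idx)
```

In a ten-fold procedure, each fold would serve once as the held-out set while a model is fitted on the other nine, and the held-out predictions would be scored with `auc`. The pairwise AUC loop is quadratic in the sample size, which is acceptable for a few thousand records but would be replaced by a rank-based computation for larger data.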

CLC Number: 

  • R551.3