Journal of Shandong University (Health Sciences), 2023, Vol. 61, Issue (4): 86-94. doi: 10.6040/j.issn.1671-7554.0.2022.0927

• Public Health and Management •

A lung cancer risk prediction model based on Bayesian network uncertainty inference

ZHONG Lu1,2, XUE Fuzhong1,2

  1. Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250012, Shandong, China;
  2. Institute for Medical Dataology, Shandong University, Jinan 250002, Shandong, China
  • Published: 2023-04-11
  • Corresponding author: XUE Fuzhong. E-mail: xuefzh@sdu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2020YFC2003500); Key Research and Development Program of Shandong Province (Science and Technology Demonstration Project) (2021SFGC0504)

Abstract: Objective To predict the risk of incident lung cancer in individuals with missing covariates by combining a Bayesian network with a Cox model. Methods Data were obtained from the UK Biobank. Predictors associated with lung cancer incidence were screened using univariate Cox regression. Based on the potential predictors identified, the combined model was used to build an individualized lung cancer risk prediction model, and its predictive performance was evaluated in terms of discrimination and calibration. Results The prediction model showed good discrimination and calibration; the AUCs in the training and validation cohorts were 0.854 (95%CI: 0.836-0.870) and 0.885 (95%CI: 0.871-0.897), respectively. Conclusion A lung cancer risk prediction model based on a Bayesian network and a Cox model was constructed. The model has good discrimination and calibration and can effectively identify individuals at high risk of developing lung cancer. The combined model offers an effective approach to risk prediction when predictors are missing and can provide a theoretical basis for the prevention and control of lung cancer.

Key words: Lung cancer, Risk prediction model, Bayesian network, Cox model, Missing data
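
The abstract does not spell out how the Bayesian network and the Cox model are coupled. One plausible reading, sketched below purely for illustration, is that the network supplies a probability distribution over the unobserved covariates given the observed ones, and the Cox-based risk is then averaged over that distribution. The two-node network (`smoker` → `copd`), every probability, the log hazard ratios in `BETA`, and `BASELINE_SURVIVAL` are hypothetical values chosen for this example; none of them come from the paper or from the UK Biobank.

```python
import math

# Hypothetical toy network: smoker -> copd (both binary).
# All numbers below are illustrative assumptions, not estimates from the paper.
P_SMOKER = 0.2
P_COPD_GIVEN_SMOKER = {1: 0.30, 0: 0.05}

# Hypothetical Cox log hazard ratios and baseline survival at the prediction horizon.
BETA = {"smoker": 1.5, "copd": 0.7}
BASELINE_SURVIVAL = 0.99


def cox_risk(smoker, copd):
    """Cox-type absolute risk: 1 - S0(t) ** exp(linear predictor)."""
    lp = BETA["smoker"] * smoker + BETA["copd"] * copd
    return 1.0 - BASELINE_SURVIVAL ** math.exp(lp)


def joint(smoker, copd):
    """Joint probability P(smoker, copd) factorised along the network."""
    p_s = P_SMOKER if smoker == 1 else 1.0 - P_SMOKER
    p_c = P_COPD_GIVEN_SMOKER[smoker]
    return p_s * (p_c if copd == 1 else 1.0 - p_c)


def expected_risk(smoker=None, copd=None):
    """Average the Cox risk over covariates left as None (missing).

    Exact inference by enumeration: keep the states consistent with the
    observed evidence, renormalise, and weight each state's risk.
    """
    states = [
        (s, c)
        for s in (0, 1)
        for c in (0, 1)
        if smoker in (None, s) and copd in (None, c)
    ]
    norm = sum(joint(s, c) for s, c in states)
    return sum(joint(s, c) / norm * cox_risk(s, c) for s, c in states)


if __name__ == "__main__":
    print("fully observed:", expected_risk(smoker=1, copd=1))
    print("smoking missing:", expected_risk(copd=1))
    print("both missing:", expected_risk())
```

When every covariate is observed, `expected_risk` reduces to the ordinary Cox prediction; when one is missing, the result is a weighted average of the risks for its possible values. Enumeration is only practical for very small networks, so a real application would rely on a dedicated inference engine.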

CLC Number: 

  • R730.1