Journal of Shandong University (Health Sciences) ›› 2021, Vol. 59 ›› Issue (1): 64-71. doi: 10.6040/j.issn.1671-7554.0.2020.1033

• Clinical Medicine •

Predicting colon cancer prognosis genes and clinical application value based on TCGA database

ZHEN Qiulai1,2, LYU Xinran3, YE Hui1, DING Xuchao3, CHAI Xiaoxue1, HU Xin1, ZHOU Ming1, CAO Lili1,3

  1. Department of Oncology, Shandong Qianfoshan Hospital, Cheeloo College of Medicine, Shandong University, Shandong Provincial Key Laboratory for Rheumatic Disease and Translational Medicine, Jinan 250014, Shandong, China;
    2. Department of Blood Transfusion, Central Hospital of Zibo Mining Group Co., Ltd., Zibo 255120, Shandong, China;
    3. Department of Oncology, The First Affiliated Hospital of Shandong First Medical University, Jinan 250014, Shandong, China
  • Published: 2021-01-09
  • Corresponding author: CAO Lili. E-mail: cll@sdu.edu.cn
  • Funding:
    Shandong Provincial Key Research and Development Program (2019GSF108180); Jinan Science and Technology Development Program (201907119); Shandong Provincial Traditional Chinese Medicine Science and Technology Development Program (2019-0378); Cultivation Fund of The First Affiliated Hospital of Shandong First Medical University (QYPY2019NSFC1015)

Abstract: Objective To screen prognostic genes, stratify colon cancer patients into high and low mortality risk, and predict their prognosis by mining colon cancer data from The Cancer Genome Atlas (TCGA) database with bioinformatics methods. Methods RNA expression data and clinical information of colon cancer patients were downloaded from TCGA. Univariate and multivariate Cox regression analyses were used to construct a proportional hazards regression model and derive a risk score formula. Patients were divided into high-risk and low-risk groups at the median risk score to identify their mortality risk. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to assess the performance of the model. Survival analysis of the prognosis-related genes was performed in R, and GO function and KEGG pathway enrichment analyses were performed on the differentially expressed genes. Results Of the 5,544 differentially expressed genes in colon cancer, 27 were associated with overall survival. Eleven of these (GABRD, FAM132B, LRRN4, RP11-400N13.2, RP11-108K3.2, RNU6-403P, RP11-429J17.8, LINC01296, RP11-190J1.3, AC002076.10 and CTC-573N18.1) were selected to construct the Cox prognostic model. The 5-year survival rate was 39.5% (95% CI: 29.5-53.0) in the high-risk group and 89.6% (95% CI: 82.2-97.7) in the low-risk group, and ROC analysis gave an AUC of 0.827, indicating that the model distinguishes high-risk from low-risk colon cancer patients well. Conclusion The risk score obtained from the genes in the Cox proportional hazards model, combined with clinical information, can be used to evaluate the prognosis and survival time of patients with colon cancer.

Key words: TCGA database, Colon cancer, RNA, Cox proportional hazards model, Survival time

CLC number: 

  • R574.62
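
The scoring workflow described in the Methods (a Cox-model risk score, a split at the median score, and an AUC check) can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract states that the analysis was done in R and does not report the fitted coefficients, so the coefficient values, the three-gene subset, the patient data, and all function names below are hypothetical placeholders. The AUC here also treats death as a simple binary outcome and ignores censoring, which a time-dependent ROC analysis would account for.

```python
from statistics import median

# Hypothetical coefficients: the abstract does not report the fitted Cox
# coefficients, so these values are placeholders for illustration only.
COEFS = {"GABRD": 0.5, "LRRN4": 0.25, "LINC01296": -0.5}


def risk_score(expression, coefs=COEFS):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(coefs[g] * expression[g] for g in coefs)


def split_by_median(scores):
    """Label each score 'high' if above the median score, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, died):
    """AUC as the probability that a patient who died outranks a survivor.

    Ties count as half. The outcome is treated as binary and censoring
    is ignored, unlike a time-dependent ROC analysis.
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy expression values for four made-up patients (not TCGA data).
patients = [
    {"GABRD": 2.0, "LRRN4": 4.0, "LINC01296": 1.0},
    {"GABRD": 0.0, "LRRN4": 0.0, "LINC01296": 2.0},
    {"GABRD": 4.0, "LRRN4": 2.0, "LINC01296": 0.0},
    {"GABRD": 1.0, "LRRN4": 0.0, "LINC01296": 3.0},
]
died = [True, False, True, False]

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
model_auc = auc(scores, died)
```

With real data, the coefficients would come from the fitted multivariate Cox model, and the two groups would then be compared with a survival analysis such as Kaplan-Meier curves.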