
Journal of Shandong University (Health Sciences), 2025, Vol. 63, Issue (8): 61-68. doi: 10.6040/j.issn.1671-7554.0.2025.0152

• Clinical Research •

Causality extraction algorithm of medical text based on BERT and graph attention network

LIU Weilong1, WANG Ding2, ZHAO Chao3, WANG Ning2, ZHANG Xu1, SU Ping2, SONG Shudian2, ZHANG Na2, CHI Weiwei2

  1. School of Management Science and Engineering, Shandong University of Finance and Economics, Jinan 250014, Shandong, China;
  2. National Administration of Health Data, Jinan 250002, Shandong, China;
  3. Weifang Municipal Health Commission, Weifang 261071, Shandong, China
  • Published: 2025-08-25
  • Corresponding author: CHI Weiwei. E-mail: nahdyw@shandong.cn
  • Supported by:
    Weifang Demonstration Project for Public Hospital Reform and High-Quality Development Supported by Central Government Finance (ZFCG-2024-0000505)

Abstract: Objective To propose an algorithm that effectively extracts causal relationships and thereby improves the accuracy of medical text processing. Methods A BERT-CGAT algorithm was proposed that combines bidirectional encoder representations from Transformers (BERT) with causal graph attention networks (CGAT). First, a causal relationship graph was constructed, and the BERT model was fine-tuned on medical texts to obtain optimized entity embeddings. A knowledge fusion channel then integrated the textual encoding information with the causal structure, and the fused representation was fed into the graph attention network. A multi-head attention mechanism processed information from different subspaces in parallel, enhancing the ability to capture complex semantic relationships. Finally, a dual-channel decoding layer extracted entities and their causal relationships simultaneously. Results Experiments on a self-built diabetes causal entity dataset showed that the model reached an accuracy of 99.74% and a recall of 81.04%, improvements of 0.65% and 16.73%, respectively, over the traditional BiLSTM-CRF baseline, with an F1 score of 80.83%. Conclusion By combining BERT's semantic feature extraction capability with the relational modeling advantages of graph neural networks, the BERT-CGAT algorithm effectively improves the accuracy of causal relationship extraction from medical texts, which validates the proposed method.

Key words: Medical text, BERT model, Graph attention network, Causality extraction

CLC Number:

  • TP391.1
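
The abstract describes feeding entity embeddings and a causal graph into a graph attention network with multi-head attention. The following is a minimal sketch of a generic multi-head graph attention layer in pure Python, not the authors' CGAT implementation. Everything in it is an assumption made for illustration: the three entity names, the toy causal edges, the dimensions, and the random vectors that stand in for fine-tuned BERT embeddings. The knowledge fusion channel and the dual-channel decoding layer are not reproduced.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def linear(W, h):
    """Multiply a weight matrix W (one row per output dim) by a vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_head(features, adjacency, W, a):
    """One attention head: returns (new node features, attention matrix)."""
    n = len(features)
    z = [linear(W, h) for h in features]  # projected node features
    new_features, attention = [], []
    for i in range(n):
        # Node i attends to itself and to every j with adjacency[i][j] == 1.
        neighbours = [j for j in range(n) if j == i or adjacency[i][j]]
        # Unnormalised score: LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in neighbours]
        # Softmax over the neighbourhood only (masked attention).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [0.0] * n
        for j, e in zip(neighbours, exps):
            row[j] = e / total
        attention.append(row)
        # Attention-weighted sum of the neighbours' projected features.
        new_features.append([sum(row[j] * z[j][k] for j in neighbours)
                             for k in range(len(W))])
    return new_features, attention


def multi_head_gat(features, adjacency, heads):
    """Run every head independently and concatenate their outputs per node."""
    per_head = [gat_head(features, adjacency, W, a) for W, a in heads]
    fused = [[v for out, _ in per_head for v in out[i]]
             for i in range(len(features))]
    return fused, [att for _, att in per_head]


# Toy setup (hypothetical, not taken from the paper's dataset).
rng = random.Random(0)
IN_DIM, OUT_DIM, NUM_HEADS = 4, 3, 2
entities = ["obesity", "insulin resistance", "type 2 diabetes"]
# adjacency[i][j] == 1 means entity j is labelled as a cause of entity i,
# so information flows from causes to effects.
adjacency = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
# Random vectors standing in for BERT entity embeddings.
features = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in entities]
# Each head has its own projection W and attention vector a.
heads = [
    ([[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)],
     [rng.uniform(-1, 1) for _ in range(2 * OUT_DIM)])
    for _ in range(NUM_HEADS)
]

node_repr, attention = multi_head_gat(features, adjacency, heads)
```

Each node ends up with a vector of length `NUM_HEADS * OUT_DIM`, and each row of an attention matrix is a probability distribution over that node's labelled causes plus itself. Concatenating heads is one common way to let different heads weight the same neighbours differently, which matches the abstract's description of processing different subspaces in parallel.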