IEEE/CAA Journal of Automatica Sinica
Citation: H. R. Wu, X. Y. Chen, Z. F. Hu, J. Shi, S. Xu, and B. Xu, “Local-to-global causal reasoning for cross-document relation extraction,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 7, pp. 1608–1621, Jul. 2023. doi: 10.1109/JAS.2023.123540
Cross-document relation extraction (RE), as an extension of information extraction, requires integrating information from multiple documents retrieved from open domains, which contain large amounts of irrelevant or confusing noisy text. Previous studies rely on attention mechanisms to connect different text features through semantic similarity. However, similarity-based methods cannot reliably distinguish valid information within highly similar retrieved documents. How to design an effective algorithm for aggregated reasoning over confusing information with similar features remains an open issue. To address this problem, we design a novel local-to-global causal reasoning (LGCR) network for cross-document RE, which enables efficient distinguishing, filtering, and global reasoning over complex information from a causal perspective. Specifically, we propose a local causal estimation algorithm to estimate the causal effect; to our knowledge, this is the first attempt to use causal reasoning, independent of feature similarity, to distinguish confusing from valid information in cross-document RE. Furthermore, based on the estimated causal effect, we propose a causality-guided global reasoning algorithm that filters out confusing information and achieves global reasoning. Experimental results under both the closed and open settings of the large-scale dataset CodRED demonstrate that our LGCR network significantly outperforms state-of-the-art methods and validate the effectiveness of causal reasoning for processing confusing information.
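The abstract's notion of a feature-similarity-independent causal effect can be illustrated with a generic interventional ablation: remove one retrieved document and measure how the relation score changes. This is only a minimal, hypothetical sketch of the general idea — the `causal_effect` helper and the `toy_predict` scorer are invented for illustration and are not the paper's LGCR algorithm:

```python
def causal_effect(predict, docs, i):
    """Interventional estimate of document i's effect on the relation score:
    score with all documents minus score with document i removed."""
    full = predict(docs)
    ablated = predict([d for j, d in enumerate(docs) if j != i])
    return full - ablated

def toy_predict(docs):
    """Stand-in for a trained relation scorer: mean count of an evidence
    token per document (a real model would score entity-pair relations)."""
    if not docs:
        return 0.0
    return sum(doc.count("born") for doc in docs) / len(docs)

docs = [
    "Alice was born in Paris.",               # valid evidence
    "The weather in Paris is usually mild.",  # confusing/noisy text
    "City records show she was born in 1901.",  # valid evidence
]

# Valid documents raise the score (positive effect); noise lowers it,
# regardless of how similar the noisy text is to the evidence.
effects = [causal_effect(toy_predict, docs, i) for i in range(len(docs))]
```

Note that the noisy document gets a negative effect even though it shares surface features ("Paris") with the evidence, which is the kind of distinction a purely similarity-based attention score cannot make.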
[1] B. Distiawan Trisedya, G. Weikum, J. Z. Qi, and R. Zhang, “Neural relation extraction for knowledge base enrichment,” in Proc. 57th Annu. Meeting of the Association for Computational Linguistics, Florence, Italy, 2019, pp. 229–240.
[2] M. Yu, W. P. Yin, K. S. Hasan, C. dos Santos, B. Xiang, and B. W. Zhou, “Improved neural relation detection for knowledge base question answering,” in Proc. 55th Annu. Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 571–581.
[3] H. Z. Yu, H. S. Li, D. H. Mao, and Q. Cai, “A relationship extraction method for domain knowledge graph construction,” World Wide Web, vol. 23, no. 2, pp. 735–753, Mar. 2020. doi: 10.1007/s11280-019-00765-y
[4] R. Socher, B. Huval, C. D. Manning, and A. Y. Ng, “Semantic compositionality through recursive matrix-vector spaces,” in Proc. Joint Conf. Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island, Korea, 2012, pp. 1201–1211.
[5] Y. K. Lin, S. Q. Shen, Z. Y. Liu, H. B. Luan, and M. S. Sun, “Neural relation extraction with selective attention over instances,” in Proc. 54th Annu. Meeting of the Association for Computational Linguistics, Berlin, Germany, 2016, pp. 2124–2133.
[6] P. D. Qin, W. R. Xu, and W. Y. Wang, “Robust distant supervision relation extraction via deep reinforcement learning,” in Proc. 56th Annu. Meeting of the Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 2137–2147.
[7] J. Li, Y. P. Sun, R. J. Johnson, D. Sciaky, C.-H. Wei, R. Leaman, A. P. Davis, C. J. Mattingly, T. C. Wiegers, and Z. Y. Lu, “BioCreative V CDR task corpus: A resource for chemical disease relation extraction,” Database, vol. 2016, Art. no. baw068, May 2016.
[8] Y. Yao, D. M. Ye, P. Li, X. Han, Y. K. Lin, Z. H. Liu, Z. Y. Liu, L. X. Huang, J. Zhou, and M. S. Sun, “DocRED: A large-scale document-level relation extraction dataset,” in Proc. 57th Annu. Meeting of the Association for Computational Linguistics, Florence, Italy, 2019, pp. 764–777.
[9] W. X. Zhou, K. Huang, T. Y. Ma, and J. Huang, “Document-level relation extraction with adaptive thresholding and localized context pooling,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 14612–14620.
[10] D. Vrandečić and M. Krötzsch, “Wikidata: A free collaborative knowledgebase,” Commun. ACM, vol. 57, no. 10, pp. 78–85, Oct. 2014. doi: 10.1145/2629489
[11] Y. Yao, J. J. Du, Y. K. Lin, P. Li, Z. Y. Liu, J. Zhou, and M. S. Sun, “CodRED: A cross-document relation extraction dataset for acquiring knowledge in the wild,” in Proc. Conf. Empirical Methods in Natural Language Processing, 2021, pp. 4452–4472.
[12] G. S. Nan, Z. J. Guo, I. Sekulic, and W. Lu, “Reasoning with latent structure refinement for document-level relation extraction,” in Proc. 58th Annu. Meeting of the Association for Computational Linguistics, 2020, pp. 1546–1557.
[13] B. Li, W. Ye, Z. H. Sheng, R. Xie, X. Y. Xi, and S. K. Zhang, “Graph enhanced dual attention network for document-level relation extraction,” in Proc. 28th Int. Conf. Computational Linguistics, 2020, pp. 1551–1560.
[14] S. Zeng, R. X. Xu, B. B. Chang, and L. Li, “Double graph based reasoning for document-level relation extraction,” in Proc. Conf. Empirical Methods in Natural Language Processing, 2020, pp. 1630–1640.
[15] W. Xu, K. H. Chen, and T. J. Zhao, “Document-level relation extraction with reconstruction,” in Proc. 35th AAAI Conf. Artificial Intelligence, 2021, pp. 14167–14175.
[16] H. R. Wu, W. Chen, S. Xu, and B. Xu, “Counterfactual supporting facts extraction for explainable medical record based diagnosis with graph network,” in Proc. Conf. North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 1942–1955.
[17] Q. T. Wu, H. R. Zhang, J. C. Yan, and D. Wipf, “Handling distribution shifts on graphs: An invariance perspective,” in Proc. 10th Int. Conf. Learning Representations, 2022.
[18] P. Cui and S. Athey, “Stable learning establishes some common ground between causal inference and machine learning,” Nat. Mach. Intell., vol. 4, no. 2, pp. 110–115, Feb. 2022. doi: 10.1038/s42256-022-00445-z
[19] J. Pearl, Causality, 2nd ed. Cambridge, UK: Cambridge University Press, 2009.
[20] D. J. Zeng, K. Liu, S. W. Lai, G. Y. Zhou, and J. Zhao, “Relation classification via convolutional deep neural network,” in Proc. 25th Int. Conf. Computational Linguistics: Tech. Papers, Dublin, Ireland, 2014, pp. 2335–2344.
[21] L. L. Wang, Z. Cao, G. de Melo, and Z. Y. Liu, “Relation classification via multi-level attention CNNs,” in Proc. 54th Annu. Meeting of the Association for Computational Linguistics, Berlin, Germany, 2016, pp. 1298–1307.
[22] J. Feng, M. L. Huang, L. Zhao, Y. Yang, and X. Y. Zhu, “Reinforcement learning for relation classification from noisy data,” in Proc. 32nd AAAI Conf. Artificial Intelligence, New Orleans, USA, 2018, pp. 5779–5786.
[23] N. Y. Zhang, S. M. Deng, Z. L. Sun, J. Y. Chen, W. Zhang, and H. J. Chen, “Relation adversarial network for low resource knowledge graph completion,” in Proc. Web Conf., Taipei, China, 2020, pp. 1–12.
[24] H. Wang, C. Focke, R. Sylvester, N. Mishra, and W. Wang, “Fine-tune BERT for DocRED with two-step process,” arXiv preprint arXiv:1909.11898, 2019.
[25] H. Z. Tang, Y. N. Cao, Z. Y. Zhang, J. X. Cao, F. Fang, S. Wang, and P. F. Yin, “HIN: Hierarchical inference network for document-level relation extraction,” in Proc. 24th Pacific-Asia Conf. Knowledge Discovery and Data Mining, Singapore, 2020, pp. 197–209.
[26] S. Zeng, Y. T. Wu, and B. B. Chang, “SIRE: Separate intra- and inter-sentential reasoning for document-level relation extraction,” in Proc. Findings of the Association for Computational Linguistics, 2021, pp. 524–534.
[27] Q. Z. Huang, S. Q. Zhu, Y. S. Feng, Y. Ye, Y. X. Lai, and D. Y. Zhao, “Three sentences are all you need: Local path enhanced document relation extraction,” in Proc. 59th Annu. Meeting of the Association for Computational Linguistics and the 11th Int. Joint Conf. Natural Language Processing, 2021, pp. 998–1004.
[28] Z. Q. Wang, S. C. Gao, M. C. Zhou, S. Sato, J. J. Cheng, and J. H. Wang, “Information-theory-based nondominated sorting ant colony optimization for multiobjective feature selection in classification,” IEEE Trans. Cybern., 2022. doi: 10.1109/TCYB.2022.3185554
[29] S. Ramírez-Gallego, H. Mouriño-Talín, D. Martínez-Rego, V. Bolón-Canedo, J. M. Benítez, A. Alonso-Betanzos, and F. Herrera, “An information theory-based feature selection framework for big data under Apache Spark,” IEEE Trans. Syst. Man Cybern. Syst., vol. 48, no. 9, pp. 1441–1453, Sept. 2018. doi: 10.1109/TSMC.2017.2670926
[30] R. Q. Lu, X. L. Jin, S. M. Zhang, M. K. Qiu, and X. D. Wu, “A study on big knowledge and its engineering issues,” IEEE Trans. Knowl. Data Eng., vol. 31, no. 9, pp. 1630–1644, Sept. 2019. doi: 10.1109/TKDE.2018.2866863
[31] S. Athey, “Beyond prediction: Using big data for policy problems,” Science, vol. 355, no. 6324, pp. 483–485, Feb. 2017. doi: 10.1126/science.aal4321
[32] D. Wu, Y. He, X. Luo, and M. C. Zhou, “A latent factor analysis-based approach to online sparse streaming feature selection,” IEEE Trans. Syst. Man Cybern. Syst., vol. 52, no. 11, pp. 6744–6758, Nov. 2022. doi: 10.1109/TSMC.2021.3096065
[33] H. Wu, X. Luo, and M. C. Zhou, “Advancing non-negative latent factorization of tensors with diversified regularization schemes,” IEEE Trans. Serv. Comput., vol. 15, no. 3, pp. 1334–1344, Jun. 2022. doi: 10.1109/TSC.2020.2988760
[34] D. Wu, M. S. Shang, X. Luo, and Z. D. Wang, “An L1-and-L2-norm-oriented latent factor model for recommender systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 10, pp. 5775–5788, Oct. 2022. doi: 10.1109/TNNLS.2021.3071392
[35] L. Hu, S. C. Yang, X. Luo, and M. C. Zhou, “An algorithm of inductively identifying clusters from attributed graphs,” IEEE Trans. Big Data, vol. 8, no. 2, pp. 523–534, Apr. 2022.
[36] A. Shrikumar, P. Greenside, and A. Kundaje, “Learning important features through propagating activation differences,” in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia, 2017, pp. 3145–3153.
[37] P. R. Rosenbaum and D. B. Rubin, “The central role of the propensity score in observational studies for causal effects,” Biometrika, vol. 70, no. 1, pp. 41–55, Apr. 1983. doi: 10.1093/biomet/70.1.41
[38] Z. Y. Shen, P. Cui, K. Kuang, B. Li, and P. X. Chen, “Causally regularized learning with agnostic data selection bias,” in Proc. 26th ACM Int. Conf. Multimedia, Seoul, Republic of Korea, 2018, pp. 411–419.
[39] K. Kuang, P. Cui, S. Athey, R. X. Xiong, and B. Li, “Stable prediction across unknown environments,” in Proc. 24th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, London, UK, 2018, pp. 1617–1626.
[40] D. Xu, C. W. Ruan, E. Korpeoglu, S. Kumar, and K. Achan, “Adversarial counterfactual learning and evaluation for recommender system,” in Proc. 34th Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 13515–13526.
[41] M. Oberst and D. Sontag, “Counterfactual off-policy evaluation with Gumbel-max structural causal models,” in Proc. 36th Int. Conf. Machine Learning, Long Beach, USA, 2019, pp. 4881–4890.
[42] D. Kaushik, E. H. Hovy, and Z. C. Lipton, “Learning the difference that makes a difference with counterfactually-augmented data,” in Proc. 8th Int. Conf. Learning Representations, Addis Ababa, Ethiopia, 2019.
[43] T.-J. Fu, X. E. Wang, M. F. Peterson, S. T. Grafton, M. P. Eckstein, and W. Y. Wang, “Counterfactual vision-and-language navigation via adversarial path sampler,” in Proc. 16th European Conf. Computer Vision, Glasgow, UK, 2020, pp. 71–86.
[44] Q. F. Zhu, W.-N. Zhang, T. Liu, and W. Y. Wang, “Counterfactual off-policy training for neural dialogue generation,” in Proc. Conf. Empirical Methods in Natural Language Processing, 2020, pp. 3438–3448.
[45] J. C. Weiss, F. Kuusisto, K. Boyd, J. Liu, and D. Page, “Machine learning for treatment assignment: Improving individualized risk attribution,” in Proc. AMIA, San Francisco, USA, 2015, p. 1306.
[46] T. Zhao, G. Liu, D. H. Wang, W. H. Yu, and M. Jiang, “Learning from counterfactual links for link prediction,” in Proc. 39th Int. Conf. Machine Learning, Baltimore, USA, 2022, pp. 26911–26926.
[47] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in Proc. Conf. North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota, 2019, pp. 4171–4186.
[48] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, USA, 2017, pp. 6000–6010.
[49] C. J. Maddison, A. Mnih, and Y. W. Teh, “The concrete distribution: A continuous relaxation of discrete random variables,” in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017.
[50] E. Jang, S. X. Gu, and B. Poole, “Categorical reparameterization with Gumbel-Softmax,” in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017.
[51] E. J. Gumbel, Statistical Theory of Extreme Values and Some Practical Applications: A Series of Lectures. Washington, USA: US Government Printing Office, 1954.
[52] D. Alvarez-Melis and T. S. Jaakkola, “Towards robust interpretability with self-explaining neural networks,” in Proc. 32nd Int. Conf. Neural Information Processing Systems, Red Hook, USA, 2018, pp. 7786–7795.
[53] Y. F. Sun, C. M. Cheng, Y. H. Zhang, C. Zhang, L. Zheng, Z. D. Wang, and Y. C. Wei, “Circle loss: A unified perspective of pair similarity optimization,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, USA, 2020, pp. 6397–6406.
[54] D. M. Ye, Y. K. Lin, J. J. Du, Z. H. Liu, P. Li, M. S. Sun, and Z. Y. Liu, “Coreferential reasoning learning for language representation,” in Proc. Conf. Empirical Methods in Natural Language Processing, 2020, pp. 7170–7186.
[55] G. D. Duan, Y. R. Dong, J. Y. Miao, and T. X. Huang, “Position-aware attention mechanism-based bi-graph for dialogue relation extraction,” Cognit. Comput., vol. 15, no. 1, pp. 359–372, Jan. 2023. doi: 10.1007/s12559-022-10105-4
[56] F. Q. Wang, F. Li, H. Fei, J. Y. Li, S. Q. Wu, F. F. Su, W. X. Shi, D. H. Ji, and B. Cai, “Entity-centered cross-document relation extraction,” in Proc. Conf. Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 2022, pp. 9871–9881.
[57] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proc. 5th Int. Conf. Learning Representations, Toulon, France, 2017.
[58] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph attention networks,” in Proc. 6th Int. Conf. Learning Representations, Vancouver, Canada, 2018.
[59] W. L. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, USA, 2017, pp. 1025–1035.
[60] J. B. Chen, L. Song, M. Wainwright, and M. Jordan, “Learning to explain: An information-theoretic perspective on model interpretation,” in Proc. 35th Int. Conf. Machine Learning, Stockholm, Sweden, 2018, pp. 883–892.
[61] J. Liang, B. Bai, Y. R. Cao, K. Bai, and F. Wang, “Adversarial infidelity learning for model interpretation,” in Proc. 26th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, 2020, pp. 286–296.
[62] X. Wang, Y.-X. Wu, A. Zhang, X. N. He, and T.-S. Chua, “Towards multi-grained explainability for graph neural networks,” in Proc. 35th Conf. Neural Information Processing Systems, 2021, pp. 18446–18458.
[63] S. Jain, S. Wiegreffe, Y. Pinter, and B. C. Wallace, “Learning to faithfully rationalize by construction,” in Proc. 58th Annu. Meeting of the Association for Computational Linguistics, 2020, pp. 4459–4473.