A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 10 Issue 11
Nov. 2023

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, TOP 5
X. H. Wang, S. S. Zhao, L. Guo, L. Zhu, C. R. Cui, and L. C. Xu, “GraphCA: Learning from graph counterfactual augmentation for knowledge tracing,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 11, pp. 2108–2123, Nov. 2023. doi: 10.1109/JAS.2023.123678

GraphCA: Learning From Graph Counterfactual Augmentation for Knowledge Tracing

doi: 10.1109/JAS.2023.123678
Funds: This work was supported by the Natural Science Foundation of China (62372277), the Natural Science Foundation of Shandong Province (ZR2022MF257, ZR2022MF295), and the Humanities and Social Sciences Fund of the Ministry of Education (21YJC630157).
  • With the growing popularity of online learning, knowledge tracing (KT) plays an increasingly significant role in educational settings. The task of KT is to help students learn more effectively by predicting their next state of knowledge mastery from their historical exercise sequences. Many methods have been proposed in this field, such as Bayesian knowledge tracing and deep knowledge tracing. Despite this progress, existing techniques still have the following limitations: 1) Previous studies address KT by exploring only the sparse observational data distribution, and the counterfactual data distribution has been largely ignored. 2) Current works consider either the relationships between questions and concepts or the relations between two concepts; none of them investigates the relations among students, questions, and concepts simultaneously, which leads to inaccurate student modeling. To address these limitations, we propose a graph counterfactual augmentation method for knowledge tracing (GraphCA). Concretely, to capture the multiple relationships among different entities, we first unify students, questions, and concepts in graphs and then leverage a heterogeneous graph convolutional network to conduct representation learning. To model the counterfactual world, we conduct counterfactual transformations on students’ learning graphs by changing the corresponding treatments and then exploit the counterfactual outcomes in a contrastive learning framework. We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of the proposed GraphCA method over several state-of-the-art baselines.
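To make the graph idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It places students, questions, and concepts in one graph and runs plain mean-aggregation message passing. The toy records, the node ids, and the helper names `build_graph` and `propagate` are all hypothetical, and the type-specific weights that a real heterogeneous graph convolutional network would learn are omitted.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): which student answered
# which question, and which concept each question is tagged with.
answers = [("s1", "q1"), ("s1", "q2"), ("s2", "q2")]
tags = [("q1", "c1"), ("q2", "c1"), ("q2", "c2")]


def build_graph(answers, tags):
    """Return an undirected adjacency map over student/question/concept ids."""
    adj = defaultdict(set)
    for u, v in answers + tags:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def propagate(adj, feats):
    """One round of mean aggregation: each node averages its own
    feature vector with the vectors of its direct neighbours."""
    out = {}
    for node, vec in feats.items():
        group = [vec] + [feats[n] for n in sorted(adj[node])]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


adj = build_graph(answers, tags)

# Type-indicator features: [is_student, is_question, is_concept].
feats = {
    "s1": [1.0, 0.0, 0.0], "s2": [1.0, 0.0, 0.0],
    "q1": [0.0, 1.0, 0.0], "q2": [0.0, 1.0, 0.0],
    "c1": [0.0, 0.0, 1.0], "c2": [0.0, 0.0, 1.0],
}

h1 = propagate(adj, feats)  # students see questions only
h2 = propagate(adj, h1)     # concept signal reaches students via questions
```

In this toy graph students are linked to concepts only through questions, so the concept component of a student vector stays zero after one round and becomes positive after two. This is the sense in which a unified graph lets concepts inform student representations.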

     



    Figures(9)  / Tables(6)


    Highlights

    • We focus on the data sparsity issue in knowledge tracing and address it by leveraging counterfactual data in a newly devised counterfactual contrastive graph learning method, namely GraphCA.
    • We obtain counterfactual positive samples by generating interrupted sub-graphs based on two observational facts, and we learn an enhanced user representation with a contrastive graph learning method.
    • We model the multiple relationships among students, questions, and concepts in a unified heterogeneous graph, so that student representations are enhanced by the concepts involved in questions.
    • We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our method over several state-of-the-art baselines.
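
The contrastive step mentioned in the highlights can be illustrated with a small sketch. This page does not state which contrastive objective GraphCA uses, so the InfoNCE-style loss below is an assumption chosen because it is a common default. The function names, the temperature value, and the toy vectors are hypothetical. The anchor stands in for a student representation from the observed graph, the positive for the representation from a counterfactual sub-graph, and the negatives for other students.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (assumed objective): small when the anchor is
    closer to the positive than to any of the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-d embeddings for illustration only.
observed = [1.0, 0.0]            # student, observed graph
counterfactual = [0.9, 0.1]      # same student, counterfactual sub-graph
unrelated = [0.0, 1.0]           # a poorly aligned "positive"
other_students = [[-1.0, 0.0]]   # negatives

loss_aligned = info_nce(observed, counterfactual, other_students)
loss_misaligned = info_nce(observed, unrelated, other_students)
```

Minimizing such a loss pulls the observed and counterfactual views of the same student together while pushing other students away, so `loss_aligned` comes out smaller than `loss_misaligned` here.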
