Can A Language Model Represent Math Strategies?: Learning Math Strategies from Big Data using BERT

Academic Article

Describes a BERT-based approach that uses student interaction data from MATHia to learn math strategy representations and predict whether students are likely to apply correct strategies to novel problems.

Purpose/Abstract

AI models have shown a remarkable ability to perform representation learning using large-scale data. In particular, the emergence of Large Language Models (LLMs) attests to the capability of AI models to learn complex hidden structures in a bottom-up manner without requiring a lot of human expertise.

In this paper, we leverage these models to learn math learning strategies at scale. Specifically, we use student interaction data from the MATHia Intelligent Tutoring System to learn strategies based on sequences of actions performed by students. To do this, we develop an AI model based on BERT (Bidirectional Encoder Representations from Transformers) that has two main components.

First, we pre-train BERT using an approach known as Masked Language Modeling to learn embeddings for strategies. The embeddings represent strategies in vector form while preserving their semantics. Next, we fine-tune the model to predict whether students are likely to apply a correct strategy to solve a novel problem.
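As a rough illustration of the pre-training step described above, the sketch below applies Masked Language Modeling-style masking to one sequence of student actions. It is not the authors' code: the action names, the `mask_actions` helper, and the `mask_prob` parameter are hypothetical, and the 15% masking rate with the 80/10/10 replacement split is the convention from the original BERT paper, assumed here rather than confirmed by this abstract.

```python
import random

MASK = "[MASK]"


def mask_actions(actions, vocab, mask_prob=0.15, rng=None):
    """Build one masked-language-modeling example from an action sequence.

    Returns (inputs, labels). labels[i] holds the original action at the
    positions selected for prediction and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for action in actions:
        if rng.random() < mask_prob:
            # Position selected: the model must recover the original action.
            labels.append(action)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # usually hide the action
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # sometimes swap in a random one
            else:
                inputs.append(action)             # sometimes leave it unchanged
        else:
            inputs.append(action)
            labels.append(None)
    return inputs, labels


# Hypothetical action vocabulary and student sequence (not taken from MATHia).
vocab = ["enter_expression", "request_hint", "simplify", "submit_answer"]
sequence = ["enter_expression", "simplify", "request_hint", "simplify", "submit_answer"]

inputs, labels = mask_actions(sequence, vocab, rng=random.Random(0))
```

A model pre-trained on such examples is asked to predict the entries of `labels` from `inputs`, which is what pushes it to encode the context of each action. The fine-tuning step would then attach a classification head on top of the learned representations. That step is not sketched here.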

We demonstrate, using a large dataset collected from 655 schools, that our approach, in which we pre-train to learn strategies from a sample of schools, can be fine-tuned with a small number of examples to make accurate predictions over student data collected from other schools.

Citation
Magar, A., Shakya, A., Fancsali, S., Rus, V., Murphy, A., Ritter, S., & Venugopal, D. (2025). “Can A Language Model Represent Math Strategies?”: Learning Math Strategies from Big Data using BERT. LAK '25: Proceedings of the 15th International Learning Analytics and Knowledge Conference, 655–666. https://doi.org/10.1145/3706468.3706558

Areas researched: Student Learning, AI
