Forecasting student achievement: The potential of natural language processing


2016. “Forecasting student achievement: The potential of natural language processing” (with Mike Yeomans, Christopher Hulleman & Hunter Gehlbach), In Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, 383-387.



Student intention and motivation are among the strongest predictors of persistence and completion in Massive Open Online Courses (MOOCs), but these factors are typically measured through fixed-response items that constrain student expression. We use natural language processing techniques to evaluate whether text analysis of open-response questions about motivation and utility value can offer additional capacity to predict persistence and completion over and above information obtained from fixed-response items. We find that, relative to simple benchmarks based on demographics, a machine learning prediction model can learn from unstructured text to predict which students will complete an online course. We show that the model performs well out of sample compared to a standard array of demographics. These results demonstrate the potential for natural language processing to contribute to predicting student success in MOOCs and other forms of open online learning.