A critical examination of Machine Learning as a tool to predict performance of students in CS1
Preprint, 2024
Literature review of research using machine learning techniques
to predict student performance in introductory programming
courses. The overarching research question is: How does
empirical research using machine learning approach the prediction
of student performance in introductory computer science courses
(CS1)? The focus is on how knowledge from educational science
is incorporated alongside ethical and gender considerations.
Only peer-reviewed articles, published in journals or conference
proceedings between 2017 and mid-2020, reporting on empirical
studies that used data on more than 30 students are included.
This study addresses prevalent shortcomings in empirical CS
education research, noting often inadequate descriptions of data
selection and processing, and of the representativeness and
diversity of samples, which can limit the utility of results. It underscores the
frequent omission of ethical considerations regarding students’
data consent and the potential negative impacts on students’
educational trajectories. Additionally, many studies fail to
incorporate the educational context or address gender-related issues
adequately, disconnecting the models from established knowledge
about women in computer science.
Introductory programming courses
Gender considerations
Machine learning
Ethical considerations
Author
Kristina von Hausswolff
Mälardalen University
Christina Björkman
Mälardalen University
Areas of Advance
Information and Communication Technology
Subject Categories (SSIF 2011)
Educational Sciences
Computer and Information Science
Driving Forces
Sustainable development
Learning and teaching
Pedagogical work