Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Education and Culture Executive Agency (EACEA). Neither the European Union nor EACEA can be held responsible for them.


Student Performance Data Set


The benchmark has the following main features:
• The data describe student achievement in secondary education at two Portuguese schools.
• The dataset can be used both for regression tasks, e.g. predicting student performance, and for classification tasks.
• The attributes include student grades and demographic, social and school-related features; they were collected from school reports and questionnaires.
• The dataset collects the performance of students in two distinct subjects: Mathematics and Portuguese language.
• The target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades.
• There are both numeric and categorical attributes.
• The data have 33 attributes and 649 instances.
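
The relationship between G1, G2 and G3, and the way the same target can serve a classification task, can be illustrated with a minimal Python sketch. It rests on several assumptions: the rows below are invented toy values and not records from the dataset, the semicolon delimiter is assumed to match the distributed CSV files, and the pass threshold of 10 on the 0-20 grade scale is one possible binarisation, not the only one. The helper names `load_rows`, `pearson` and `pass_fail` are hypothetical.

```python
import csv
import io
from math import sqrt

# Toy rows for illustration only: these are NOT records from the real dataset.
# Column names follow the dataset description; all values are invented.
SAMPLE_CSV = """school;sex;G1;G2;G3
GP;F;5;6;6
GP;M;7;8;10
MS;F;15;14;15
GP;M;12;12;13
MS;M;9;10;10
GP;F;17;18;18
"""


def load_rows(text, delimiter=";"):
    """Parse delimiter-separated text into a list of dicts (one per student)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def pass_fail(g3, threshold=10):
    """Turn the numeric final grade into a binary class label (assumed threshold)."""
    return "pass" if g3 >= threshold else "fail"


rows = load_rows(SAMPLE_CSV)
g1 = [int(r["G1"]) for r in rows]
g2 = [int(r["G2"]) for r in rows]
g3 = [int(r["G3"]) for r in rows]

# Regression view: how closely do the earlier period grades track the final grade?
r_g1 = pearson(g1, g3)
r_g2 = pearson(g2, g3)
print(f"corr(G1, G3) = {r_g1:.3f}")
print(f"corr(G2, G3) = {r_g2:.3f}")

# Classification view: derive a categorical target from G3.
labels = [pass_fail(v) for v in g3]
print(labels)
```

Because G1 and G2 are so strongly tied to G3, a model that includes them will tend to look very accurate; leaving them out gives a harder but arguably more useful prediction task.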

Scientific Area:
Machine Learning

C, C++, MatLab, Octave, Python, R

Target Group:

Cite as:
Cortez, P. and Silva, A., Using Data Mining to Predict Secondary School Student Performance, Proceedings of the 5th FUture BUsiness TEChnology Conference (2008): 5-12.

Author of the review:
Giulia Cademartori
University of Genoa



Mauro Bozzetti

Great resource

Maria de Fátima Pacheco

Available in several environments: great value!