Student Performance Data Set

The Benchmark has the following main features:
• The data address student achievement in secondary education at two Portuguese schools.
• The dataset can be used for both regression tasks (e.g. predicting student performance) and classification tasks.
• The attributes include student grades as well as demographic, social and school-related features; they were collected from school reports and questionnaires.
• The dataset collects the performance of students in two distinct subjects: Mathematics and Portuguese language.
• The target attribute G3 is strongly correlated with attributes G1 and G2. This is because G3 is the final-year grade (issued at the 3rd period), while G1 and G2 are the 1st and 2nd period grades.
• There are both numeric and categorical attributes.
• The dataset has 33 features and 629 instances.
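
The relationship between the period grades and the two task types listed above can be illustrated with a minimal Python sketch. It is not the authors' method: the sample rows are invented for illustration and are not taken from the real data, the semicolon-delimited layout is an assumption about how the file is stored, and the pass threshold of 10 (on an assumed 0 to 20 grading scale) is one possible way to turn G3 into a classification target. The helper names `load_rows`, `pearson` and `pass_fail` are hypothetical.

```python
import csv
import io
import math

# Hypothetical sample: values invented for illustration only.
# The semicolon delimiter is an assumption about the file layout.
SAMPLE = """school;sex;G1;G2;G3
GP;F;5;6;6
GP;M;15;14;15
MS;F;12;13;14
MS;M;8;9;10
GP;F;17;18;18
"""


def load_rows(text):
    """Parse delimited text and convert the three grade columns to integers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        for key in ("G1", "G2", "G3"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def pass_fail(g3, threshold=10):
    """Binary classification target; the threshold is an assumption."""
    return "pass" if g3 >= threshold else "fail"


rows = load_rows(SAMPLE)
g1 = [r["G1"] for r in rows]
g2 = [r["G2"] for r in rows]
g3 = [r["G3"] for r in rows]

# Regression view: G3 is the numeric target; G1 and G2 are strong predictors.
r_g1_g3 = pearson(g1, g3)
r_g2_g3 = pearson(g2, g3)

# Classification view: derive a categorical label from the same target.
labels = [pass_fail(value) for value in g3]

print(f"corr(G1, G3) = {r_g1_g3:.3f}")
print(f"corr(G2, G3) = {r_g2_g3:.3f}")
print(labels)
```

Because G1 and G2 carry so much information about G3, a model that includes them will tend to look very accurate; whether to keep or drop them depends on how early in the school year the prediction is meant to be made.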

Scientific Area:
Machine Learning

C, C++, MATLAB, Octave, Python, R

Target Group:

Cite as:
Cortez, P. and Silva, A., Using Data Mining to Predict Secondary School Student Performance, Proceedings of the 5th FUture BUsiness TEChnology Conference (2008): 5-12.

Author of the review:
Giulia Cademartori
University of Genoa


