Student Performance Dataset¶

Proven hypothesis¶

A student's academic performance depends not only on their academic capabilities but also on their socio-economic status.¶

Imagine this...¶

You are a headmaster/headmistress managing a school of thousands of students. How do you create the ideal environment for students to excel academically?¶

ET VOILA!¶

The Student Forecaster Model:¶

https://student-forecaster.streamlit.app/¶

Step 1: Exploratory analysis¶

  • Distribution and spread of the data
  • Dropping columns that do not appear relevant at first glance
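
As a rough illustration of this step, the sketch below summarises the spread of a few columns and drops one. The tiny DataFrame and its values are made up for the example, and treating `school` as the irrelevant column is purely an assumption; only `G3` and failures are named in this notebook.

```python
import pandas as pd

# Hypothetical miniature stand-in for the student dataset (values are invented).
df = pd.DataFrame({
    "school": ["GP", "GP", "MS", "GP", "MS", "GP"],
    "failures": [0, 1, 0, 3, 0, 2],
    "absences": [2, 10, 0, 14, 4, 6],
    "G3": [15, 9, 13, 6, 12, 8],
})

# Distribution and spread of the numeric columns (count, mean, std, quartiles).
summary = df.describe()
print(summary.loc[["mean", "std", "min", "max"]])

# Drop a column judged irrelevant at first glance (illustrative choice only).
df_reduced = df.drop(columns=["school"])
print(df_reduced.columns.tolist())
```

`describe()` only covers numeric columns by default, so categorical columns would need `value_counts()` or `describe(include="object")` instead.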

Step 2: Creating subgroups of data¶

  • To map out relationships between columns of data

Step 3: Analysis within sub-groups¶

  • To identify the dominant variables that best explain the target variable
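
Steps 2 and 3 could look something like the sketch below: columns are grouped into thematic subgroups, and within each subgroup the column most strongly correlated with the target is picked out. The subgroup names, the column names other than `G3`, and the data are all assumptions made for illustration, and absolute Pearson correlation is just one possible way to rank variables.

```python
import pandas as pd

# Hypothetical numeric sample; yes/no columns are assumed to be encoded as 1/0.
df = pd.DataFrame({
    "failures": [0, 1, 0, 3, 0, 2],
    "studytime": [3, 2, 4, 1, 3, 1],
    "famsup": [1, 1, 0, 1, 0, 1],
    "paid": [0, 0, 0, 1, 0, 0],
    "G3": [15, 9, 13, 6, 12, 8],
})

# Step 2: group related columns into subgroups (grouping is illustrative).
subgroups = {
    "academic": ["failures", "studytime"],
    "support": ["famsup", "paid"],
}

# Step 3: within each subgroup, find the column most correlated with the target.
strongest = {}
for name, cols in subgroups.items():
    corr_with_target = df[cols + ["G3"]].corr()["G3"].drop("G3")
    strongest[name] = corr_with_target.abs().idxmax()
    print(name, corr_with_target.round(2).to_dict())

print(strongest)
```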

Summary of analytical methods used:¶

  • Heatmap correlation
  • Value counts
  • Grouping by bins
  • Histograms
  • Boxplot
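
A minimal sketch of three of these methods on made-up data follows: value counts as proportions, grouping by bins, and the correlation matrix that a heatmap would visualise. The column names `higher` and `absences`, the bin edges, and the values are assumptions for the example, not taken from the original analysis.

```python
import pandas as pd

# Hypothetical sample data (values are invented).
df = pd.DataFrame({
    "higher": ["yes", "yes", "yes", "no", "yes", "yes"],
    "absences": [2, 10, 0, 14, 4, 6],
    "G3": [15, 9, 13, 6, 12, 8],
})

# Value counts: share of each category, useful for spotting skewed columns.
higher_share = df["higher"].value_counts(normalize=True)
print(higher_share)

# Grouping by bins: cut a numeric column into ranges, then average the target.
df["absence_bin"] = pd.cut(
    df["absences"],
    bins=[0, 5, 10, 15],
    include_lowest=True,
    labels=["0-5", "6-10", "11-15"],
)
mean_g3_by_bin = df.groupby("absence_bin", observed=True)["G3"].mean()
print(mean_g3_by_bin)

# Correlation matrix: the table of values a correlation heatmap would display.
corr_matrix = df[["absences", "G3"]].corr()
print(corr_matrix.round(2))
```

The histograms and boxplots are plotting steps and are left out here to keep the sketch free of plotting dependencies.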

Conclusion from data analysis¶

  • Past failure has the biggest impact on G3
  • A student who has failed is likely to be receiving additional support
  • Nearly 95% of students want to pursue higher education, so this variable is heavily skewed
  • Over 90% of students do not have paid support, but 60% of students have support at home

Machine Learning:¶

Preprocessing, Modelling, Scoring¶

Classification model used: Gradient Boosting Classifier¶
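
For readers unfamiliar with the model, here is a minimal sketch of fitting scikit-learn's `GradientBoostingClassifier`. It uses synthetic data from `make_classification` rather than the student dataset, assumes a binary target, and keeps the default hyperparameters; the actual preprocessing, features, and settings behind the Student Forecaster are not shown in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the preprocessed features.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)

# Hold out a quarter of the rows for scoring, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the gradient boosting model with default hyperparameters.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict labels for the held-out rows; these feed the scoring step.
y_pred = model.predict(X_test)
print("Predictions for the first five test rows:", y_pred[:5])
```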

Model Scoring Metrics¶

In [60]:
print("Accuracy:", round(accuracy, 2))
print("Precision:", round(precision, 2))
print("Recall:", round(recall, 2))
print("F1 Score:", round(f1, 2))

Accuracy: 0.77
Precision: 0.76
Recall: 0.9
F1 Score: 0.82
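
The notebook does not show how these four values were computed, so the sketch below only restates the standard definitions in terms of confusion-matrix counts. The function name and the example counts are hypothetical. The last lines check that the reported F1 score agrees with the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1


# Hypothetical counts, only to show the function in use.
accuracy, precision, recall, f1 = classification_metrics(tp=45, fp=15, fn=5, tn=35)
print(round(accuracy, 2), round(precision, 2), round(recall, 2), round(f1, 2))

# Consistency check on the rounded precision and recall reported above.
reported_precision, reported_recall = 0.76, 0.90
f1_from_reported = (
    2 * reported_precision * reported_recall / (reported_precision + reported_recall)
)
print("F1 from reported precision and recall:", round(f1_from_reported, 2))  # 0.82
```

A recall of 0.9 against a precision of 0.76 suggests the model finds most true positives at the cost of some false positives, and the F1 score sits between the two.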

Thank you for your attention¶