Student Dropout and Graduation Prediction

To address student dropout, a machine learning model based on a Random Forest Classifier will be developed to identify contributing factors and predict student outcomes (dropout or graduation) from a defined set of features. A data visualization dashboard and an in-depth analytical report will support this effort. The dashboard, built with Metabase, will display insights from the model. The model itself will also be exposed through Streamlit so that users can try its predictions interactively.

Before model development, it is essential to understand the dataset, which covers students' backgrounds and academic performance. Background features include marital status, application mode, course, daytime or evening attendance, previous qualifications and grades, nationality, parental education and occupation, admission scores, displacement status, special education needs, debt status, tuition payment status, gender, scholarship status, age at enrollment, and international student status. Academic performance indicators for the first and second semesters, such as total credits, number of enrolled and approved courses, evaluations, and GPA, are also analyzed to better understand and prevent dropout.
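The modeling step described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it uses scikit-learn's `RandomForestClassifier` in place of PySpark so that it runs without a Spark cluster, the feature names are hypothetical stand-ins for the indicators listed above, and the data is synthetic with an invented labeling rule.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names, loosely based on the indicators described above.
FEATURES = [
    "age_at_enrollment",
    "debtor",
    "tuition_up_to_date",
    "sem1_approved",
    "sem2_approved",
]


def make_synthetic_students(n=400, seed=0):
    """Generate toy student records; NOT real data, for illustration only."""
    rng = np.random.default_rng(seed)
    age = rng.integers(17, 50, n)
    debtor = rng.integers(0, 2, n)
    tuition = rng.integers(0, 2, n)
    sem1 = rng.integers(0, 7, n)  # approved courses, first semester
    sem2 = rng.integers(0, 7, n)  # approved courses, second semester
    X = np.column_stack([age, debtor, tuition, sem1, sem2]).astype(float)
    # Invented rule (an assumption): few approved courses means dropout (1).
    y = (sem1 + sem2 < 5).astype(int)
    return X, y


X, y = make_synthetic_students()

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out students.
accuracy = model.score(X_test, y_test)

# Impurity-based importances, one per feature, summing to 1.
importances = dict(zip(FEATURES, model.feature_importances_))

print(f"held-out accuracy: {accuracy:.3f}")
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

The feature importances are what would feed the "contributing factors" view of a dashboard. On real data they should be read with care, since impurity-based importances can favor features with many distinct values.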

Tech Stack

PySpark, Random Forest Classifier, Metabase, Streamlit

Skills Utilized

Machine Learning, Data Science, Data Understanding, Data Visualization