A Machine Learning and Life-Course Approach to Protecting Student Trajectories
Therefore, every education system should estimate each student's risk of trajectory interruption so that it can act before the interruption occurs. Exploiting the vast potential of school systems' administrative data together with current data science tools and machine learning techniques, we propose a Risk of Trajectory Disruption Index (RTDI), which estimates, for each student in the school system, the probability of experiencing any of three disruptive events: chronic absenteeism, grade repetition, or school dropout.
To create the RTDI, we train machine learning algorithms (specifically, a gradient-boosted forest model) on individual student trajectories built from the data available in the General Student Information System (SIGE) in Chile. Each trajectory comprises 73 academic and demographic variables derived from a longitudinal analysis of that student's trajectory.
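The abstract does not give implementation details, so the following is only a minimal sketch of the general technique: gradient boosting for a binary outcome (any disruptive event versus none), written in plain Python with one-split decision stumps as weak learners. The three features (attendance, prior repetitions, prior school changes), the synthetic data, and all function names are hypothetical stand-ins for the 73 SIGE-derived variables, not the authors' model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds lie between consecutive distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return best[1:]

def stump_value(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_boosting(X, y, n_rounds=40, learning_rate=0.3):
    """Gradient boosting on the log-loss: each stump fits the residuals y - p."""
    base_rate = sum(y) / len(y)
    f0 = math.log(base_rate / (1 - base_rate))  # initial log-odds
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, row)
                  for s, row in zip(scores, X)]
    return f0, learning_rate, stumps

def predict_risk(model, row):
    """Estimated probability of a disruptive event for one student."""
    f0, learning_rate, stumps = model
    return sigmoid(f0 + learning_rate * sum(stump_value(s, row) for s in stumps))

# Synthetic, purely illustrative data (hypothetical features and effect sizes).
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    attendance = rng.randint(60, 100)   # attendance rate, percent
    repeats = rng.randint(0, 2)         # prior grade repetitions
    school_changes = rng.randint(0, 3)  # prior school changes
    p = sigmoid((85 - attendance) / 4 + 0.8 * repeats - 1.0)
    X.append([attendance, repeats, school_changes])
    y.append(1 if rng.random() < p else 0)

model = fit_boosting(X, y)
low_attendance_risk = predict_risk(model, [65, 1, 0])
high_attendance_risk = predict_risk(model, [98, 0, 0])
print(f"{low_attendance_risk:.2f} {high_attendance_risk:.2f}")
```

In practice one would more likely rely on a library implementation such as scikit-learn's GradientBoostingClassifier; the hand-written version is shown only to make the boosting loop explicit.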
By estimating individual risk from each student's life course and combining it with students' geolocation, we produced visualizations that identify the neighborhoods where the risk of trajectory interruption is concentrated. This is of utmost importance because, in the Chilean educational system, each family can choose a school regardless of where they live, and the challenge for decision-makers is to take action at the local level when students leave school. Another application uses the RTDI together with model calibration techniques to estimate each school's effect in protecting student trajectories, in order to identify schools that need additional support and schools whose practices can be learned from, replicated, and scaled up.
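Both applications can be read as aggregations of individual risk estimates. The sketch below assumes one possible reading, not stated in the abstract: neighborhood risk is the mean RTDI of the students living there, and a school's effect is its observed disruption rate minus the mean calibrated risk of its students. The record fields, identifiers, and values are hypothetical.

```python
from collections import defaultdict

def mean_risk_by(records, key):
    """Average predicted risk per group (for example, per neighborhood)."""
    totals = defaultdict(lambda: [0.0, 0])
    for rec in records:
        totals[rec[key]][0] += rec["risk"]
        totals[rec[key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

def school_effects(records):
    """Observed disruption rate minus mean predicted risk, per school.

    Assumes the risks are calibrated probabilities. A negative value means
    fewer disruptions occurred than the students' risk profile predicted.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["school"]].append(rec)
    effects = {}
    for school, recs in groups.items():
        observed = sum(r["disrupted"] for r in recs) / len(recs)
        expected = sum(r["risk"] for r in recs) / len(recs)
        effects[school] = observed - expected
    return effects

# Hypothetical records: residence and school are stored separately because
# families may choose a school outside their neighborhood.
records = [
    {"neighborhood": "N1", "school": "A", "risk": 0.50, "disrupted": 0},
    {"neighborhood": "N1", "school": "A", "risk": 0.50, "disrupted": 0},
    {"neighborhood": "N1", "school": "B", "risk": 0.25, "disrupted": 1},
    {"neighborhood": "N2", "school": "B", "risk": 0.25, "disrupted": 0},
]

neighborhood_risk = mean_risk_by(records, "neighborhood")
effects = school_effects(records)
print(neighborhood_risk)
print(effects)
```

The comparison is only meaningful if predicted risks are well calibrated, which is presumably why calibration techniques are applied before estimating school effects.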