Abstract:
Student academic performance is an important indicator of how well the learning process is working in an educational institution. This study applies the C4.5 algorithm to predict student performance at MTs PGII Banjar from academic and non-academic factors. The research takes a quantitative approach, using computational experiments that follow the CRISP-DM methodology. The data cover 652 students of MTs PGII Banjar from the academic years 2021/2022 through 2023/2024, selected by purposive sampling. The research variables comprise academic factors (subject grades, attendance) and non-academic factors (learning motivation, parental support, socioeconomic status). The C4.5 algorithm was implemented in RapidMiner Studio with a minimum of 5 instances per leaf, a confidence factor of 0.25, and a minimum gain threshold of 0.01. The resulting prediction model achieved an accuracy of 78.68%, a precision of 77.84%, a recall of 78.12%, and an F1-score of 77.98%. The AUC-ROC value of 0.842 indicates excellent model discrimination capability. Validation with 10-fold cross-validation showed consistent performance, with a low standard deviation (0.57%). Information gain analysis identifies the Mathematics grade as the strongest predictor (0.847), followed by the Science grade (0.723), attendance level (0.689), and learning motivation (0.634). The generated decision tree yields 23 classification rules with an average confidence of 84.2%, which can serve as the basis for an early warning system that identifies at-risk students. The model can be deployed as a decision support system to improve the quality of academic management through data-driven decision making.
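As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function and variable names are assumptions introduced here, not part of the study's RapidMiner workflow, and the inputs are the percentages stated in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same units for both)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent (variable names are hypothetical).
reported_precision = 77.84
reported_recall = 78.12

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")
```

Rounded to two decimals, the recomputed value agrees with the reported F1-score of 77.98%.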