Intelligent System for Student Performance Prediction Using Machine Learning
Abstract
Accurately predicting student performance remains a significant challenge in the educational sector. Identifying students who need additional support early can significantly improve their academic outcomes. This study aims to develop an intelligent solution for predicting student performance using supervised machine learning algorithms. The proposed approach focuses on addressing the limitations of existing prediction models and enhancing prediction accuracy. Three supervised machine learning algorithms were employed: Random Forest, Extra Trees, and K-Nearest Neighbors. The research methodology comprised five steps: data collection, preprocessing, feature identification, model construction, and evaluation. The study used a dataset of 24,000 training instances and 6,000 testing instances and applied several preprocessing techniques to prepare the data. The Extra Trees algorithm achieved the highest accuracy (98.15%), followed by Random Forest (94.03%) and K-Nearest Neighbors (91.65%). All algorithms demonstrated high precision and recall. Notably, K-Nearest Neighbors was the most computationally efficient to train, with a reported training time of 0.00 seconds. This study proposed an efficient model for predicting student performance. The high accuracy and efficiency of the proposed system highlight its potential for application in educational data mining. The findings can help improve student success rates in educational institutions by enabling timely and appropriate interventions.
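The near-zero training time reported for K-Nearest Neighbors is consistent with how the algorithm generally works: it is a lazy learner, so fitting mostly amounts to storing the training data, and the distance computations happen at prediction time. The following is a minimal sketch of that behavior, not the authors' implementation. The `TinyKNN` class, the `make_data` helper, the two synthetic features, and the value of k are all hypothetical; only the 80/20 train/test ratio mirrors the 24,000/6,000 split described above.

```python
import math
import random
import time
from collections import Counter


class TinyKNN:
    """Illustrative k-nearest-neighbours classifier (hypothetical, not the paper's code)."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # "Training" only stores the data; no parameters are learned here.
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [self._predict_one(x) for x in X]

    def _predict_one(self, x):
        # All distance computation is deferred to prediction time.
        pairs = zip(self.X, self.y)
        nearest = sorted(pairs, key=lambda p: math.dist(p[0], x))[: self.k]
        # Majority vote among the k closest training points.
        return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_data(n, seed=0):
    """Synthetic two-class data with two made-up features (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 0.0 if label == 0 else 4.0
        X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
        y.append(label)
    return X, y


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


X, y = make_data(500)
split = int(0.8 * len(X))  # same 80/20 ratio as 24,000 / 6,000
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

start = time.perf_counter()
model = TinyKNN(k=5).fit(X_train, y_train)
fit_seconds = time.perf_counter() - start

start = time.perf_counter()
y_pred = model.predict(X_test)
predict_seconds = time.perf_counter() - start

acc = accuracy(y_test, y_pred)
print(f"fit: {fit_seconds:.6f}s  predict: {predict_seconds:.6f}s  accuracy: {acc:.3f}")
```

In a sketch like this, the fit time rounds to zero while prediction carries the cost, so training time alone is an incomplete measure of KNN's computational efficiency; prediction time on the test set is worth reporting alongside it.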
Received 24/09/2023
Revised 26/01/2024
Accepted 28/01/2024
Published Online First 20/05/2024
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.