KNN Vs Naive Bayes: An Innovative Comparison in Predictive AI Learning With Association Data Support

Devi Miftahul Jannah; Aprilianti Nirmala S

doi:10.61220/digitech.v3i1.20251

Authors

Devi Miftahul Jannah, State University of Makassar
Aprilianti Nirmala S, State University of Makassar

Keywords:

Association Rule Mining, KNN, Naive Bayes, Predictive Learning, Artificial Intelligence

Abstract

This study analyzes how Naive Bayes and K-Nearest Neighbor (KNN) predict learning outcomes in artificial intelligence (AI)-based learning. It focuses on the difficulty these algorithms have in handling complex learning data and on the contribution of attribute features derived from Association Rule Mining (ARM) to prediction accuracy. The study uses an exploratory-comparative quantitative research design that applies two classification algorithms (KNN and Naive Bayes) and uses ARM with the Apriori algorithm to uncover hidden patterns among variables. Data were collected through an online survey of 368 students with prior experience using AI technology. The results show that Naive Bayes achieves higher precision, whereas KNN achieves higher recall. ARM improves classification results by detecting hidden correlation patterns that conventional classification methods cannot identify. The discussion emphasizes that the best algorithm depends on the application's objective: whether the priority is the correctness of positive classifications (precision) or the coverage of relevant results (recall). Based on these findings, the study recommends a hybrid technique combining KNN, Naive Bayes, and ARM for building a more efficient and accurate prediction system to support AI-based education.
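The precision and recall trade-off described in the abstract can be illustrated with a minimal Python sketch. The labels and the two sets of predictions below are hypothetical and are not data or results from the study; the function name `precision_recall` and the variables `cautious` and `liberal` are assumptions introduced only for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth: four positive and four negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# A classifier that predicts the positive class sparingly.
cautious = [1, 1, 0, 0, 0, 0, 0, 0]

# A classifier that predicts the positive class generously.
liberal = [1, 1, 1, 1, 1, 1, 0, 0]

print(precision_recall(y_true, cautious))  # (1.0, 0.5)
print(precision_recall(y_true, liberal))
```

In this toy case the cautious predictor makes no false positives but misses half of the positive cases, while the liberal predictor finds every positive case at the cost of two false positives. Which behavior is preferable depends on whether the application values the correctness of positive predictions or the coverage of relevant cases.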
