Original scientific article

ADVANCING DIABETES PREDICTION THROUGH MACHINE LEARNING AND DEEP LEARNING MODELS USING PIMA INDIAN AND CLINICAL-BIOLOGICAL DATA

By
Zeeshan Hussain, Jamia Hamdard University, India
Suraiya Parveen, Jamia Hamdard University, India
Ashif Khan, Jamia Hamdard University, India
Ihtiram Raza, Jamia Hamdard University, India
Umnah, Jamia Millia Islamia, India

Abstract

Diabetes Mellitus is a significant global health problem, and early detection is of paramount importance because it reduces complications and enables timely medical intervention. This paper compares the predictive performance of eight Machine Learning classifiers: Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, Gradient Boosting, Naive Bayes, k-Nearest Neighbors (k-NN), and an Ensemble model, on the Pima Indian Diabetes dataset and a collection of clinical-biological patient records. Performance was evaluated using Precision, Recall, F1-Score, and the Area Under the ROC Curve (AUC-ROC). The findings show significant differences among the models. Among the individual classifiers, SVM (AUC-ROC: 0.8648) and Logistic Regression (AUC-ROC: 0.8638) had the best discriminative ability. The comparative analysis showed that Logistic Regression had the highest Precision (0.7632), indicating fewer false-positive predictions, whereas Decision Tree had the highest Recall (0.7447), indicating greater sensitivity in detecting diabetes cases. The Ensemble model produced the best overall performance (AUC-ROC: 0.8709), suggesting that combining predictions from multiple models increases reliability and generalization. In contrast, k-NN performed worst, which is attributed to its sensitivity to noise and to the number of features. Overall, the results provide evidence of the high potential of linear-margin and ensemble-based models for structured clinical data. Such models could serve as a robust foundation for clinical decision support systems and help broaden the role of ML-based analytics in early diabetes diagnosis and preventive health care planning.
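To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python of how Precision, Recall, F1-Score, and AUC-ROC can be computed for a binary classifier, together with an unweighted average of model probabilities as one simple way to combine models. The function names (`precision_recall_f1`, `auc_roc`, `soft_vote`) and the toy values are hypothetical illustrations. The abstract does not say how the Ensemble model combines its members, so the averaging shown here is an assumption and not the authors' method.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def soft_vote(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted average of per-sample probabilities across models.
    Assumption: this is one common ensembling scheme, not necessarily the paper's."""
    return [sum(column) / len(column) for column in zip(*model_scores)]


if __name__ == "__main__":
    # Toy labels and probabilities, invented purely for illustration.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    model_a = [0.9, 0.4, 0.6, 0.3, 0.2, 0.6, 0.8, 0.1]
    model_b = [0.7, 0.2, 0.4, 0.6, 0.5, 0.3, 0.9, 0.2]

    ensemble = soft_vote([model_a, model_b])
    y_pred = [1 if s >= 0.5 else 0 for s in ensemble]  # 0.5 threshold is an assumption

    print("precision, recall, f1:", precision_recall_f1(y_true, y_pred))
    print("AUC-ROC (model A): ", auc_roc(y_true, model_a))
    print("AUC-ROC (ensemble):", auc_roc(y_true, ensemble))
```

Precision, Recall, and F1-Score depend on the decision threshold applied to the scores, whereas AUC-ROC is computed from the ranking of the scores alone. This is consistent with the abstract reporting different models as best on Precision, on Recall, and on AUC-ROC.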


This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The statements, opinions and data contained in the journal are solely those of the individual authors and contributors and not of the publisher and the editor(s). We stay neutral with regard to jurisdictional claims in published maps and institutional affiliations.