ADVANCING DIABETES PREDICTION THROUGH MACHINE LEARNING AND DEEP LEARNING MODELS USING PIMA INDIAN AND CLINICAL-BIOLOGICAL DATA

Zeeshan Hussain; Suraiya Parveen; Ashif Khan; Ihtiram Raza; Umnah

doi:10.70102/afts.2025.1833.760

Original scientific article

Published: December 2025

https://doi.org/10.70102/afts.2025.1833.760

Abstract

Diabetes mellitus is a significant global health problem, and early detection is of paramount importance because it reduces complications and enables timely medical intervention. This paper compares the predictive performance of eight machine learning classifiers: Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, Gradient Boosting, Naive Bayes, k-Nearest Neighbors (k-NN), and an Ensemble model, on the Pima Indian Diabetes dataset and a collection of clinical-biological patient records. Performance was evaluated using Precision, Recall, F1-Score, and the Area Under the ROC Curve (AUC-ROC). The findings show significant differences among the models. Among the individual classifiers, SVM (AUC-ROC: 0.8648) and Logistic Regression (AUC-ROC: 0.8638) had the best discriminative ability. Logistic Regression had the highest Precision (0.7632), indicating fewer false-positive predictions, whereas Decision Tree had the highest Recall (0.7447), indicating greater sensitivity in detecting diabetes cases. The Ensemble model produced the best overall performance (AUC-ROC: 0.8709), suggesting that combining predictions from multiple models increases reliability and generalization. In contrast, k-NN performed worst, owing to its sensitivity to noise and to the number of features. Overall, the results indicate the high potential of linear-margin and ensemble-based models for structured clinical data. Such models could form a robust foundation for clinical decision support systems and broaden the role of ML-based analytics in early diabetes diagnosis and preventive health care planning.
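The four evaluation metrics named in the abstract, and the idea of combining model outputs, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the 0.5 decision threshold, the toy labels and scores, and the use of simple score averaging (soft voting) as the ensemble rule. The abstract does not state how the authors built their ensemble, and none of the numbers here come from the paper.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, Recall and F1-Score for the positive class (label 1 = diabetic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def soft_vote(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-patient scores of several models (assumed ensemble rule)."""
    return [sum(col) / len(col) for col in zip(*model_scores)]


# Toy data, invented for illustration only (not taken from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores_a = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6, 0.7, 0.1]    # hypothetical model A
scores_b = [0.6, 0.3, 0.7, 0.4, 0.5, 0.2, 0.8, 0.45]    # hypothetical model B

# Turn model A's scores into hard predictions with an assumed 0.5 threshold.
pred_a = [1 if s >= 0.5 else 0 for s in scores_a]
precision_a, recall_a, f1_a = precision_recall_f1(y_true, pred_a)

auc_a = auc_roc(y_true, scores_a)
auc_b = auc_roc(y_true, scores_b)
auc_ensemble = auc_roc(y_true, soft_vote([scores_a, scores_b]))
```

In this toy case the two hypothetical models make different ranking mistakes, so averaging their scores ranks every diabetic case above every non-diabetic one. That is one plausible mechanism behind the abstract's observation that the Ensemble model had the highest AUC-ROC, although an ensemble is not guaranteed to improve on its members in general.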

Keywords:

diabetes prediction; machine learning; Pima Indian dataset; clinical-biological data; ensemble learning; logistic regression; support vector machine (SVM); AUC-ROC; clinical decision support system

Copyright

This is an open access article distributed under the Creative Commons Attribution Non-Commercial (CC BY-NC) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Issue 33, 2025
