Diabetes Risk Detection in Pregnant Women Using the Random Forest Algorithm
Case Study: Pima Indian Dataset
DOI:
https://doi.org/10.61132/prosemnasproit.v2i2.62
Keywords:
Decision Support System, Feature Engineering, Gestational Diabetes Mellitus, Random Forest, Risk Prediction
Abstract
Gestational Diabetes Mellitus (GDM) is a pregnancy-related metabolic disorder that poses health risks to both mother and fetus if not detected early, requiring accurate prediction methods for early screening and clinical decision-making. This study applies the Random Forest algorithm to detect GDM risk using clinical data from the Pima Indian Dataset. Data preprocessing included handling missing values, standardization, feature engineering, and a 70:30 train–test split. Two models were developed: a baseline and an optimized model using GridSearchCV hyperparameter tuning, validated with 5-fold cross-validation. Performance was assessed using a classification report, confusion matrix, and ROC–AUC. Results show that the optimized model outperforms the baseline, achieving 88% accuracy, an AUC of 93%, and average recall of 81%–85%. Compared to previous studies, this approach demonstrates improved predictive performance. The findings indicate that combining Random Forest with comprehensive preprocessing, feature engineering, and model optimization is effective and feasible for developing a medical decision support system for early GDM risk screening.
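The workflow summarized in the abstract (missing-value handling, standardization, a 70:30 split, a baseline model, and a GridSearchCV-tuned model with 5-fold cross-validation) can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the authors' code. It makes several assumptions: the data is a synthetic stand-in generated with `make_classification` rather than the Pima Indian Dataset, missing values are filled by median imputation, the split is stratified, and the parameter grid and the `roc_auc` tuning metric are hypothetical choices. The helper name `make_rf_pipeline` is also made up for this sketch, and the feature engineering step is omitted because the abstract does not describe it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical data (assumption: the real dataset is not loaded).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           weights=[0.65, 0.35], random_state=42)

# Blank out about 5% of the values to mimic missing measurements (illustrative only).
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.05] = np.nan

# 70:30 train-test split; stratification is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)


def make_rf_pipeline(**rf_kwargs):
    """Hypothetical helper: impute, standardize, then fit a Random Forest."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed imputation strategy
        ("scale", StandardScaler()),
        ("rf", RandomForestClassifier(random_state=42, **rf_kwargs)),
    ])


# Baseline model with default hyperparameters.
baseline = make_rf_pipeline().fit(X_train, y_train)

# Tuned model: grid search with 5-fold cross-validation on the training set.
# The grid values below are hypothetical, not the ones used in the study.
param_grid = {
    "rf__n_estimators": [50, 100],
    "rf__max_depth": [None, 5],
    "rf__min_samples_leaf": [1, 3],
}
search = GridSearchCV(make_rf_pipeline(), param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate both models with the three tools named in the abstract.
results = {}
for name, model in [("baseline", baseline), ("tuned", search.best_estimator_)]:
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_prob),
    }
    print(name)
    print(classification_report(y_test, y_pred))
    print(results[name]["confusion_matrix"])
    print("ROC-AUC:", round(results[name]["auc"], 3))
```

Wrapping imputation and scaling inside the pipeline means both are refit on each cross-validation fold, so no information from the held-out fold leaks into preprocessing. Scores obtained on the synthetic data say nothing about the figures reported in the abstract.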
License
Copyright (c) 2025 Prosiding Seminar Nasional Ilmu Teknik

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





