A MACHINE LEARNING MODEL FOR PREDICTING PRE-ECLAMPSIA FOR RESOURCE-CONSTRAINED REGIONS
Abstract
Pre-eclampsia is globally recognized by the World Health Organization as a significant
contributor to high rates of morbidity and mortality among infants and mothers. It
accounts for approximately 3% to 5% of all reported pregnancy-related complications
worldwide. However, in developing nations such as Kenya, particularly in sub-Saharan
Africa, the prevalence of pre-eclampsia is notably higher, ranging from 5.6% to 6.5% of
reported pregnancies. Key risk factors associated with pre-eclampsia include a sudden
elevation in blood pressure, increased protein levels in urine, chronic kidney disease,
and the presence of either Type 1 or Type 2 diabetes. This research developed a
predictive model for pre-eclampsia utilizing supervised machine learning techniques on
socio-demographic data gathered from Kilifi County. A total of 500 secondary data
records from Kilifi County Hospital were preprocessed and used to train and test five
supervised machine learning algorithms, namely Logistic Regression, Random Forest,
Naïve Bayes, Linear Discriminant Analysis, and Support Vector Machines. Maternal
age, marital status, gravida, education level, and antenatal care (ANC) attendance were
identified as the optimal features using principal component analysis (PCA). The
Logistic Regression model outperformed the other supervised machine learning models
in the study, achieving an accuracy of 0.96 in predicting pre-eclampsia. The results
show that Logistic Regression can
accurately predict pre-eclampsia within the first trimester of pregnancy. Future research
will involve collecting more data from different regions to improve model performance and
building a mobile application to improve maternal and child health (MCH) accessibility
in resource-constrained regions of the country.
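
The train-and-test procedure summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the 500 records are synthetic stand-ins (the Kilifi data are not reproduced here), the five numeric columns are hypothetical encodings of the selected features, the 80/20 split, learning rate, and epoch count are assumptions, and only logistic regression is shown, fitted by plain gradient descent using the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_records(n, rng):
    # Hypothetical stand-in for the study data: five standardized numeric
    # columns and a binary label produced by an arbitrary linear rule.
    true_w = [1.5, -1.0, 0.5, 0.0, 0.0]  # assumed, for illustration only
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in true_w]
        score = sum(w * v for w, v in zip(true_w, row))
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    # Classify as positive when the predicted probability is at least 0.5.
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)
    return 1 if p >= 0.5 else 0


def accuracy(w, b, X, y):
    # Fraction of records whose predicted label matches the true label.
    correct = sum(predict(w, b, row) == label for row, label in zip(X, y))
    return correct / len(y)


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_synthetic_records(500, rng)

# Assumed 80/20 hold-out split; the records are generated independently,
# so no shuffling is needed before slicing.
split = int(0.8 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

w, b = train_logistic_regression(X_train, y_train)
test_accuracy = accuracy(w, b, X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The accuracy printed by this sketch describes the synthetic data only and has no bearing on the 0.96 reported for the study. Comparing the other four algorithms would follow the same pattern: fit each model on the training split and score it on the same held-out split.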