Associate Professor, BME Wayne State University Detroit, Michigan, United States
Introduction: Racial disparities in pregnancy-related outcomes remain a critical public health crisis in the United States. Numerous studies have shown that Black mothers consistently experience worse birth outcomes than White mothers. The infant mortality rate among infants born to Black mothers is 10.97 deaths per 1,000 live births, nearly twice the national average of 5.79 [1]. Among the many contributing factors, disparities in delivery method, particularly Cesarean section (C-section), are a focal point of concern. C-sections, while often medically necessary, are associated with greater maternal morbidity, longer recovery times, and increased risks in future pregnancies. Black women are nearly 25% more likely than White women to undergo unnecessary C-sections, even after adjusting for clinical and socioeconomic factors [2]. Geography also contributes to these disparities: state-level infant mortality rates range from 3.66 per 1,000 in Massachusetts to 8.73 per 1,000 in Mississippi [1]. This adds another layer of complexity, as such disparities reflect broader issues of structural racism, provider bias, and state-level policy. Although abundant U.S. natality data are accessible, many data science techniques have not yet been applied to them to better understand this crisis. The objective of this study is to examine the extent and drivers of racial disparities in delivery outcomes.
Materials and Methods: This study uses data from the CDC’s WONDER Natality database (2016–2023). The dataset was aggregated by State of Residence, Mother’s Single Race 6, Source of Payment for Delivery, and Delivery Method Expanded. For each row, the dataset included total births and averages for maternal age, pre-pregnancy BMI, prenatal visit count, birth weight, gestational age (obstetric estimate, OE, and last menstrual period, LMP), and interval since last live birth. Data Processing: The cleaned data, stored as an Excel file, were analyzed in RStudio, where a binary variable (cs) was created to identify Cesarean section deliveries. Births were then grouped into counts of C-sections (successes) and non-C-sections (failures) across combinations of race, insurance type, and state cluster. Clustering: To explore differences across states, k-means clustering was used to group states into four clusters based on their overall C-section rates. These clusters served as categorical predictors in later models, helping capture regional differences in healthcare practice and policy. Modeling: A logistic regression model was used to estimate the likelihood of Cesarean delivery from maternal age, BMI, prenatal visits, payment type, race, and state cluster. To support and visualize the findings, a random forest classifier was used to evaluate prediction accuracy, and a decision tree was used to identify specific thresholds.
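The data processing and clustering steps can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the R code used in the study. The rows, state labels A to F, and the helper names cesarean_counts and kmeans_1d are hypothetical, and the deterministic starting centers are an assumption made so the example is reproducible.

```python
from collections import defaultdict

# Hypothetical toy rows (state, delivery method, births); not real WONDER values.
ROWS = [
    ("A", "Cesarean", 250), ("A", "Vaginal", 750),
    ("B", "Cesarean", 280), ("B", "Vaginal", 720),
    ("C", "Cesarean", 310), ("C", "Vaginal", 690),
    ("D", "Cesarean", 320), ("D", "Vaginal", 680),
    ("E", "Cesarean", 350), ("E", "Vaginal", 650),
    ("F", "Cesarean", 390), ("F", "Vaginal", 610),
]


def cesarean_counts(rows):
    """Sum births into [successes, failures] per state (success = C-section)."""
    counts = defaultdict(lambda: [0, 0])
    for state, method, births in rows:
        cs = 1 if method == "Cesarean" else 0  # binary indicator, like `cs` in the text
        counts[state][0 if cs else 1] += births
    return dict(counts)


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars (k >= 2).

    Assumption: centers start at evenly spaced order statistics rather than
    random points, so the result is deterministic.
    """
    ordered = sorted(values)
    centers = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


counts = cesarean_counts(ROWS)
states = sorted(counts)
rates = [counts[s][0] / sum(counts[s]) for s in states]  # overall C-section rate per state

labels, centers = kmeans_1d(rates, k=4)

# Relabel so that cluster 1 has the lowest mean rate and cluster 4 the highest.
order = sorted(range(len(centers)), key=lambda j: centers[j])
rank = {j: r + 1 for r, j in enumerate(order)}
cluster_of = {s: rank[lab] for s, lab in zip(states, labels)}
```

In this sketch, cluster_of maps each state to a cluster label from 1 to 4 that could be used as a categorical predictor, and counts holds the success and failure totals that a grouped (binomial) logistic regression would take as its response.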
Results, Conclusions, and Discussions: State-Level Differences in Cesarean Rates: The state cluster map (Figure 1) shows that the highest-rate cluster (Cluster 4) includes many states in the Southeast and Mid-Atlantic, which have historically shown higher maternal health risks. In the logistic regression model, cluster membership was a significant predictor of C-section delivery, even after adjusting for maternal age, BMI, prenatal visits, payment type, and race. Key Predictors of Cesarean Delivery: The random forest model achieved an accuracy of 84.6%, indicating strong predictive ability. Variable importance rankings showed that BMI was the most important predictor of C-section, followed by maternal age, prenatal visits, and payment type. Race, while statistically significant, contributed less to the predictive models than clinical factors did. The decision tree (Figure 2) helped visualize these interactions. For instance, women with a BMI above 29 and fewer than 11 prenatal visits had a higher C-section risk. Race appeared in one key split, showing that Black women with high BMI and certain payment types were more likely to have a C-section, consistent with earlier literature on disparities in delivery outcomes.
Conclusion: This study supports existing evidence that clinical factors play a central role in determining delivery method, but it also shows that geographic and systemic factors remain strong predictors even after adjusting for patient characteristics. Although race alone was not always a significant predictor in the regression models, it likely acts through indirect pathways, including access to healthcare, regional policies, and provider-level biases. These factors are partly captured by the state-level cluster variable, which is significantly associated with the probability of Cesarean delivery.
References: [1] Cox, C., Amin, K., Kamal, R., Claxton, G., & Levitt, L. (2020, March 20). How does infant mortality in the U.S. compare to other countries? KFF - Peterson Center on Healthcare. [2] PBS NewsHour. (2020, July 27). Why Black women are more likely to get unnecessary C-sections, risking complications.