Case Fatality Ratio Estimates for the 2013-2016 West African Ebola Epidemic: Application of Boosted Regression Trees for Imputation
Background: The 2013-2016 West African Ebola epidemic has been the largest to date, with more than 11,000 deaths in the affected countries. The data collected have provided more insight than ever before into the case fatality ratio (CFR) and how it varies with age and other characteristics. However, the accuracy and precision of naïve CFR estimates remain limited because 44% of survival outcomes were unreported.
Methods: Using a Boosted Regression Tree (BRT) model, we imputed unreported survival outcomes (i.e. recovery or death) to improve estimates of CFR. The method also allowed us to identify and explore the relevant clinical and demographic predictors of CFR.
Findings: The average out-of-sample performance of our model was good: sensitivity=64·3 (95% CI 53·8-73·0), specificity=64·3 (95% CI 53·8-73·0), percentage correctly classified=64·2 (95% CI 54·4-73·1) and area under the ROC curve=70·2 (95% CI 59·3-78·1). CFR estimates obtained with imputation for the 2013-2016 West African epidemic were 66·5% (95% CI 61·8%-71·1%) overall and 68·9% (95% CI 62·1%-74·5%), 65·7% (95% CI 61·4%-69·5%) and 61·4% (95% CI 55·9%-67·3%) for Sierra Leone, Guinea, and Liberia, respectively. We found that age, reporting delay, country, and fever explained 56·7% of the variance in observed CFR.
Interpretation: We achieved appreciable out-of-sample performance, and CFR estimates obtained with imputation improved on those obtained without imputation. These robust baseline CFR estimates will inform public health contingency planning for future Ebola epidemics, help allocate resources better, and support evaluation of the effectiveness of future interventions.
Funding Information: Commonwealth Scholarship Commission, UK MRC Centre funding, National Institute of Health Research, and Imperial College Junior Research Fellowship.
Declaration of Interest: We declare no competing interests.
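Illustrative sketch: The difference between the naïve CFR and the CFR with imputation described in the Methods can be illustrated with a short Python example. This is a minimal sketch, not the authors' implementation. It assumes that some classifier (a BRT in this study) has already produced a predicted probability of death for every case, and that an unreported outcome is imputed as a death when that probability reaches a threshold of 0·5. The function names, the threshold rule, and the toy data are all hypothetical.

```python
from typing import Optional, Sequence


def naive_cfr(outcomes: Sequence[Optional[int]]) -> float:
    """CFR among cases with a reported outcome (1 = death, 0 = recovery).

    Cases with an unreported outcome (None) are simply dropped.
    """
    known = [o for o in outcomes if o is not None]
    if not known:
        raise ValueError("no reported outcomes")
    return sum(known) / len(known)


def imputed_cfr(
    outcomes: Sequence[Optional[int]],
    predicted_death_prob: Sequence[float],
    threshold: float = 0.5,  # assumed imputation rule, not taken from the study
) -> float:
    """CFR over all cases, filling unreported outcomes from model predictions."""
    if len(outcomes) != len(predicted_death_prob):
        raise ValueError("outcomes and predictions must have equal length")
    if not outcomes:
        raise ValueError("no cases")
    deaths = 0
    for outcome, prob in zip(outcomes, predicted_death_prob):
        if outcome is None:
            # Unreported outcome: impute death when the prediction reaches the threshold.
            deaths += 1 if prob >= threshold else 0
        else:
            # Reported outcome: use it as recorded.
            deaths += outcome
    return deaths / len(outcomes)


# Toy data (hypothetical): None marks an unreported survival outcome.
outcomes = [1, 0, None, 1, None, 0, 1, None]
predicted = [0.9, 0.2, 0.7, 0.8, 0.4, 0.1, 0.6, 0.55]

naive = naive_cfr(outcomes)                 # 3 deaths among 5 reported outcomes
imputed = imputed_cfr(outcomes, predicted)  # 5 deaths among all 8 cases
print(f"naive CFR = {naive:.3f}, CFR with imputation = {imputed:.3f}")
```

The naïve estimate uses only the five reported outcomes, whereas the imputed estimate uses all eight cases, which is why imputation can shift the estimate and narrow its uncertainty when many outcomes are unreported.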