Predicting emergency department hospital admissions with machine learning

Jun 15, 2021

A short article review

Introduction

As you might know from your own experience, or from the experiences of the people around you, medical assessments in the emergency department can take a painfully long time. This post is a short review of the paper “Machine learning for developing a prediction model of hospital admission of emergency department patients: Hype or hope?”, scheduled for publication in August 2021 in the International Journal of Medical Informatics. The authors used logistic regression, random forests, gradient-boosted decision trees (XGBoost) and deep neural networks to predict whether patients would be hospitalized after visiting the emergency department. A quicker decision to admit a patient can lead to higher patient satisfaction and, according to the authors, could help junior doctors outside working hours and support decision making in clinical assessment.

The research question

Can machine learning be used to predict whether an emergency department patient will be admitted to the hospital?

Method

Data acquisition
The study used the NEED (Netherlands Emergency Department Evaluation Database) dataset. NEED is a non-profit institute that collects data from emergency departments in the Netherlands, currently 12 of them. The authors of this paper used the Minimal Data Set for their study, which consists of data from 3 hospitals and contains records of 172,104 emergency department patients in total.

Data set
A long list of independent variables (e.g. referral type, complaints, disease severity based on vital signs) was assessed at three time points: 15 min, 30 min and 2 h after arrival. Because more time in the emergency department yields more data, the amount of available data differs between the timeframes.

Splitting of test and training data set
The data was split into a training set (2/3 of the data) and a test set (1/3 of the data). A “leave one group out” cross-validation was conducted, in which each individual emergency department was left out in turn, to compensate for the differences in characteristics between the departments.
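To make the splitting scheme concrete, here is a minimal sketch in plain Python. It assumes a random 2/3 versus 1/3 split and uses made-up department labels "A", "B" and "C"; the function names are my own, and the review does not have access to the authors' actual code.

```python
import random


def train_test_split_indices(n, test_fraction=1 / 3, seed=0):
    """Randomly assign row indices to a training and a test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def leave_one_group_out(groups):
    """Yield (held_out_group, fit_idx, val_idx) once per distinct group."""
    for held_out in sorted(set(groups)):
        fit_idx = [i for i, g in enumerate(groups) if g != held_out]
        val_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, fit_idx, val_idx


# Hypothetical toy data: 12 visits spread over three departments.
departments = ["A", "B", "C"] * 4
train_idx, test_idx = train_test_split_indices(len(departments))
train_departments = [departments[i] for i in train_idx]

# Within the training set, each department is held out exactly once.
folds = list(leave_one_group_out(train_departments))
for held_out, fit_idx, val_idx in folds:
    print(held_out, len(fit_idx), len(val_idx))
```

In practice the same idea is available as LeaveOneGroupOut in scikit-learn's model_selection module, which is what the next steps would typically build on.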

Hyperparameters
“Hyperparameters are the variables which determines the network structure (Eg: Number of Hidden Units) and the variables which determine how the network is trained (Eg: Learning Rate).” The hyperparameters were tuned using the cross-validation folds of the training data. The models were then retrained on the full training set with the tuned hyperparameters.
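The tuning procedure described here can be sketched with scikit-learn. This is an illustration under assumptions, not the authors' setup: the data is synthetic, the department labels are invented, and the search grid (the regularisation strength C of a logistic regression) is chosen only to keep the example small.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut

# Synthetic stand-in for the training set (not the NEED data).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
groups = np.arange(600) % 3  # hypothetical department label per visit

# Try each candidate value of C, scoring by AUC on the held-out department.
search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=LeaveOneGroupOut(),
)
search.fit(X, y, groups=groups)

print(search.best_params_, round(search.best_score_, 3))

# With the default refit=True, the best setting is refitted on all of X, y,
# which mirrors "trained on the full training set with the tuned hyperparameters".
final_model = search.best_estimator_
```

The same pattern applies to the other model families by swapping the estimator and the parameter grid.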

Testing
Models were tested on the data of each individual emergency department from the original data. The area under the receiver operating characteristic curve (AUC) was used to measure how well the models discriminate between admitted and discharged patients. A random-effects model, a method commonly used in meta-analysis to combine the outcomes of different studies, was used to obtain an “average” (pooled) AUC across the hospitals.
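As an illustration of how per-hospital AUCs can be pooled, the sketch below computes an AUC from ranks, approximates its standard error with the Hanley-McNeil formula, and combines the hospitals with a DerSimonian-Laird random-effects estimate. All three choices are my assumptions (the paper may, for example, pool on a logit scale or use a different variance estimator), and the numbers in the example are invented, not taken from the study.

```python
import math


def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc**2)
        + (n_neg - 1) * (q2 - auc**2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooled estimate and its standard error."""
    v = [se**2 for se in std_errors]
    w = [1 / vi for vi in v]
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)  # between-hospital variance
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * e for wi, e in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star))


# Invented example: (AUC, admitted patients, discharged patients) per hospital.
per_hospital = [(0.84, 300, 700), (0.87, 250, 600), (0.82, 400, 900)]
aucs = [a for a, _, _ in per_hospital]
ses = [hanley_mcneil_se(a, n_pos, n_neg) for a, n_pos, n_neg in per_hospital]
pooled_auc, pooled_se = pool_random_effects(aucs, ses)
print(round(pooled_auc, 3), round(pooled_se, 3))
```

The random-effects weights shrink towards equal weighting when the hospitals disagree more than sampling noise alone would explain, which is why this approach suits departments with different patient populations.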

Potential mean reduction time
To calculate how much time could be saved, the models were configured with a 5% error rate in prediction. This was done by letting the model make the decision only in the case that:
1) The outcome of the model for a patient was higher than the threshold that corresponds to a 95% positive predictive value (the model predicts admission), or
2) The outcome of the model for a patient was lower than the threshold that corresponds to a 95% negative predictive value (the model predicts discharge).
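The two-threshold rule above can be sketched as follows. This is my own minimal reading of the procedure, with hypothetical function names and invented toy scores; how the authors actually derived their thresholds is not described in this post.

```python
def ppv_cutoff(labels, scores, target=0.95):
    """Lowest cut-off t for which patients with score >= t reach the target PPV."""
    best = None
    for t in sorted(set(scores), reverse=True):
        flagged = [l for l, s in zip(labels, scores) if s >= t]
        if sum(flagged) / len(flagged) >= target:
            best = t
    return best


def npv_cutoff(labels, scores, target=0.95):
    """Highest cut-off t for which patients with score <= t reach the target NPV."""
    best = None
    for t in sorted(set(scores)):
        cleared = [l for l, s in zip(labels, scores) if s <= t]
        if cleared.count(0) / len(cleared) >= target:
            best = t
    return best


def decide(score, admit_cutoff, discharge_cutoff):
    """Let the model decide only when it is confident; otherwise defer."""
    if admit_cutoff is not None and score >= admit_cutoff:
        return "admit"
    if discharge_cutoff is not None and score <= discharge_cutoff:
        return "discharge"
    return "physician decides"


# Invented toy data: model outputs and whether the patient was admitted (1) or not (0).
scores = [0.02, 0.05, 0.08, 0.10, 0.15, 0.30, 0.40,
          0.50, 0.60, 0.70, 0.85, 0.90, 0.95, 0.97]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]

admit_cutoff = ppv_cutoff(labels, scores)
discharge_cutoff = npv_cutoff(labels, scores)
for s in (0.90, 0.50, 0.10):
    print(s, decide(s, admit_cutoff, discharge_cutoff))
```

Patients whose score falls between the two cut-offs are left to the physician, so the potential time saving only applies to the share of patients the model is confident about.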

Software
Analysis was done with Python 3.8.0 with R plugins. I have asked the authors for the Python code, which I will analyse in further detail if they are willing to share it.

Results

Below are the results. The pooled AUCs of the different models are similar. They range from 0.80 for the random forest at triage (t=0) to 0.86 for the random forest, XGBoost and deep neural network 30 min after presentation at the emergency department. Interestingly, the extra 1.5 hours of information about a patient (the 2-hour timeframe), during which blood samples are analysed and lab results are published in the patient file, does not lead to a higher prediction accuracy in the machine learning models.

Furthermore, the authors calculated a theoretical reduction in waiting time of 33 min (25%) in the total population of patients with the help of the machine learning models.

Conclusion

Machine learning can help in predicting whether a patient will need to be admitted, potentially improving patient communication about admission and decreasing the length of stay in the emergency department. The big question, however, is whether the decision to admit is the biggest source of delay. As we know from chemistry, the rate-determining step is the slowest step in a reaction mechanism. If a machine learning model can decide within 15 minutes or less whether somebody will be admitted, this does not necessarily lead to a shorter stay. Working in the emergency department will show you many factors that influence the length of stay and that a machine learning system has no control over, such as administrative tasks, medical treatments before discharge, the number of available hospital beds, the waiting time for consultation by a busy medical specialist, emergency department crowding and, most importantly, the number of available doctors and nurses to assess, treat and talk to a patient.

Another equally relevant issue is that the admission decision by a physician cannot be considered the gold standard. It is the eventual diagnosis of a patient, and the severity of the complications arising from that diagnosis, that justifies an admission. These factors can only be assessed after the emergency department visit, in some cases only months later, which makes it extremely complicated to assess the predicted need for hospital admission against a true gold standard (the requirement of in-hospital or out-of-hospital treatment).