Predictive Solutions Series – Readmissions ML Module

Hospital readmission rates are an important proxy measure for poor-quality inpatient and outpatient care and for adverse care transitions (Axon 2011). Furthermore, as mandated by the Patient Protection and Affordable Care Act and related legislation, CMS reduces payments to hospitals with excess readmissions (42 CFR part 412). For these reasons, hospital systems put significant work into reporting and analysing readmissions within their system.

D&D’s bespoke readmissions module allows trusts to plan ahead rather than react. This is achieved through our in-house machine learning and predictive analytics methods, with the aim of spotting patients who are likely to be readmitted and allocating capacity accordingly.

Our methods provide actionable insights that enable earlier interventions.

Technical details of our ML algorithm

Our Draper & Dash custom readmissions predictor uses a set of predictive algorithms to predict whether a patient will return within 30 days of being discharged.

This is a very complex prediction, using numerous variables, across many different types and characteristics of patients. We have created a customised ensemble ML algorithm that combines the outputs of several models. This type of model has the advantage of combining the strengths of its members to boost accuracy, sensitivity and specificity. To understand more about these metrics, refer to our other blog post on how to interpret confusion matrices.
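As a quick reminder of what these three metrics measure, here is a minimal sketch that computes them from a confusion matrix. It is written in plain Python purely for illustration (our models themselves are built with R packages), and the labels and function names are made-up examples rather than part of the D&D module.

```python
def confusion_counts(actual, predicted):
    """Count confusion-matrix cells for binary labels (1 = readmitted, 0 = not)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # readmitted, flagged
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # not readmitted, not flagged
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarm
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # missed readmission
    return tp, tn, fp, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from the four confusion-matrix cells."""
    return {
        "accuracy": safe_ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_ratio(tp, tp + fn),  # share of real readmissions caught
        "specificity": safe_ratio(tn, tn + fp),  # share of non-readmissions cleared
    }


# Made-up example labels, for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]

counts = confusion_counts(actual, predicted)
metrics = classification_metrics(*counts)
print(counts)
print(metrics)
```

Sensitivity tells us how many true readmissions the model catches, while specificity tells us how many patients who were not readmitted it correctly leaves alone. An ensemble is useful when its members are strong on different metrics.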

Our ensemble ML models

The models used to predict the probability of readmission are Neural Networks (ANN), implemented via the caret package, Naive Bayes from the e1071 package and K-Means Clustering from the cluster package. For the prediction of the number of days from discharge we also use a Random Forest (RF) model, again implemented via the caret package.

But, why do we use these models, and why in an ensemble fashion?

To answer that:

  • The models mentioned take very different approaches to making a prediction, and this difference is very useful in obtaining a balanced and trustworthy output.
  • ANNs are inspired by the study of how our brains work. These systems learn from input-output datasets by building layers of interconnected artificial neurons that transmit signals to neurons in other layers. The neurons pass weighted numerical values to each other until they reach the last layer, which generates the output. In our model, the number of hidden layers is left for the NNET to decide depending on the amount of data available.
  • Naïve Bayes is a simple and fast probabilistic classifier. It assumes that all variables are independent of each other and, using a simple statistical approach, produces the least conservative of the three outputs. It performs less well on specificity, but tends to perform very well on sensitivity.
  • The K-Means clustering algorithm is used in a novel way, as our approach was originally designed to solve another problem: predicting whether a patient will become stranded or super stranded. After bucketing the variables, we use this model to find “groups of neighbours”, that is, patients whose data is distributed in a similar fashion. Once we have obtained the groups (in our case, around 800 of them), we calculate the mean value of the Stranded bit over the historical data in each group, multiply it by 100, and use this number as the probability that a patient within the group will be stranded. When a new patient is admitted to the hospital, we simply find the group that patient belongs to and assign a probability of being stranded based on the historical data of patients within the same group.
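The group-based probability described in the last bullet can be sketched as follows. This is a simplified illustration in plain Python, not our production R code: the two centroids, the two-variable patient records and the function names are all hypothetical, and the clustering step itself is assumed to have already been run.

```python
import math
from collections import defaultdict


def group_probabilities(group_ids, stranded_bits):
    """Mean of the Stranded bit per group, multiplied by 100 (a percentage)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for group, bit in zip(group_ids, stranded_bits):
        totals[group] += bit
        counts[group] += 1
    return {group: 100.0 * totals[group] / counts[group] for group in counts}


def nearest_group(patient, centroids):
    """Assign a patient to the group whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda group: math.dist(patient, centroids[group]))


# Hypothetical output of a clustering run: two groups instead of around 800.
centroids = {0: (1.0, 1.0), 1: (5.0, 5.0)}

# Hypothetical historical data: the group of each past patient and their Stranded bit.
history_groups = [0, 0, 0, 0, 1, 1, 1, 1]
history_stranded = [0, 0, 1, 0, 1, 1, 0, 1]

lookup = group_probabilities(history_groups, history_stranded)

# A new patient, described by the same two bucketed variables (made-up values).
new_patient = (4.5, 5.5)
new_group = nearest_group(new_patient, centroids)
new_probability = lookup[new_group]
print(new_group, new_probability)
```

Because the per-group percentages are computed once from historical data, scoring a new patient is only a nearest-group lookup, which keeps prediction fast even with many groups.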

Making predictions with our custom D&D model

Once we have run the prediction with these three models, we combine their outputs, giving each model a different contribution, to generate a more balanced, accurate and robust prediction.
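One common way to combine model outputs with different contributions is a weighted average of the predicted probabilities. The sketch below, in plain Python, assumes that approach. The weights, the example probabilities and the 0.5 decision threshold are illustrative assumptions, not the values used in the D&D model.

```python
def combine_predictions(probabilities, weights):
    """Weighted average of per-model probabilities (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(probabilities[model] * weights[model] for model in weights)
    return weighted_sum / total_weight


# Hypothetical readmission probabilities for one patient, one per model.
model_probabilities = {"ann": 0.70, "naive_bayes": 0.90, "kmeans": 0.50}

# Hypothetical contributions of each model to the final prediction.
model_weights = {"ann": 0.5, "naive_bayes": 0.2, "kmeans": 0.3}

combined = combine_predictions(model_probabilities, model_weights)

# Assumed decision threshold: flag the patient when the combined value reaches 0.5.
predicted_readmitted = combined >= 0.5
print(round(combined, 2), predicted_readmitted)
```

Weighting lets a model that is strong on sensitivity, such as Naive Bayes, raise the alarm without dominating the models that are more conservative.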

Using the method explained in a previous blog post, we generate the list of the most important variables that led to this prediction.

Once we have the prediction, we run the RF model on the cohort of patients predicted to be readmitted. This gives the likelihood of those patients being readmitted in different time windows, which expands the information given to the trust so that it can act accordingly.
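This post does not go into how the time-window likelihoods are derived from the RF model, so the following is only one possible sketch, in plain Python. It assumes that each tree in the forest predicts a whole number of days to readmission and takes the share of trees falling in each window as the likelihood. The window boundaries, the threshold, the patient records and the per-tree values are all hypothetical.

```python
# Hypothetical time windows in days from discharge, inclusive at both ends.
WINDOWS = [(0, 7), (8, 14), (15, 30)]


def select_cohort(patients, threshold=0.5):
    """Keep only the patients whose combined readmission probability meets the threshold."""
    return [p for p in patients if p["readmission_probability"] >= threshold]


def window_likelihoods(tree_day_predictions, windows=WINDOWS):
    """Share of per-tree day predictions that fall inside each time window."""
    n_trees = len(tree_day_predictions)
    likelihoods = {}
    for start, end in windows:
        hits = sum(1 for days in tree_day_predictions if start <= days <= end)
        likelihoods[f"{start}-{end} days"] = hits / n_trees
    return likelihoods


# Made-up patients with a combined probability and per-tree day predictions.
patients = [
    {"id": "A", "readmission_probability": 0.68,
     "tree_days": [3, 5, 6, 9, 12, 4, 20, 7, 2, 10]},
    {"id": "B", "readmission_probability": 0.21,
     "tree_days": [25, 28, 30, 27, 22, 29, 26, 24, 30, 23]},
]

cohort = select_cohort(patients)
results = {p["id"]: window_likelihoods(p["tree_days"]) for p in cohort}
print(results)
```

Only patient A passes the assumed threshold here, so only patient A receives a window breakdown, mirroring the idea of running the second model on the predicted-readmitted cohort alone.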

With this customised system, we can obtain predictions that are trustworthy and tested, and that give useful information to enable the rapid assessment of who is and is not likely to be readmitted. This allows services and trusts to be proactive in their approach to demand and capacity allocation.

Want to find out more?

If you are interested in identifying patients at risk of readmission sooner, then act now and arrange a demo with our support team. Enquiries to

Alfonso Portabales – Data Scientist and Gary Hutson – Head of AI