Embedding randomisation in day-to-day practice
A/B testing is the business world's randomisation and medicine has barely even heard of it
In business, an A/B test is a randomisation between two products. It is a particularly easy thing to do in tech: randomise website visitors to either a blue theme or a red theme and see which one gets the most clicks, attention, or sales. It can be as simple as the font, the size of the font, or the spacing between lines of text. Speak to a Google engineer or read The Lean Startup by Eric Ries and you will see how important A/B testing is. How much can a font size really matter? It genuinely can make a difference to sales and other outcomes, and companies like Google can answer the question breathtakingly quickly given their sample sizes - millions of users every day. They can then optimise iteratively, running a rapid series of A/B tests that amounts to the forced evolution of a webpage towards the optimal product.
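To make the mechanics concrete, here is a toy sketch (not from the original text) of how such a test might be analysed, assuming simulated visitors and a standard two-proportion z-test; the conversion rates and sample sizes are invented for illustration:

```python
import math
import random

def ab_test(conv_a, conv_b, n_per_arm, seed=0):
    """Simulate a simple A/B test: randomise n_per_arm visitors to each
    theme, then compare conversion rates with a two-proportion z-test.
    conv_a and conv_b are the (hypothetical) true conversion rates."""
    rng = random.Random(seed)
    clicks_a = sum(rng.random() < conv_a for _ in range(n_per_arm))
    clicks_b = sum(rng.random() < conv_b for _ in range(n_per_arm))
    p_a, p_b = clicks_a / n_per_arm, clicks_b / n_per_arm
    pooled = (clicks_a + clicks_b) / (2 * n_per_arm)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, p_value

# At Google-like scale, even a tiny true difference becomes detectable
p_a, p_b, p = ab_test(conv_a=0.100, conv_b=0.102, n_per_arm=1_000_000)
print(f"A: {p_a:.4f}  B: {p_b:.4f}  p-value: {p:.4f}")
```

The point of the sketch is the sample size: a 0.2 percentage-point difference that would be invisible in a small study is easily resolved with a million users per arm.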
Despite its incredible power, A/B testing is barely even talked about in medicine. Instead we have “randomisation.” Randomisation is a beautifully simple yet effective concept that answers questions achievable in no other way. It is almost always the only way to know whether treatment A is better than treatment B. In medicine, these tests take the form of randomised controlled trials: clunky, bureaucratic beasts that cost the GDP of a small African country and often take years to complete.
The case for A/B testing in medicine
Medicine is full of similar tiny A/B decisions that we often don’t even think about. However, I believe that addressing some of these could make a massive difference to patient care. I have previously argued that we need to make randomised trials much easier, but the regulatory requirements are prohibitive. If you want to randomise a patient to anything in the NHS, it counts as research, and you need to go through an arduous, time-consuming and downright soul-destroying NHS ethics process. Although this can protect patients from unethical research, it’s a one-size-fits-all bureaucracy that often stifles enthusiasm and prevents progress. Take a well-known intervention for inpatients at risk of nutritional insufficiency: serving those patients’ food on red trays, a process that is supposed to signal to staff that the patient has extra requirements. Was this tested in a randomised manner before it was introduced? Was it hell. No, it was based on a 30-patient audit and combined with more paperwork - a risk assessment. This is an intervention that has been adopted across the NHS, and it is based on a 30-patient audit. Madness! So, you tell me what’s more ethical: just changing something because you hope or assume it will work, or randomising patients (perhaps even without telling them) to a red tray, a normal tray, a blue tray, or a Joseph and the Amazing Technicolor tray… and using their routine clinical records to examine which approach is better. This is more akin to an A/B test than an RCT, is it not?
Want to change something? Great! Crack on. Want to study the effects of that change? Sorry, no that’s research.
Our current system encourages action without randomisation. It encourages leaders to simply implement interventions based on assumption rather than data, because doing the latter counts as research - and that generates a lot of headaches. Furthermore, it is arrogant. Rarely do people stop and think: “What could be the unintended consequences of this extra form?”
Another great example of this, as alluded to earlier, is risk assessments. The Waterlow score is administered ubiquitously in the NHS to identify patients at risk of pressure damage. However, despite widespread adoption, when it was finally tested in a randomised manner against nurses’ judgement, it made no difference. Has this trial changed practice? Nope! I despair!
Healthcare needs a serious rethink of how we conduct change. Progress needs to be embedded in clinical care and not just thought of as research. We need to rethink what we consider ethical based on reason, not ideology, and be able to perform A/B tests without clunky trials machinery. Of course there needs to be some oversight, but the current system is not fit for purpose. The artificial intelligence revolution is providing us with ways to use routinely gathered clinical data for our study outcomes, without dedicated prospective collection or manually trawling notes, both of which are time-consuming and expensive. Natural language processing can process free-text data and code it into quantitative measures - it can, for example, look for an outcome of pressure ulcers by simply processing the written clinical notes.
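As a deliberately crude illustration of the idea (real clinical NLP would use a trained model with negation and context handling, not keyword matching), outcome extraction from free text might look something like this; the patterns and example notes are invented:

```python
import re

# Illustrative patterns only -- a production system would need a proper
# clinical NLP model handling negation ("no pressure ulcer"), context, etc.
PRESSURE_ULCER_PATTERNS = [
    r"pressure\s+(ulcer|sore|injury|damage)",
    r"\bbed\s*sore",
    r"grade\s+[1-4]\s+ulcer",
]

def note_mentions_pressure_ulcer(note: str) -> bool:
    """Flag a free-text clinical note that mentions a pressure ulcer."""
    text = note.lower()
    return any(re.search(pattern, text) for pattern in PRESSURE_ULCER_PATTERNS)

notes = [
    "Sacral pressure ulcer noted on examination, grade 2.",
    "Skin intact, no concerns. Mobilising well.",
]
print([note_mentions_pressure_ulcer(n) for n in notes])  # [True, False]
```

Even this toy version shows the shape of the pipeline: routinely written notes go in, a quantitative outcome measure comes out, with no extra data-collection burden on clinicians.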
We can build randomisation into day-to-day practice
Imagine how much progress we could make if we built all this randomisation into clinical practice so that we didn’t even have to think about it on a day-to-day basis. For many of us, when we are busy on the ward or in clinic, research is the last thing on our minds, and it is often seen as extra, low-priority work. But imagine incorporating A/B testing into a clinical care system such as electronic prescribing. You’re an acute medic wondering what the difference is between the low molecular weight heparins tinzaparin and dalteparin. No-one knows, and it is assumed that they are equally effective. You type in dalteparin, the system asks you to randomise to one or the other, you click “yes” and the patient enters the A/B test. The clinical system collects outcomes like recurrent thrombosis, bleeding and mortality (perhaps linked in with GP records) and life is easy. The system could even automate an email to the patient explaining what’s going on and asking them to complete some quality of life information. Sure, the response rate might be low, but it will be balanced across the two groups, and the lean nature of the A/B test means it can be performed at scale, offering the ability to pick up even small safety or benefit signals. Better still, there is no real limit to the number of tests that can run at once: a single hospital or GP practice could have hundreds of A/B tests that clinicians barely even need to think about. There is no trials paperwork and no clunky consent procedure. Is this ethical? I believe it is. When the two A/B interventions of interest would both be considered standard practice and there is genuine equipoise, it is totally ethical. The patient should of course be informed, but a full-on trials consent is not needed. In some A/B tests, does the patient even need to know? I don’t think we should be hiding things, but does Doris really need to know why her food is on a red tray and Martha’s is on a blue one?
Do we explain this to patients anyway? I doubt it. Of course, it can be explained if they ask.
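A minimal sketch of how the prescribing system above might allocate patients, assuming a hypothetical pair of drugs under test and hash-based allocation (so the same patient always lands in the same arm); the drug names come from the tinzaparin/dalteparin example, everything else is invented:

```python
import hashlib

# Hypothetical registry of active A/B pairs: prescribing either drug in a
# pair triggers an offer to randomise. Equipoise between the two is assumed.
AB_PAIRS = {
    "dalteparin": ("dalteparin", "tinzaparin"),
    "tinzaparin": ("dalteparin", "tinzaparin"),
}

def allocate(patient_id: str, drug: str) -> str:
    """Return the drug to prescribe. Hashing the patient ID together with
    the test name gives a stable, reproducible 50:50 allocation."""
    pair = AB_PAIRS.get(drug)
    if pair is None:
        return drug  # drug not under test; prescribe exactly as typed
    digest = hashlib.sha256(f"lmwh-test:{patient_id}".encode()).digest()
    return pair[digest[0] % 2]

print(allocate("NHS1234567", "dalteparin"))   # one arm, stable per patient
print(allocate("NHS1234567", "amoxicillin"))  # amoxicillin (not under test)
```

Deterministic hash-based allocation is one common choice for embedded experiments because it needs no stored allocation table: the arm can always be recomputed from the patient identifier.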
Imagine…
Imagine an NHS, or any healthcare system for that matter, with all this embedded. Imagine the output, the progress, the efficiency. We could answer all sorts of questions and slowly optimise care with A/B testing and evidence generation at its heart rather than conjecture. We could use this approach to answer questions about inpatient observation frequency, ward lighting levels, clinic waiting room layouts, the font of appointment letters, generic drug A vs generic drug B, haemoglobin transfusion thresholds of 71 g/L vs 72 g/L, and much else besides.
The technology is here, we just need to rethink what we call randomisation. I would argue for more than one tier of randomisation: simple A/B testing should not need formal ethical approval.
A/B testing is what people talk about, but the IT world has even better variants. See http://stevehanov.ca/blog/index.php?id=132 for an explanation of “bandit testing”, an adaptive approach that works better when the intermediate outcomes are important.
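The idea behind bandit testing can be sketched with the simplest variant, epsilon-greedy: mostly assign each new patient to the arm with the best observed outcome rate so far, but keep exploring the other arm a small fraction of the time. This is a toy simulation (invented outcome rates, not from the linked post, which covers more sophisticated bandits):

```python
import random

def epsilon_greedy(true_rates, n_patients=10_000, eps=0.1, seed=1):
    """Minimal epsilon-greedy bandit: with probability eps explore a random
    arm, otherwise exploit the arm with the best observed success rate.
    Returns how often each arm was used and how many successes it had."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(n_patients):
        if rng.random() < eps or min(pulls) == 0:
            arm = rng.randrange(len(true_rates))  # explore
        else:
            arm = max(range(len(true_rates)),
                      key=lambda a: successes[a] / pulls[a])  # exploit
        pulls[arm] += 1
        successes[arm] += rng.random() < true_rates[arm]
    return pulls, successes

# Arm 1 is (hypothetically) better; the bandit shifts patients towards it
# as evidence accumulates, rather than keeping a fixed 50:50 split.
pulls, successes = epsilon_greedy([0.10, 0.15])
print(pulls)
```

The contrast with a classic A/B test is the point: a bandit reduces the number of patients exposed to the inferior arm while the answer is still emerging, which is exactly why it is attractive when intermediate outcomes matter.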