- Apr 9, 2020

How do you use data to decide?

Updated: Jun 16, 2020

Imagine that you’re a doctor treating a patient in hospital who has all the symptoms of COVID-19. You test the patient for the virus using a swab, and the test comes back negative. You have two options: believe the test and take off your protective gear (which is causing you to sweat profusely), or don’t believe the test, keep your protective gear on and carry on sweating.

What would you do?

Most people would probably take the sweating option, keeping the gear on and treating the patient as though they have the virus, especially when they know that the test for COVID-19 can produce false negatives (the test says no when it should have said yes). When the nurse then tells you that other members of the patient’s family have tested positive, you will most likely be very certain that this test is a false negative. Plus, erring on the side of caution has the added benefit of protecting those around you.
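The reasoning in this scenario can be written down with Bayes' theorem. The sketch below is only an illustration: the sensitivity, specificity and starting beliefs are made-up numbers chosen for the example, not measured COVID-19 test performance, and the function name is hypothetical.

```python
def prob_infected_given_negative(prior, sensitivity, specificity):
    """Probability the patient is infected despite a negative test (Bayes' theorem)."""
    # Infected patients the test misses (false negatives).
    false_negative = prior * (1.0 - sensitivity)
    # Uninfected patients the test correctly clears (true negatives).
    true_negative = (1.0 - prior) * specificity
    return false_negative / (false_negative + true_negative)


# Illustrative values only, not real test characteristics.
SENSITIVITY = 0.70
SPECIFICITY = 0.95

# Patient has all the symptoms: assume the doctor starts out 80% sure.
after_symptoms = prob_infected_given_negative(0.80, SENSITIVITY, SPECIFICITY)
# Family members tested positive: assume the starting belief rises to 95%.
after_family = prob_infected_given_negative(0.95, SENSITIVITY, SPECIFICITY)

print(f"Symptoms only:       {after_symptoms:.2f}")
print(f"Plus family history: {after_family:.2f}")
```

With these assumed inputs, a negative result still leaves the chance of infection at about 0.56, and at about 0.86 once the family history is taken into account, which is why keeping the gear on is the sensible choice.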

All experimental data involves chance, so we rely on large sample sizes to try to reduce the effect that a random fluke could have on a statistic. This works well for clinical trial results at the end of the trial, when there are lots of patients. However, when we want to monitor how the trial is progressing by looking at key indicators and comparing sites, what steps can we take to make sure that we aren’t endangering the study, either by failing to intervene or by intervening too much?

Here at TRI we give the option to use different confidence intervals that can account for small numbers of events and small data samples. This reduces the probability of a false positive. At a later stage, when there is a larger data sample, we can switch to confidence intervals that reduce the chance of producing a false negative.
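To see why the choice of interval matters with small numbers, here is a minimal sketch comparing two textbook intervals for a proportion. The post does not say which intervals TRI offers, so the Wilson score interval is used here purely as an assumed example of a small-sample-friendly choice, next to the simple normal-approximation (Wald) interval.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(events, n, confidence=0.95):
    """Normal-approximation interval; unreliable when events or n are small."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = events / n
    half_width = z * sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def wilson_interval(events, n, confidence=0.95):
    """Wilson score interval; stays sensible with few events or few patients."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical site early in a trial: 0 adverse events among 10 patients.
print("Wald:  ", wald_interval(0, 10))
print("Wilson:", wilson_interval(0, 10))
```

For a site with 0 events in 10 patients, the Wald interval collapses to a single point at zero, so any comparison against it would look like a real difference. The Wilson interval instead stretches up to roughly 0.28, reflecting how little ten patients can tell us and making a spurious flag less likely.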

Using tables and heat-maps, people can see where the data is coming from, which gives them context. Context is what proved so useful in rejecting the false negative test at the start of this piece. For example, if there was a trial on eczema and all the regions that were going into winter reported an increase in eczema symptoms, we could understand that it’s not a site issue but most likely a change in climate causing this effect across all those sites. By using filters within the graphs and tables, we can test these hypotheses and see how the data are grouped, given what we know about the disease.

Thirdly, when viewing outcomes that are linked to the type of patient at the site, and not just to how the site is performing, we can adjust for variables that we know are likely to cause the event. Let’s break that down. In a study we might have lots of sites. Some of these sites might be in the centre of a city where lots of young people live, and others might have an older population. We would assume that there will be more adverse events in an older population than in a younger one. By testing this and then adjusting for it in the model, we can see that sites we thought were higher than they should be actually weren’t, once we have accounted for age within the sites. This enables the viewer to focus only on the sites that show a true difference once we account for age, reducing the burden of work and increasing quality at the same time. Win-win!
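The age example can be sketched with a few lines of code. Note that this uses indirect standardisation (observed events divided by the events expected from each site's age mix), which is a simpler stand-in for the model-based adjustment described above, not TRI's actual method. The site names, age bands and counts are all hypothetical.

```python
# Hypothetical data: site -> age band -> (patients, adverse events).
SITES = {
    "city_centre": {"under_65": (80, 4), "65_plus": (20, 4)},
    "suburban": {"under_65": (20, 1), "65_plus": (80, 16)},
}


def crude_rates(sites):
    """Unadjusted adverse-event rate per site."""
    rates = {}
    for site, bands in sites.items():
        patients = sum(n for n, _ in bands.values())
        events = sum(e for _, e in bands.values())
        rates[site] = events / patients
    return rates


def band_rates(sites):
    """Adverse-event rate per age band, pooled across all sites."""
    totals = {}
    for bands in sites.values():
        for band, (patients, events) in bands.items():
            pooled_patients, pooled_events = totals.get(band, (0, 0))
            totals[band] = (pooled_patients + patients, pooled_events + events)
    return {band: e / n for band, (n, e) in totals.items()}


def observed_over_expected(sites):
    """Observed events divided by events expected from the site's age mix."""
    rates = band_rates(sites)
    ratios = {}
    for site, bands in sites.items():
        observed = sum(e for _, e in bands.values())
        expected = sum(n * rates[band] for band, (n, _) in bands.items())
        ratios[site] = observed / expected
    return ratios


print("Crude rates:      ", crude_rates(SITES))
print("Observed/expected:", observed_over_expected(SITES))
```

In this made-up data the suburban site has roughly twice the crude adverse-event rate of the city-centre site (0.17 against 0.08), yet both sites come out with an observed-to-expected ratio of about 1.0. The apparent difference is explained entirely by the older population, so neither site would need to be flagged.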

Just like you did earlier, when you are given the right context you can easily weigh up the information and decide what the best course of action is.

How do you use data to decide?
