Perils of data modelling: The Coronavirus Edition

Oxford coronavirus model falls prey to the pitfalls of modelling in the absence of adequate raw data

“All models are wrong. But some are useful.”

Analysing outbreak scenarios based on rising coronavirus deaths in the UK and Italy, a recent model released by Oxford University estimated that, as of last week, almost 68% of Britain’s population – nearly 45 million people – had been exposed to the coronavirus. This prompted a flurry of headlines in the UK, leading to widespread panic among citizens. The study, however, was flawed.

In the absence of hard data at the time of modelling, the model ran on several assumptions, so its headline figure was merely a demonstration of an extreme-case scenario. In contrast, another simulation showed that only a tiny proportion had been exposed. The real number presumably lies somewhere between these two extremes. With no data to back-test against, it is not yet possible to say exactly what fraction of the public has been exposed to the risk of severe illness, or what the rate of spread is going to be. Currently, any such figure is simply an approximation based on assumptions. The consensus about the Oxford model was that it was simply a hypothesis – one of many possible ones – regarding the spread of the pandemic.
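To see why the same death count can support such different headlines, here is a minimal sketch in Python. It is not the Oxford model. It assumes a deliberately simple relationship (deaths = population × exposed share × share at risk of severe illness × probability of death among those at risk), and every number and name in it is hypothetical, chosen only for illustration.

```python
def implied_exposed_fraction(deaths, population, at_risk_fraction, death_prob_if_at_risk):
    """Back out the exposed share of a population from a death count.

    Simplifying assumption (not taken from any published model):
        deaths = population * exposed * at_risk_fraction * death_prob_if_at_risk
    """
    fatality_per_exposure = at_risk_fraction * death_prob_if_at_risk
    exposed = deaths / (population * fatality_per_exposure)
    # A share cannot exceed the whole population.
    return min(exposed, 1.0)


# Hypothetical inputs: the observed data are identical in both scenarios.
POPULATION = 1_000_000
DEATHS = 100
DEATH_PROB_IF_AT_RISK = 0.2

# Only the unobserved assumption changes between scenarios.
scenarios = {
    "few at risk": 0.001,   # 0.1% of people are vulnerable to severe illness
    "many at risk": 0.1,    # 10% of people are vulnerable to severe illness
}

results = {
    name: implied_exposed_fraction(DEATHS, POPULATION, frac, DEATH_PROB_IF_AT_RISK)
    for name, frac in scenarios.items()
}

for name, exposed in results.items():
    print(f"{name}: {exposed:.1%} of the population implied exposed")
```

With these made-up inputs, the same 100 deaths imply that about half the population has been exposed under the first assumption and about 0.5% under the second. Nothing in the death data alone can distinguish between the two, which is why the assumption, rather than the data, ends up driving the headline.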

The model had its positives too. It emphasised the importance of serological testing for antibodies against the virus instead of a regular nasal or throat swab – something that has since been adopted by several healthcare professionals in the US, Europe and Asia – leading to an improved diagnostic process for coronavirus testing.

Whilst it can be prudent not to accept every conclusion a model produces, it is not advisable to disregard models completely either. As with most other sources of information, they need to be triangulated and used in conjunction with several other data sources where possible, especially when something as significant as public policy depends on the inputs. If the flaws of modelling are taken into account and its shortcomings appreciated, then even extreme-case, assumption-based models can prove very useful.
