In late 2019, we had no data about COVID-19. We gradually learned of its emergence as a coronavirus disease through reports of cases from China, and we subsequently obtained increasing volumes of data from around the world.
The immediate questions we wanted to answer were about how the virus might spread throughout a population. We wanted to forecast how many cases to expect over time and to predict what might happen when the government employed specific strategies to control the spread of the virus.
Models can inform decision-making around COVID-19 for many purposes beyond the dynamics of the epidemic itself. For example, socio-behavioural models predict how people may behave in relation to intervention strategies and economic modelling describes the economic consequences of the epidemic. While the principles remain the same, we refer here to models that describe the transmission of the virus.
These pages of this site explain the different methods of modeling: statistical estimation, and spatial and spatio-temporal modeling.
This blog provides an introduction to modeling for COVID-19.
This commentary by Eubank et al. discusses the context and significance of models in contributing to the analysis of COVID-19.
Types of model
We use models to forecast and predict the future. The terms ‘forecast’ and ‘predict’ are often used interchangeably. In modelling, however, ‘forecast’ usually refers to what could happen without intervention, i.e. how the epidemic would progress without any attempt to control it, whereas ‘predict’ refers to what could happen if one or more interventions were implemented.
Epidemiologists draw on a range of methods depending on the specific questions they are trying to answer and the amount of data available. Broadly, models can be classified as:
Mechanistic (or mathematical) models
Mechanistic models predict outcomes on the basis of a theory related to the process at hand. Modellers set up a theoretical framework that represents and quantifies the causal pathways and mechanisms linking determinants and health outcomes, using available data.
For COVID-19, most modellers use the traditional epidemic modelling approach known as SEIR. They define states through which members of the population may transition. They then use differential equations to model the rate at which people may transition between states and hence estimate the population number in each state as time progresses.
Susceptible ⇒ Exposed ⇒ Infected ⇒ Removed
This site demonstrates how such a model works. It adds some states, for example by subdividing the removed state into recovered, hospitalised and dead. It is a simple calculator, but it is helpful for understanding how a model like this works.
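The state transitions described above can be sketched in a few lines of code. The following is a minimal illustration of a basic SEIR model, not the calculator's or any research group's actual implementation. The function name, the forward Euler integration and all parameter values (`beta`, `sigma`, `gamma`, the population size) are assumptions chosen for illustration and are not estimates for COVID-19.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Integrate a basic SEIR model with forward Euler steps.

    beta  : transmission rate per day (illustrative assumption)
    sigma : rate of leaving Exposed, i.e. 1 / latent period (assumption)
    gamma : rate of leaving Infected, i.e. 1 / infectious period (assumption)
    Returns a list of (day, S, E, I, R) tuples, one per whole day.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(0, s, e, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # Flows between states during one small time step
            new_exposed = beta * s * i / population * dt   # S -> E
            new_infected = sigma * e * dt                  # E -> I
            new_removed = gamma * i * dt                   # I -> R
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            r += new_removed
        history.append((day, s, e, i, r))
    return history


# Hypothetical scenario: 1,000,000 people, 10 initially infected
history = simulate_seir(population=1_000_000, initial_infected=10,
                        beta=0.5, sigma=0.2, gamma=0.1, days=200)
peak_day, _, _, peak_infected, _ = max(history, key=lambda row: row[3])
print(f"Infections peak on day {peak_day} at about {peak_infected:,.0f} people")
```

Lowering `beta` from a chosen day onwards is one simple way to represent an intervention that reduces contact between people. Research models typically use more robust numerical solvers and many more states than this sketch.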
Empirical (or statistical) models
Empirical models do not set up any theory but instead predict outcomes by modelling observed data on the outcome itself and any correlated variables. Such models usually adopt some form of regression analysis. The functional form of the relationship between the outcome and other variables is not important so long as it accurately predicts the outcome over time with associated confidence intervals. Such models can be updated as more data become available to improve the prediction.
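As a rough illustration of the empirical approach, the sketch below fits a straight line to the logarithm of daily counts by ordinary least squares and projects it forward. The function names and the case counts are hypothetical, invented for illustration. The interval is a simplification: it uses only the residual scatter and ignores uncertainty in the fitted coefficients, so it is narrower than a proper prediction interval.

```python
import math


def fit_exponential_trend(counts):
    """Least-squares fit of log(count) = intercept + slope * day."""
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(days, logs)]
    # Residual standard error with n - 2 degrees of freedom
    resid_se = math.sqrt(sum(err * err for err in residuals) / (n - 2))
    return slope, intercept, resid_se


def project(slope, intercept, resid_se, day, z=1.96):
    """Return (low, mid, high) for a given day, using a rough interval."""
    log_mid = intercept + slope * day
    return (math.exp(log_mid - z * resid_se),
            math.exp(log_mid),
            math.exp(log_mid + z * resid_se))


# Hypothetical daily case counts, not real data
observed = [12, 15, 21, 26, 35, 44, 58, 73, 96, 121]
slope, intercept, resid_se = fit_exponential_trend(observed)
low, mid, high = project(slope, intercept, resid_se, day=14)
print(f"Day 14 projection: {mid:.0f} cases (rough range {low:.0f} to {high:.0f})")
```

Refitting the line each time a new count arrives is the simplest form of the updating described above.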
Complex simulation models
These models combine the principles of mechanistic and statistical models but introduce more complexity by simulating large-scale situations. One example is the Global Epidemic and Mobility Model (GLEAM), which studies the spatiotemporal spread of COVID-19.
Weather forecasters have long used the ensemble approach to improve their forecasts. Modellers synthesize the forecasts of models that simulate different scenarios. The COVID Forecast Hub Team is using this approach by collecting, standardizing, visualizing and synthesizing forecast data published by over twenty teams forecasting COVID-19 deaths in US states.
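The idea of synthesizing several forecasts can be sketched as follows. This is a minimal illustration that takes the median across models for each forecast date. The median is one common way to combine forecasts and is an assumption here, not a description of the COVID Forecast Hub's method. The model names and numbers are hypothetical.

```python
from statistics import median


def combine_forecasts(forecasts_by_model):
    """Combine per-model forecasts into one ensemble, date by date.

    forecasts_by_model maps a model name to a list of forecast values,
    where position k in every list refers to the same forecast date.
    """
    series = list(forecasts_by_model.values())
    if not series:
        raise ValueError("need at least one model")
    horizon = len(series[0])
    if any(len(s) != horizon for s in series):
        raise ValueError("all models must forecast the same dates")
    combined = []
    for values in zip(*series):
        combined.append({
            "median": median(values),  # ensemble point forecast
            "low": min(values),        # lowest individual forecast
            "high": max(values),       # highest individual forecast
        })
    return combined


# Hypothetical weekly death forecasts from three made-up models
forecasts = {
    "model_a": [100, 140, 190],
    "model_b": [110, 130, 160],
    "model_c": [90, 150, 220],
}
for week, row in enumerate(combine_forecasts(forecasts), start=1):
    print(f"Week {week}: {row['median']} (range {row['low']} to {row['high']})")
```

The spread between the lowest and highest forecast gives a quick sense of how much the individual models disagree.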
All models comprise:
- A number of assumptions, for example about the dynamics of disease transmission or people’s compliance with specific interventions.
- A theory that is translated into mathematical calculations, statistical estimation and/or a computer simulation.
- Input data, which may be raw data such as observed deaths so far, or imputed data about, for example, the rate of transmission of the disease.
- Estimates of output data, such as the number of deaths, with some indication of uncertainty, usually in the form of confidence intervals.
Reporting and using model results
The output data for COVID-19 models are primarily estimates of the number of cases and the number of deaths from COVID-19. Some models also explore the number of health system resources required, for example the number of intensive care beds. Prediction models also estimate the reproductive number (R).
The results of COVID models usually appear as online dashboards that are updated day-by-day. Model output is usually in the form of a graph of the predicted number (or accumulated number) of cases or deaths against time since the start of the epidemic. Since the figures are estimates, they will be accompanied by margins of uncertainty. As the epidemic progresses, the models show the predicted number of cases/deaths as an extension of the actual number of reported cases/deaths. Most models are constantly updated as new data become available. Hence the predictions are not static.
Users should look for accompanying information that explains each of the four components we described above and warns of any limitations of the methods. This information may come in the form of FAQs but there should also be an accompanying scientific paper describing the methodology.
Naturally, different models produce different results depending on their assumptions, input data and methods. Modelers always assess their model predictions against reality and publish their estimates with margins of error. Their results are intended to be indicative within the limitations of their models.
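One simple way to assess predictions against reality is to compare past forecasts with what was later observed. The sketch below computes the mean absolute error of the point forecasts and the share of observations that fell inside the published uncertainty range. The function name and all numbers are hypothetical, and modelling teams may use other scoring methods.

```python
def evaluate_forecasts(observed, predicted, lower, upper):
    """Return (mean absolute error, interval coverage) for past forecasts."""
    n = len(observed)
    if n == 0 or not (n == len(predicted) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Average size of the miss, in the same units as the data
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    # Fraction of observations that landed inside the stated range
    covered = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return mae, covered / n


# Hypothetical weekly deaths: what happened versus what was forecast
observed = [100, 120, 150]
predicted = [90, 125, 160]
lower = [80, 110, 155]
upper = [110, 140, 170]
mae, coverage = evaluate_forecasts(observed, predicted, lower, upper)
print(f"Mean absolute error: {mae:.1f}; interval coverage: {coverage:.0%}")
```

If a model publishes 95% intervals, roughly 95% of later observations should fall inside them. Much lower coverage suggests the margins of error are too narrow.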
Questions about models
“All models are wrong, but some models are helpful, and I think it’s important to remember that.” Betz Halloran, an infectious disease modeler at the Fred Hutchinson Cancer Research Center.
Weather forecasters have taken decades to fine-tune their models to the extent that we believe their forecasts. Modelling for COVID-19 is not there yet. So it is important to ask questions such as:
- Where are the methods published?
- What are the assumptions?
- What are the data sources?
- What is the margin of error around the predictions?
- What are the limitations of the model?
Some critiques of popular models include:
Main international models in use
Many models are currently available. We identify two widely publicized models that include estimates for several countries.
There are many country-specific models. For example, the US Centers for Disease Control and Prevention has gathered the models that forecast deaths in the US and synthesized their results on this page.
Models that forecast or predict for sub-regions of a country are useful to help policymakers target interventions. This makes sense for large areas, for example by US state, but there may not be sufficient data to make reliable forecasts or predictions for small areas.
Imperial College London has developed four planning tools, based on different types of model, to: 1) make short-term forecasts of healthcare demand for countries in the early stage of COVID-19 epidemics; 2) provide daily updated estimates of the number of infections and the impact of non-pharmaceutical interventions for the COVID-19 epidemics in several European countries; 3) estimate the hospital resources required; and 4) make short-term forecasts of COVID-19 deaths in multiple countries.
They explain their methods in a series of reports and on the webpage for each tool.
Their short-term forecasts of COVID-19 deaths are based on a complex SEIR model as explained below:
Institute for Health Metrics and Evaluation (IHME)
The IHME makes projections of numbers of COVID-19 related infections and deaths and the use of hospital resources (all beds, ICU beds, and invasive ventilators). The IHME uses a combination of statistical and mechanistic models. They update their estimates in real time as new data become available.