December was the worst month so far in Massachusetts during the Covid-19 pandemic. Test-confirmed daily cases averaged roughly 4,500, according to the DPH dashboard. January and February will likely be worse, as the infectious effects of holiday gatherings show themselves and the more highly transmissible Covid-19 mutation shows up.
Gathering indoors is the major risk
The major Covid-19 risk remains gathering indoors for extended times (anything over perhaps 20 minutes) in relatively small spaces, like homes or restaurants, where there is no realistic chance of providing virus-free air and effective distancing.
Also, while personal cloth masks do cut down transmission and reception of virus, they do not provide the robust protection of N95 respirator masks needed for high-risk exposure. N95s are worn in hospitals and in our office during all close patient contact (and face shields to boot).
Can I measure risk in my small group?
What you really are concerned about is the likelihood of an infectious but asymptomatic person being in one of your indoor gatherings, thereby potentially infecting many if not all present. It only takes one, as we have all seen.
We can in fact quantify that risk reasonably well. In brief, we first determine the incidence of asymptomatic but actively infectious people in the larger population. We then calculate the number of infectious days generated by each new case, which will allow us to calculate the prevalence of infectious cases. Then we use the binomial distribution to calculate the likelihood of at least one such infected person being in a group of given size (like the 15-person family holiday dinner).
Determining the active disease incidence
Incidence is the total new daily (or weekly or annual) infectious cases divided by the population. We first look at the number of Covid-19 cases reported by DPH and then derive the total cases including the asymptomatic ones.
DPH-reported cases are based on positive PCR or antigen tests performed overwhelmingly on symptomatic people. However, several reputable international studies earlier this year, after extensive population testing, showed that about 40% of all Covid-19 infections are asymptomatic, while 60% of infected people have symptoms. This is what makes the spread of the disease so sneaky.
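With this knowledge, we can readily estimate the total incidence of infections. If symptomatic cases are about 60% of all infections and asymptomatic ones about 40%, there are roughly 40/60, or 0.67, asymptomatic infections for every symptomatic one. So we add 67% to the newly documented symptomatic cases DPH reports.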
How do we estimate infectiousness?
We do have a pretty good handle on when people are highly infectious.
For people who get Covid-19 symptoms, the 2 days before symptom onset and the day of symptom onset are highly infectious, while people are unaware or just becoming aware they might be ill. So each new DPH-reported symptomatic case generates 3 days of infectious exposure to others before the person either goes into quarantine or is detected by symptom screening or a positive test.
For the truly asymptomatic 40%, their effective infectious duration is longer because they are likely never detected and excluded from activities. Studies show that the first five days of the Covid-19 active disease are highly contagious, as well as the 2 days before, making a total of 7 days of high infectivity for each new case of asymptomatic disease.
Estimating the infectious prevalence
If we now add up the total days of infectiousness generated by a single day’s new Covid-19 cases, we get a pretty good estimate of the prevalence of Covid-19 infectiousness in the population, which is the indicator we are interested in.
We can now calculate the prevalence of actively infectious people in the population: (new symptomatic infections × 3) plus (67% of new symptomatic infections × 7), all divided by the Massachusetts population of 6,892,000. This estimate is of course not perfect, since each number cited has some uncertainty (except for our state population), but it is certainly good enough for our purpose of estimating infectious risk in small gatherings.
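Written as a formula, the calculation above looks like this. The symbols C (average daily test-confirmed symptomatic cases) and N (state population) are shorthand introduced here for illustration, and the December figure of 4,500 daily cases is used as a worked example.

```latex
\[
\text{prevalence} \;\approx\; \frac{3\,C + 7\,(0.67\,C)}{N} \;=\; \frac{7.69\,C}{N},
\qquad N = 6{,}892{,}000
\]
\[
C = 4{,}500:\quad \frac{13{,}500 + 21{,}105}{6{,}892{,}000}
= \frac{34{,}605}{6{,}892{,}000} \approx 0.0050
\]
```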
Applying risk calculation to small groups
We now have carefully estimated numbers of infectious but presymptomatic or asymptomatic people in this state in recent months. What we want to know is the likelihood that at least one of them will be silently present in your group gathering in a home or restaurant and put you all at risk of Covid-19.
The binomial distribution (which perhaps a few of you have heard of and even fewer remember specifically) can answer just that question for different group sizes. It takes the probability of a specific event occurring in a general population and derives the likelihood that same event will be present in smaller groups. In this case the event is the presence of at least one asymptomatic but infectious individual.
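In symbols, this is a sketch of the calculation, assuming each of the n people in a group independently has the same probability p of being infectious, where p is the population prevalence. The labels n, p, and X (the number of infectious people in the group) are introduced here for illustration.

```latex
\[
P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k},
\qquad
P(X \ge 1) = 1 - P(X = 0) = 1 - (1-p)^{n}
\]
```

The second expression is the one that matters for gatherings: the chance of at least one infectious person is 1 minus the chance that nobody is infectious.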
Actual data from Massachusetts this fall
In the table below, Line 1 shows the average daily positive tests by month. In Line 2 we derive the number of infectious but presymptomatic days for those who tested positive when they were symptomatic (3 each). Line 3 shows the estimated daily number of asymptomatic infections (the positive tests we would expect if these people were tested), derived as 67% of Line 1. Line 4 shows the calculated number of infectious days from the asymptomatic group in Line 3 (7 each). Finally, Line 5 shows the total infectious but symptom-free days generated by an average day's new cases in each month (Line 2 plus Line 4).
Line 6 displays the key asymptomatic infectiousness prevalence data, which is simply Line 5 divided by the current Massachusetts population of 6,892,000.
Finally, Line 7 is the inverse of the prevalence number, for those who find it easier to think, for example, of 1 in 200 rather than 0.0050 (5 thousandths).
Line | Aug ’20 | Sep ’20 | Oct ’20 | Nov ’20 | Dec ’20 |
1. Daily POS tests (with Sx) | 250 | 350 | 1,000 | 2,500 | 4,500 |
2. Infectious days pre-Sx@3 | 750 | 1,050 | 3,000 | 7,500 | 13,500 |
3. Daily POS tests (NO Sx) | 168 | 235 | 670 | 1,675 | 3,015 |
4. Infectious days NO Sx@7 | 1,173 | 1,642 | 4,690 | 11,725 | 21,105 |
5. Total infectious NO Sx days | 1,923 | 2,692 | 7,690 | 19,225 | 34,605 |
6. PREVALENCE infectious NO Sx | 0.0003 | 0.0004 | 0.0011 | 0.0028 | 0.0050 |
7. PREVALENCE one person in: | 1:3585 | 1:2561 | 1:896 | 1:358 | 1:199 |
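The table can be recomputed with a short Python sketch. It assumes the article's inputs (3 infectious days per symptomatic case, 7 per asymptomatic case, a 67% asymptomatic add-on, and a population of 6,892,000). The function and constant names are made up for illustration. Results should agree with Lines 6 and 7 up to rounding.

```python
# Average daily positive (symptomatic) tests by month, from Line 1 of the table.
DAILY_SYMPTOMATIC_CASES = {
    "Aug '20": 250,
    "Sep '20": 350,
    "Oct '20": 1000,
    "Nov '20": 2500,
    "Dec '20": 4500,
}

MA_POPULATION = 6_892_000   # state population used in the article
ASYMPTOMATIC_RATIO = 0.67   # asymptomatic infections per symptomatic case (40/60, rounded)
DAYS_PRESYMPTOMATIC = 3     # infectious days per symptomatic case
DAYS_ASYMPTOMATIC = 7       # infectious days per asymptomatic case


def infectious_prevalence(daily_symptomatic_cases: float) -> float:
    """Estimated fraction of the population that is infectious without symptoms."""
    presymptomatic_days = daily_symptomatic_cases * DAYS_PRESYMPTOMATIC        # Line 2
    asymptomatic_cases = daily_symptomatic_cases * ASYMPTOMATIC_RATIO          # Line 3
    asymptomatic_days = asymptomatic_cases * DAYS_ASYMPTOMATIC                 # Line 4
    return (presymptomatic_days + asymptomatic_days) / MA_POPULATION           # Line 5 / population


if __name__ == "__main__":
    for month, cases in DAILY_SYMPTOMATIC_CASES.items():
        prevalence = infectious_prevalence(cases)
        print(f"{month}: prevalence {prevalence:.4f}, about 1 in {1 / prevalence:,.0f}")
```

To try current numbers, replace the case counts with a recent daily average; the other constants are the article's assumptions and carry the same uncertainty.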
Now to calculate your small group risk
We can at last calculate the likelihood that at least one person who is actively infectious but asymptomatic will be present in groups of different sizes. We use the binomial distribution, a formula that applies population statistics to small groups, with the assumption that infectious individuals are more or less randomly distributed among the people who gather.
I provide the results for two selected months in the following table. I chose December (Row 1), a very high-risk month with a prevalence of 0.0050 (about 1 in 200 people infectious but asymptomatic), to compare with September (Row 2), a low-risk month with a prevalence of 0.0004 (about 1 in 2,561 infectious but asymptomatic). The group size for each column is labeled at the top.
The cells contain the results of using the binomial distribution to calculate the percent probability for at least one person actively infectious in a group of given size for December and for September 2020.
Group Size | 6 | 10 | 15 | 20 | 30 | 50 | 100 |
Dec: 0.0050 | 3.0% | 4.9% | 7.2% | 9.5% | 14.0% | 22.2% | 39.4% |
Sep: 0.0004 | 0.2% | 0.4% | 0.6% | 0.8% | 1.2% | 2.0% | 3.9% |
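The cells in this table can be reproduced with a few lines of Python. This is a minimal sketch, assuming each person in the group is independently infectious with probability equal to the prevalence. The function name is hypothetical.

```python
def prob_at_least_one(prevalence: float, group_size: int) -> float:
    """Probability that a group contains at least one infectious person."""
    # Under independence, P(nobody infectious) = (1 - p)^n.
    return 1.0 - (1.0 - prevalence) ** group_size


GROUP_SIZES = [6, 10, 15, 20, 30, 50, 100]
PREVALENCE = {"Dec": 0.0050, "Sep": 0.0004}  # rounded prevalence values from the article

if __name__ == "__main__":
    for month, p in PREVALENCE.items():
        cells = [f"{100 * prob_at_least_one(p, n):.1f}%" for n in GROUP_SIZES]
        print(f"{month}: {p:.4f} | " + " | ".join(cells))
```

Substituting a different prevalence or group size gives the corresponding risk for other months or gatherings.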
For example, using this results table, in December a group of 10 people randomly gathering had just under a 5% chance of having an actively infectious member. Make that a group of 20 (half the family just from New England getting together) and the risk is 9.5%. By contrast, in September both those risks were under 1% (0.4% and 0.8%).
Did you think the December risks were this high? I suspect not. At even 5% risk levels for indoor events without enormous air circulation, your cloth masks and distancing will not seriously protect you from infection over a 30-minute to 90-minute gathering.
January and February risk will remain high
Keep in mind that this month (January 2021) is likely to be as high risk as December, and February may well be worse. Just watch the daily state case numbers as they evolve.
The big takeaway, again, is that it is not now safe to be indoors in groups other than with immediate household members.
Optimism for second quarter
Thank you for persisting in reading this statistical analysis. I hope it will inform your personal decision-making about safety and provide you strength to get through this winter, which will end. I promise.
Keep your spirits up. By the second quarter we should have completed meaningful numbers of Covid-19 vaccinations and the weather will be better. We can start to be outside, where the inherent infectious risk levels will really drop.
Let us all wish for the Spring, and for an effective vaccine rollout.
Thanks to Prof. David Ortmeyer
Prof. David Ortmeyer teaches statistics in the Dept. of Economics, Bentley University. He was of immense help in reeducating me about the wonderful binomial distribution. He bears no responsibility for any errors or imprecisions in this analysis.
Phenomenal brief. This is great. Have you and David considered posting this on Medium or sending over to the Globe at least?
We considered but didn’t have time to do so. Thanks for the support.
Hi Dr Kanner
Thank you for sharing this important information! Can you share the formula you used (using the binomial distribution) to calculate small-group risk? Trying to calculate risk using current COVID numbers.
Thanks!
David, I used the binomial formula binom.dist in Excel to calculate the probability that no one would be infected, with a "success" defined as an infection. The probability of success is the case rate I calculated from the statewide data, the number of trials is the number of people in the group, and the number of successes is 0. One sets the cumulative argument to “true”, which makes the function return the probability of at most that many infections; with the number set to 0, that is exactly the probability that no one is infected. If that probability is then subtracted from 1, the result is the probability that one or more persons is infected (“at least 1”). I thought that was the most useful formulation and easiest to do. Likelihood of 2 infections in a group of 20 is very small, and wouldn’t really change your risk.
Here is a link for the Excel article explaining the binom.dist function:
https://support.microsoft.com/en-us/office/binom-dist-function-c5ae37b6-f39c-4be2-94c2-509a1480770c?ns=excel&version=90&ui=en-us&rs=en-us&ad=us
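For readers without Excel, the same calculation can be sketched in Python with only the standard library. Going by the description above, the Excel cell would look something like =1-BINOM.DIST(0, 20, 0.005, TRUE); that cell and the function names below are illustrative assumptions, not the original spreadsheet.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k infections among n people, each infectious with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def at_least_one_via_pmf(n: int, p: float) -> float:
    """1 minus the probability of zero infections, mirroring the Excel approach."""
    return 1 - binom_pmf(0, n, p)


if __name__ == "__main__":
    p_december = 0.0050  # December prevalence from the article's table
    print(f"At least one in 20: {at_least_one_via_pmf(20, p_december):.1%}")
    print(f"Exactly two in 20:  {binom_pmf(2, 20, p_december):.2%}")
```

The second line printed gives the chance of exactly two infectious people in a group of 20, which shows how small that case is compared with the "at least 1" figure.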