Calculating the Risk of Asymptomatic But Infectious Person in a Group

January 4, 2021

Example binomial distribution

December was the worst month so far in Massachusetts during the Covid-19 pandemic. Test-confirmed daily cases averaged roughly 4,500, according to the DPH dashboard. January and February will likely be worse, as the infectious effects of holiday gatherings show themselves and the more highly transmissible Covid-19 variant arrives.

Gathering indoors is the major risk

The major Covid-19 risk remains gathering indoors for extended times (anything over perhaps 20 minutes) in relatively small spaces, like homes or restaurants, where there is no realistic chance of providing virus-free air and effective distancing.

Also, while personal cloth masks do cut down transmission and reception of virus, they do not provide the robust protection of N95 respirator masks needed for high-risk exposure. N95s are worn in hospitals and in our office during all close patient contact (and face shields to boot).

Can I measure risk in my small group?

What you really are concerned about is the likelihood of an infectious but asymptomatic person being in one of your indoor gatherings, thereby potentially infecting many if not all present. It only takes one, as we have all seen.

We can in fact quantify that risk reasonably well. In brief, we first determine the incidence of asymptomatic but actively infectious people in the larger population. We then calculate the number of infectious days generated by each new case, which will allow us to calculate the prevalence of infectious cases. Then we use the binomial distribution to calculate the likelihood of at least one such infected person being in a group of given size (like the 15-person family holiday dinner).

Determining the active disease incidence

Incidence is the total new daily (or weekly or annual) infectious cases divided by the population. We first look at the number of Covid-19 cases reported by DPH and then derive the total cases including the asymptomatic ones.

DPH-reported cases are based on positive PCR or antigen tests performed overwhelmingly on symptomatic people. However, several reputable international studies in 2020, after extensive population testing, showed that about 40% of all Covid-19 infections are asymptomatic, while 60% of infected people have symptoms. This is what makes the spread of the disease so sneaky.

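With this knowledge, we can readily estimate the total incidence of infections. Since symptomatic cases are 60% of the total and asymptomatic cases are 40%, there are about 40/60, or 67%, as many asymptomatic infections as symptomatic ones. So we add 67% to the newly documented symptomatic cases DPH reports.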

How do we estimate infectiousness?

We do have a pretty good handle on when people are highly infectious.

For people who get Covid-19 symptoms, the 2 days before symptom onset and the day of symptom onset are highly infectious, while people are unaware or just becoming aware they might be ill. So, each new DPH-reported symptomatic case generates 3 days of infectious exposure to others before the person either goes into quarantine or is detected by symptom screening or a positive test.

For the truly asymptomatic 40%, their effective infectious duration is longer because they are likely never detected and excluded from activities. Studies show that the first five days of the Covid-19 active disease are highly contagious, as well as the 2 days before, making a total of 7 days of high infectivity for each new case of asymptomatic disease.

Estimating the infectious prevalence

If we now add up the total days of infectiousness generated by a single day’s new Covid-19 cases, we get a pretty good estimate of the prevalence of Covid-19 infectiousness in the population, which is the indicator we are interested in.

We can now calculate the prevalence of actively infectious people in the population: (new symptomatic infections x 3, plus 67% of new symptomatic infections x 7), all divided by the Massachusetts population of 6,892,000. This estimate is of course not perfect, since each number cited has some uncertainty (except for our state population), but it is certainly good enough for our purpose of estimating infectious risk in small gatherings.
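
Written out as a formula, the calculation looks like this. The symbols C (average daily symptomatic cases reported by DPH) and N (the state population) are shorthand introduced here for illustration; the 3 and 7 infectious days and the 67% ratio are the figures described above.

```latex
\[
\text{prevalence} \;=\; \frac{3\,C + 7\,(0.67\,C)}{N}
\]

% December example, using the roughly 4,500 average daily cases cited earlier:
\[
\frac{3(4500) + 7(0.67 \times 4500)}{6{,}892{,}000}
  = \frac{13{,}500 + 21{,}105}{6{,}892{,}000}
  \approx 0.0050
\]
```

That is about 5 actively infectious but symptom-free people per 1,000 residents.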

Applying risk calculation to small groups

We now have carefully estimated numbers of infectious but presymptomatic or asymptomatic people in this state in recent months. What we want to know is the likelihood that at least one of them will be silently present in your group gathering in a home or restaurant and put you all at risk of Covid-19.

The binomial distribution (which perhaps a few of you have heard of and even fewer remember specifically) can answer just that question for different group sizes. It takes the probability of a specific event occurring in a general population and derives the likelihood that same event will be present in smaller groups. In this case the event is the presence of at least one asymptomatic but infectious individual.
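
For the "at least one" question, the binomial distribution reduces to a one-line formula. As a sketch, write n for the group size, p for the prevalence of infectious people, and X for the number of infectious people in the group (symbols chosen here for illustration), and assume each person is independently infectious with the same probability p:

```latex
\[
X \sim \mathrm{Binomial}(n, p), \qquad
P(X \ge 1) \;=\; 1 - P(X = 0) \;=\; 1 - (1 - p)^{n}
\]
```

With p = 0.005 and n = 10, for instance, this gives 1 - 0.995^10, or about 4.9%.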

Actual data from Massachusetts this fall

In the table below, Line 1 shows the average daily positive tests by month. In Line 2 we derive the number of infectious but presymptomatic days for those who tested positive when they were symptomatic (3 each). Line 3 shows the estimated number of daily asymptomatic cases, derived as 67% of Line 1. Line 4 shows the calculated number of infectious days from the asymptomatic group in Line 3 (7 each). Line 5 is the sum of Lines 2 and 4: the total infectious days, presymptomatic plus asymptomatic, generated by an average day's new cases in each month.

Line 6 displays the key asymptomatic infectiousness prevalence data, which is simply Line 5 divided by the current Massachusetts population of 6,892,000.

Finally, Line 7 is the inverse of the prevalence number, for those who find it easier to think, for example, of 1 in 200 rather than 0.0050 (5 thousandths).

Line                               Aug ’20   Sep ’20   Oct ’20   Nov ’20   Dec ’20
1. Daily POS tests (with Sx)           250       350     1,000     2,500     4,500
2. Infectious days pre-Sx @ 3          750     1,050     3,000     7,500    13,500
3. Daily POS tests (NO Sx)             168       235       670     1,675     3,015
4. Infectious days NO Sx @ 7         1,173     1,642     4,690    11,725    21,105
5. Total infectious NO Sx days       1,923     2,692     7,690    19,225    34,605
6. PREVALENCE infectious NO Sx      0.0003    0.0004    0.0011    0.0028    0.0050
7. PREVALENCE one person in:        1:3585    1:2561     1:896     1:358     1:199
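
For readers who want to rerun the table with newer case counts, here is a minimal Python sketch of Lines 2 through 7. The function and variable names are made up for this illustration; the 3-day and 7-day multipliers, the 67% ratio, and the population figure are the ones given in the text. Results may differ from the table in the last digit because of rounding.

```python
MA_POPULATION = 6_892_000   # Massachusetts population used in the article
ASYMPTOMATIC_RATIO = 0.67   # asymptomatic cases per reported symptomatic case
PRE_SX_DAYS = 3             # infectious days per symptomatic case
NO_SX_DAYS = 7              # infectious days per asymptomatic case

# Line 1 of the table: average daily positive tests by month
DAILY_POSITIVE_TESTS = {
    "Aug '20": 250,
    "Sep '20": 350,
    "Oct '20": 1000,
    "Nov '20": 2500,
    "Dec '20": 4500,
}


def prevalence_lines(daily_positive_tests, population=MA_POPULATION):
    """Return Lines 2-7 of the table for one month's average daily case count."""
    pre_sx_days = daily_positive_tests * PRE_SX_DAYS         # Line 2
    no_sx_cases = daily_positive_tests * ASYMPTOMATIC_RATIO  # Line 3
    no_sx_days = no_sx_cases * NO_SX_DAYS                    # Line 4
    total_days = pre_sx_days + no_sx_days                    # Line 5
    prevalence = total_days / population                     # Line 6
    one_in = 1 / prevalence                                  # Line 7
    return {
        "pre_sx_days": pre_sx_days,
        "no_sx_cases": no_sx_cases,
        "no_sx_days": no_sx_days,
        "total_days": total_days,
        "prevalence": prevalence,
        "one_in": one_in,
    }


for month, cases in DAILY_POSITIVE_TESTS.items():
    lines = prevalence_lines(cases)
    print(f"{month}: prevalence {lines['prevalence']:.4f}, "
          f"about 1 in {lines['one_in']:.0f}")
```

To use it with current numbers, replace the entries in DAILY_POSITIVE_TESTS with the recent average of daily reported cases, and the population with your own state's.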


Now to calculate your small group risk

We can at last calculate the likelihood that at least one person who is actively infectious but asymptomatic will be present in groups of different sizes. We use the binomial distribution, a formula that applies population statistics to small groups, with the assumption that the events at issue (in this case, at least one asymptomatic infected individual) are more or less randomly distributed.

I provide the results for two selected months in the following table. I chose December (Row 1), a very high-risk month with a prevalence of 0.0050 (about 1 in 200 people is infectious but asymptomatic), to compare with September (Row 2), a low-risk month with a prevalence of 0.0004 (about 1 in 2,561 is infectious but asymptomatic). The Group Size for each column is labeled at the top.

The cells contain the results of using the binomial distribution to calculate the percent probability for at least one person actively infectious in a group of given size for December and for September 2020.

Group Size           6      10      15      20      30      50     100
Dec: 0.0050       3.0%    4.9%    7.2%    9.5%   14.0%   22.2%   39.4%
Sep: 0.0004       0.2%    0.4%    0.6%    0.8%    1.2%    2.0%    3.9%
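
The table can be reproduced in a few lines of Python. This is a sketch rather than the spreadsheet actually used: it relies on the fact that the binomial probability of zero infectious people in a group of n is (1 - p)^n, so the chance of at least one is 1 minus that. The function name is invented for this example, and it assumes every person in the group is independently infectious with the same probability p.

```python
def risk_of_at_least_one(prevalence, group_size):
    """Probability that a randomly drawn group has one or more infectious people."""
    probability_nobody_infectious = (1.0 - prevalence) ** group_size
    return 1.0 - probability_nobody_infectious


GROUP_SIZES = [6, 10, 15, 20, 30, 50, 100]
PREVALENCE = {"Dec": 0.0050, "Sep": 0.0004}  # rounded values from Line 6

print("Group size   " + "  ".join(f"{n:>6}" for n in GROUP_SIZES))
for month, p in PREVALENCE.items():
    row = "  ".join(f"{risk_of_at_least_one(p, n):6.1%}" for n in GROUP_SIZES)
    print(f"{month}: {p:.4f}  {row}")
```

To estimate risk for a different month or place, pass in your own prevalence figure and group size, for example risk_of_at_least_one(0.0028, 12).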

For example, using this results table, in December a group of 10 people randomly gathering had just under a 5% chance of having an actively infectious member. Make that a group of 20 (half the family just from New England getting together) and the risk is 9.5%. By contrast, in September those risks were 0.4% and 0.8%, both under 1%.

Did you think the December risks were this high? I suspect not. At even 5% risk levels for indoor events without enormous air circulation, your cloth masks and distancing will not seriously protect you from infection over a 30-minute to 90-minute gathering.

January and February risk will remain high

Keep in mind that this month (January 2021) is likely to be as high risk as December, and February may well be worse. Just watch the daily state case numbers as they evolve.

The big takeaway, again, is that it is not now safe to be indoors in groups other than with immediate household members.

Optimism for second quarter

Thank you for persisting in reading this statistical analysis. I hope it will inform your personal decision-making about safety and provide you strength to get through this winter, which will end. I promise.

Keep your spirits up. By the second quarter we should have completed meaningful numbers of Covid-19 vaccinations and the weather will be better. We can start to be outside, where the inherent infectious risk levels will really drop.

Let us all wish for the Spring, and for an effective vaccine rollout.


Thanks to Prof. David Ortmeyer

Prof. David Ortmeyer teaches statistics in the Dept. of Economics, Bentley University. He was of immense help in reeducating me about the wonderful binomial distribution. He bears no responsibility for any errors or imprecisions in this analysis.


  • Paul says:

    Phenomenal brief. This is great. Have you and David considered posting this on Medium or sending over to the Globe at least?

  • David Abrams says:

    Hi Dr Kanner
    Thank you for sharing this important information! Can you share the formula you used (using the binomial distribution) to calculate the small group risk? I am trying to calculate risk using current COVID numbers.

    • DrKanner says:

      David, I used the binomial function BINOM.DIST in Excel to calculate the probability that exactly no one would be infected. The number of successes is set to 0, the number of trials is the number of people in the group, and the probability of success is the probability of an infection (using the case rate I calculated from the statewide data). One sets the cumulative argument to “true”, which returns the probability of at most that many successes; with the number of successes at 0, that is the same as the exact probability that no one is infected. If that probability is then subtracted from 1, the result is the probability that one or more persons is infected (“at least 1”). I thought that was the most useful formulation and easiest to do. The likelihood of 2 infections in a group of 20 is very small, and wouldn’t really change your risk.

      Here is a link for the Excel article explaining the binom.dist function:
