just dans

Making programming more accessible

22 Feb 2021

Pandemic Response as Incident Response

During an incident, when error and latency graphs start trending downwards, you should start writing down everything in your mind right then (it will fall out of your head shortly) and schedule the postmortem. Hold the postmortem even if you don’t reach all-clear; you’ll find the dangling threads in the course of reviewing the incident.

With my daughter about to join her brothers at school for the first time in 353 days, now feels like that lull. Cases, hospitalizations, and deaths are still higher than we’d like but trending downward. We have to keep watching the graphs. They may spike again. Now is the time to write down what’s fresh on our minds.

What struck me as cases rose during the fall and through December was the lag in data. By the time cases and hospitalizations clearly started rising, the virus had been spreading at too high a rate for weeks. That left us with blunt choices: close schools or leave them open and pray. Was the spread happening among families with school-age children or in different parts of the population?

If this were a postmortem for an internet service we’d build low-latency monitoring for transmission. At-home rapid antigen tests seem to be the tool for the job, though I’m open to other suggestions! You can find out whether you’re infectious within 15-30 minutes. The more people who stay home while infectious, the slower the spread of the virus. Staying home when you’re infectious is key: going about your life for 24 to 72 hours before a lab tells you your PCR test is positive leaves plenty of opportunity for spread. Since these tests are so cheap and easy, you can test an entire community twice a week and have a handle on whether the virus is spreading within that community or elsewhere, meaning you can make opening/closing decisions with confidence instead of by feel.

Picking targets for new monitoring always involves some trial and error. Ideally we could target the reproduction number directly, but that’s hard to see in realtime. Saying we want zero cases in our schools is too sensitive. Reducing cases, hospitalizations, and deaths by an order of magnitude would buy us time to respond to outbreaks. We also have many months’ worth of data: we can aim to have ten times fewer cases than we had during the lulls last year. And we can aim to have ten times fewer cases during next winter’s peak.

In the case of our kids’ school system, there has been one confirmed case of in-school transmission in the past six weeks. Driving that to 0.1 in-school transmissions per six weeks is indistinguishable from zero. We’d need to look at some measure of student-days (add up the number of students who attend school each day) and increase that by up to an order of magnitude while holding transmission constant.
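
To make the student-days idea concrete, here is a rough sketch of the arithmetic. The attendance figures are made up for illustration (I don’t have our district’s real numbers), and the function name is just something I picked. The only number taken from above is the one confirmed in-school transmission over six weeks.

```python
def transmissions_per_student_day(transmissions, daily_attendance):
    """Rate of in-school transmission, normalized by student-days.

    daily_attendance is a list with one entry per school day: the number
    of students who attended in person that day.
    """
    student_days = sum(daily_attendance)
    return transmissions / student_days


# Six weeks of school is roughly 30 school days.
# ASSUMPTION: 1,000 students in person each day (hypothetical figure).
baseline_attendance = [1000] * 30
baseline_rate = transmissions_per_student_day(1, baseline_attendance)

# Same single transmission, but with ten times as many student-days
# (again a hypothetical figure, chosen to show the order-of-magnitude goal).
expanded_attendance = [10000] * 30
expanded_rate = transmissions_per_student_day(1, expanded_attendance)

# How many times lower the per-student-day rate is after the expansion.
improvement = baseline_rate / expanded_rate
print(f"baseline: {baseline_rate:.2e} transmissions per student-day")
print(f"expanded: {expanded_rate:.2e} transmissions per student-day")
print(f"improvement: {improvement:.1f}x")
```

Normalizing by student-days gives a number that can keep improving even when the raw count is stuck at one: holding transmissions constant while attendance grows tenfold is the same order-of-magnitude reduction, just measured per student-day instead of per six weeks.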