How Genomic Surveillance is Changing the Game in Epidemiology

The SARS-CoV-2 pandemic has been the most significant collective global challenge since World War II. The last major pandemic that was truly global, the Spanish Flu outbreak at the end of World War I, came at the dawn of a century that was still using mostly 19th-century medical techniques.

A century later, the COVID-19 pandemic arrived at a time when decades of medical advances have given us the capability to mitigate the worst scenarios in terms of fatality and long-term debilitation, in spite of the considerable toll it has already taken in the last eighteen months.

In addition to advances in universal health coverage in most industrialized countries (with the notable exception of the USA), medical treatment, intensive care, respiratory management, cardiology and pulmonology, there is an equally crucial tool in the global arsenal for combatting COVID-19: genomic surveillance.

Genomic surveillance is an applied technique derived from a process that has gone from science fiction to routine in the last 50 years, namely DNA sequencing. Sequencing uses various biochemical techniques to read, from a biological sample, the order of nucleotides in an organism’s DNA or RNA. This data can be structured just like computer-readable data: where binary code uses 1s and 0s, biological sequence data uses four nucleotide bases (G, A, T and C in DNA, with U replacing T in RNA).
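To make the comparison with binary data concrete, here is a minimal sketch of how a DNA sequence could be packed into bits, two bits per base. The mapping and the function names (`encode`, `decode`) are illustrative assumptions, not a standard; real pipelines usually store sequences as plain text in formats such as FASTA.

```python
# Hypothetical 2-bit encoding of DNA bases (an illustration, not a file format).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode(sequence: str) -> int:
    """Pack a DNA string into one integer, two bits per base."""
    value = 1  # leading sentinel bit so that leading 'A's (00) are not lost
    for base in sequence.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def decode(value: int) -> str:
    """Unpack an integer produced by encode() back into a DNA string."""
    bases = []
    while value > 1:  # stop when only the sentinel bit remains
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


if __name__ == "__main__":
    packed = encode("GATTACA")
    print(bin(packed), decode(packed))
```

Four bases fit in two bits each, which is why the analogy holds: a genome is, in effect, a long string over a four-letter alphabet.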

The first full SARS-CoV-2 genome was published in January 2020, right at the start of the pandemic, as the virus was emerging in Wuhan but prior to its observed spread to the rest of the world. The sequencing of the viral genome, which can now be conducted in a matter of days for a few hundred dollars, created the environment where biotech companies such as Moderna could develop a corresponding vaccine in a remarkably short amount of time. This marshalling of a wartime-level effort for vaccine development and deployment is one of the most significant achievements in medicine and public health in recent memory. Unfortunately, evolution has made the path forward less straightforward than the simple recipe of “sequence the virus” and “make a vaccine that ends the pandemic”.

As many reading the news in 2021 will have understood all too well, new strains of SARS-CoV-2, “descendants” of the original virus that have experienced mutations to their genetic code, are now causing significant problems. Genomic surveillance involves the continual sequencing of large numbers of samples in order to track genetic variants in the virus, allowing epidemiologists and other health professionals to understand how the virus is changing and where new risks may emerge. For instance, the SARS-CoV-2 spike protein, the key on the original coronavirus that unlocks the pathway to wreak havoc in respiratory cells in the body, can mutate, leading to new viral strains such as the highly infectious delta variant currently driving the pandemic. Information about these genetic changes can only be obtained through sequencing, and it is critical since many vaccines are designed to specifically target these proteins and block viral infection.
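The core idea of tracking variants can be sketched in a few lines: compare each sampled sequence against a reference and count how often each change appears. The sketch below assumes the samples are already aligned to the reference and have the same length. The sequences are toy strings, not real SARS-CoV-2 data, and the function names are hypothetical. Real surveillance pipelines also handle alignment, insertions, deletions and quality filtering.

```python
from collections import Counter


def find_substitutions(reference: str, sample: str) -> list[str]:
    """List single-base changes as labels like 'T5C' (ref base, 1-based position, new base)."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt and alt != "N"  # 'N' marks an unreadable base, not a mutation
    ]


def mutation_frequencies(reference: str, samples: list[str]) -> dict[str, float]:
    """Return the fraction of samples that carry each substitution."""
    if not samples:
        return {}
    counts = Counter()
    for sample in samples:
        counts.update(find_substitutions(reference, sample))
    return {mutation: n / len(samples) for mutation, n in counts.items()}


if __name__ == "__main__":
    reference = "ATGTTTGTTTTT"  # toy reference, for illustration only
    samples = [
        "ATGTTTGTTTTT",  # identical to the reference
        "ATGTCTGTTTTT",  # one substitution
        "ATGTCTGTTATT",  # two substitutions
        "ATGTNTGTTTTT",  # one unreadable base
    ]
    for mutation, freq in mutation_frequencies(reference, samples).items():
        print(mutation, f"{freq:.0%}")
```

A substitution whose frequency rises across successive batches of samples is the kind of signal that would prompt a closer look at a possible new variant.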

In an ideal model of genomic surveillance, game-changing improvements could be made to public and individual health care by sequencing every human, virus and bacterium, providing a Rosetta Stone for combatting and eliminating many diseases. But barriers to genomic surveillance exist:

Data scale: Only a few countries have sequenced samples from more than 5% of their COVID-19 cases. Raising coverage by orders of magnitude is a non-trivial matter of increased cost, logistical overhead and patient consent.

Analytic tools: While companies like Illumina and others have made the physical sequencing of DNA and RNA a commoditized service, the analysis industry is still in its infancy. Highly fragmented analytic tools make global cross-functional collaboration on a pandemic difficult.

Data silos: Critically important clinical and epidemiological data often sits inside “silos”, meaning inaccessible, unlinked databases such as those within a private company’s electronic medical record system.

Resource parity: As is true in many industries, some nations and regions are far wealthier and far better resourced than others to undertake genomic surveillance. Viruses do not respect national borders, and strains can mutate in areas with low surveillance just as easily as in areas with high surveillance. To truly achieve genomic surveillance of global pandemics, providing access to the required resources should be a priority concern.

In summary, our ability to rapidly and globally track the “delta”, “lambda” or other variants of the original SARS-CoV-2 virus is a remarkable achievement. With concerted efforts from government, the private sector, universities and individuals, we may be able to use this opportunity to supercharge our efforts in genomic surveillance in the next decade to ensure the next pandemic can be managed and mitigated with unprecedented precision.