Over time, new variants of a virus are expected to occur, as all viruses constantly mutate their genetic code. These variants can emerge and disappear spontaneously, though selective pressure sometimes allows a variant to persist through adaptive evolution. Multiple variants of SARS-CoV-2, the virus that causes COVID-19, have been documented globally, but of emerging concern is a variant first identified in the United Kingdom with an unusually large number of mutations.
Lineage B.1.1.7 (also known as 20B/501Y.V1 or VOC 202012/01, the latter indicating the first variant of concern in December 2020) carries 23 mutations compared to wild-type SARS-CoV-2. These comprise 14 non-synonymous (amino acid altering) mutations, six synonymous (non-amino acid altering) mutations, and three deletions (Table 1).1,2 Seventeen of these mutations (the non-synonymous mutations and the deletions) lead to changes in protein structures, and eight of them occur in the spike protein (Figure 1). This is highly unusual, as most SARS-CoV-2 variants carry only a few mutations, which accumulate at a relatively consistent rate over time (~1-2 per month).
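As a back-of-the-envelope check on the figures above, the short Python sketch below adds up the stated mutation counts and estimates how long it would take to accumulate them at the typical rate of ~1-2 mutations per month. It assumes a constant rate and treats every mutation as equal, which is a simplification made here for illustration and not a claim from the cited studies; the variable names are hypothetical.

```python
# Mutation counts stated in the text for lineage B.1.1.7.
non_synonymous = 14
synonymous = 6
deletions = 3

total = non_synonymous + synonymous + deletions
protein_altering = non_synonymous + deletions

# Typical accumulation rate quoted in the text: ~1-2 mutations per month.
# Assumption: the rate is constant and all mutations count equally.
rate_low, rate_high = 1, 2
months_fastest = total / rate_high
months_slowest = total / rate_low

print(f"total mutations: {total}")
print(f"protein-altering mutations: {protein_altering}")
print(f"months needed at the typical rate: {months_fastest} to {months_slowest}")
```

Under these assumptions, the 23 mutations correspond to roughly one to two years of typical change, which illustrates why the lineage stands out.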
Table 1. Amino Acid Substitutions and Deletions in B.1.1.7 Lineage
Protein | Amino Acid |
ORF1ab | T1001I |
ORF1ab | A1708D |
ORF1ab | I2230T |
ORF1ab | SGF 3675-3677 deletion |
ORF1ab | C913T |
ORF1ab | C5986T |
ORF1ab | C14676T |
ORF1ab | C15279T |
ORF1ab | C16176T |
Spike | H69-V70 deletion |
Spike | Y144 deletion |
Spike | N501Y |
Spike | A570D |
Spike | P681H |
Spike | T716I |
Spike | S982A |
Spike | D1118H |
ORF8 | Q27stop |
ORF8 | R52I |
ORF8 | Y73C |
N | D3L |
N | S235F |
M | T26801C |
Figure 1. Compilation of SARS-CoV-2 spike mutations occurring in the new UK, South African, and Brazilian variants as indicated in the inset. NTD: N-terminal domain. RBD: Receptor binding domain.
One of the most important changes in B.1.1.7 is a mutation in the receptor binding domain (RBD) of the spike protein at position 501, where asparagine has been replaced with tyrosine (N501Y). According to prior work on variants with N501Y, this substitution is thought to increase the affinity of the spike protein for the angiotensin-converting enzyme 2 (ACE2) receptor on human cells, the virus's entry point for infection.3 Indeed, the B.1.1.7 variant is estimated to be 74% more transmissible than the wild type and is projected to become the dominant source of infection in the US by March 2021.
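The substitution labels used throughout this article follow a simple pattern: the one-letter code of the original amino acid, the position, and the one-letter code of the replacement. The Python sketch below decodes labels of that form. The function name `parse_substitution` is hypothetical and introduced only for illustration; the one-letter codes are the standard amino acid abbreviations.

```python
import re

# Standard one-letter amino acid codes.
AMINO_ACIDS = {
    "A": "alanine", "C": "cysteine", "D": "aspartic acid",
    "E": "glutamic acid", "F": "phenylalanine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "K": "lysine", "L": "leucine",
    "M": "methionine", "N": "asparagine", "P": "proline",
    "Q": "glutamine", "R": "arginine", "S": "serine", "T": "threonine",
    "V": "valine", "W": "tryptophan", "Y": "tyrosine",
}

# Original residue, position, replacement residue (for example N501Y).
_CODES = "".join(AMINO_ACIDS)
SUBSTITUTION = re.compile(rf"^([{_CODES}])(\d+)([{_CODES}])$")


def parse_substitution(label):
    """Hypothetical helper: split a label such as 'N501Y' into its parts."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    original, position, replacement = match.groups()
    return AMINO_ACIDS[original], int(position), AMINO_ACIDS[replacement]


print(parse_substitution("N501Y"))
```

Deletions and stop codons (such as "H69-V70 deletion" or "Q27stop") do not fit this pattern and are rejected. Note that nucleotide labels such as C913T have the same shape as amino acid labels, so a pattern alone cannot tell the two apart; the context of the table is still needed.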
Other notable mutations in the B.1.1.7 variant include a deletion of two amino acids, H69 and V70, in the N-terminal domain of the spike protein, which likely leads to a conformational change in the protein. This particular mutation has also been found in coronaviruses that eluded the immune response in some immunocompromised patients, and it appeared in infected mink in Denmark in August 2020.4 A study from the University of Cambridge has suggested that this mutation increases infectivity two-fold in in vitro experiments and may make neutralizing antibodies less effective.5
A P681H mutation occurs at one of the four residues that comprise the insertion that creates a furin cleavage site between S1 and S2 in the spike protein. This site, which facilitates viral membrane fusion to host cells, has been shown to promote entry into respiratory epithelial cells and transmission in animal models.6-8
The Q27stop mutation introduces a stop codon into open reading frame 8 (ORF8) that renders it inactive. The functional consequence of this change is not completely known, but it may allow further downstream mutations to accrue. This gene is hypervariable, with a tendency to recombine and undergo deletions that facilitate viral adaptation to the human host. The ORF8 gene encodes an immunoglobulin-like protein that was recently found to inhibit the presentation of viral antigens by class I major histocompatibility complex, suppress the type I interferon antiviral response, and interact with host factors involved in pulmonary inflammation and fibrogenesis.9 A full deletion of ORF8 in a variant identified in Singapore has been associated with milder COVID-19 symptoms and better disease outcome.10
In South Africa, another variant of SARS-CoV-2 (also known as 20C/501Y.V2 or the B.1.351 lineage) emerged independently of the B.1.1.7 lineage. This variant has eight defining mutations in the spike protein, including K417N, E484K, and N501Y substitutions in the RBD that increase affinity for the ACE2 receptor (Table 2, Figure 1).11,12 Although it contains the N501Y mutation found in the B.1.1.7 variant, it does not contain the deletion at H69-V70. The E484K mutation has been associated with escape from neutralizing antibodies.13
Table 2. Amino Acid Substitutions and Deletions in B.1.351 Lineage
Protein | Amino Acid |
Spike | L18F |
Spike | D80A |
Spike | D215G |
Spike | L242_244L (disputed deletion) |
Spike | R246I |
Spike | K417N |
Spike | E484K |
Spike | N501Y |
Spike | A701V |
A branch of the SARS-CoV-2 B.1.1.28 lineage termed P.1 (or 20J/501Y.V3) is circulating in the Brazilian Amazon region. It includes several mutations of known biological importance in the RBD of the spike protein, such as E484K, K417T, and N501Y, but is of independent origin from B.1.1.7 and B.1.351.14 This variant has ten defining mutations in the spike protein (Table 3, Figure 1).15 The B.1.1.28 variant was associated with two cases of reinfection in patients originally infected by the Brazilian B.1.1.33 lineage, and the P.1 variant was identified in Japan in travelers returning from northern Brazil.14,16
Table 3. Amino Acid Substitutions, Insertions, and Deletions in P.1 Lineage
Protein | Amino Acid |
ORF1ab | SynT733C |
ORF1ab | SynC2749T |
ORF1ab | S1188L |
ORF1ab | K1795Q |
ORF1ab | Del11288-11296 (SGF 3675-3677 deletion) |
ORF1ab | SynC12778T |
ORF1ab | SynC13860T |
ORF1ab | E5665D |
Spike | L18F |
Spike | T20N |
Spike | P26S |
Spike | D138Y |
Spike | R190S |
Spike | K417T |
Spike | E484K |
Spike | N501Y |
Spike | H655Y |
Spike | T1027I |
ORF8 | E92K |
ORF8 | Ins28269-28273 |
N | P80R |
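The spike entries in Tables 1-3 can be compared directly to see which changes the three lineages have in common. The Python sketch below does this with plain set operations. The labels are copied from the tables; the dictionary layout and the helper `substituted_positions` are hypothetical names chosen for illustration, and the comparison says nothing about biological effect.

```python
import re

# Spike entries copied from Tables 1-3.
SPIKE = {
    "B.1.1.7": {"H69-V70 deletion", "Y144 deletion", "N501Y", "A570D",
                "P681H", "T716I", "S982A", "D1118H"},
    "B.1.351": {"L18F", "D80A", "D215G", "L242_244L", "R246I",
                "K417N", "E484K", "N501Y", "A701V"},
    "P.1": {"L18F", "T20N", "P26S", "D138Y", "R190S",
            "K417T", "E484K", "N501Y", "H655Y", "T1027I"},
}


def substituted_positions(labels):
    """Hypothetical helper: positions of simple substitutions like K417N."""
    positions = set()
    for label in labels:
        match = re.fullmatch(r"[A-Z](\d+)[A-Z]", label)
        if match:
            positions.add(int(match.group(1)))
    return positions


# Identical labels present in every lineage.
shared_by_all = set.intersection(*SPIKE.values())

# Identical labels shared by the South African and Brazilian lineages.
shared_351_p1 = SPIKE["B.1.351"] & SPIKE["P.1"]

# Positions substituted in both lineages, even if the replacement differs
# (K417N in B.1.351 versus K417T in P.1).
shared_positions = (substituted_positions(SPIKE["B.1.351"])
                    & substituted_positions(SPIKE["P.1"]))

print(sorted(shared_by_all))      # ['N501Y']
print(sorted(shared_351_p1))      # ['E484K', 'L18F', 'N501Y']
print(sorted(shared_positions))   # [18, 417, 484, 501]
```

Comparing positions as well as exact labels matters here because position 417 is altered in both B.1.351 and P.1, but to different amino acids.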
The simultaneous emergence of different SARS-CoV-2 lineages in different countries around the world, each carrying mutations in the spike protein receptor binding domain, reveals convergent selective pressure on SARS-CoV-2 that favors transmissibility and reinfection of individuals. This is a major concern because these mutations may enable the virus to escape neutralizing antibodies. Cayman is monitoring the rapidly emerging information on these variants and offers structure-based drug design (SBDD) services for macromolecular X-ray crystallography and computer-aided drug design (CADD) services for in silico screening of drug candidates that target the SARS-CoV-2 spike protein.