July 18, 2020
Reviewed by Jennifer Brown, M.D.
Seven members of the Coronaviridae family are known to infect humans. Of these, SARS-CoV-2, SARS-CoV, and MERS-CoV are associated with more severe disease than the other coronaviruses, which typically cause mild “common cold” symptoms. How coronaviruses develop increased pathogenicity and the capacity for zoonotic transmission is not well understood.
To address these knowledge gaps, Gussow et al. used an integrated comparative genomics and machine learning approach to analyze coronavirus genomes, reporting their results in the Proceedings of the National Academy of Sciences. Nucleotide sequences of 3,001 coronavirus genomes (944 of them from viruses that infect humans) and the encoded protein sequences were obtained from the National Center for Biotechnology Information. The 944 human coronavirus genomes were aligned to identify deletions and insertions; dimensionality reduction and multiple support vector machines were then used to search for regions that differed between coronaviruses with low and high case fatality rates (CFR). SARS-CoV-2, SARS-CoV, and MERS-CoV were considered high-CFR viruses.
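To make the idea of searching an alignment for group-specific insertions and deletions concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: it replaces dimensionality reduction and support vector machines with a much simpler per-column comparison of gap frequencies, and the function names, the threshold, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def gap_frequency(seqs: List[str], col: int) -> float:
    """Fraction of sequences that have a gap ('-') at one alignment column."""
    return sum(1 for s in seqs if s[col] == "-") / len(seqs)


def discriminating_indel_columns(
    high: List[str], low: List[str], threshold: float = 1.0
) -> List[int]:
    """Return alignment columns whose gap frequency differs between the
    high-CFR and low-CFR groups by at least `threshold` (an assumed cutoff)."""
    length = len(high[0])
    if any(len(s) != length for s in high + low):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(length):
        diff = gap_frequency(low, col) - gap_frequency(high, col)
        if abs(diff) >= threshold:
            hits.append(col)
    return hits


def merge_columns_into_regions(cols: List[int]) -> List[Tuple[int, int]]:
    """Merge consecutive column indices into inclusive (start, end) regions."""
    regions: List[Tuple[int, int]] = []
    for c in sorted(cols):
        if regions and c == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], c)
        else:
            regions.append((c, c))
    return regions


# Toy, made-up aligned sequences (not real coronavirus data).
high_cfr = ["ATGCCGTTA", "ATGCCGTTA"]
low_cfr = ["ATG---TTA", "ATG---TTA"]

columns = discriminating_indel_columns(high_cfr, low_cfr)
regions = merge_columns_into_regions(columns)
print(regions)
```

In this toy case the three columns where only the low-CFR sequences carry gaps are reported as a single candidate region. A real analysis would need many more genomes, a proper multiple sequence alignment, and a classifier that is validated on held-out data, as the study describes.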
Overall, 11 regions of the nucleotide alignments appeared to be predictive of high-CFR coronaviruses, and two proteins (the nucleocapsid phosphoprotein and the spike glycoprotein) were found to contain these predictive regions. The changes in the nucleocapsid protein of the high-CFR viruses mapped to two nuclear localization signals and a nuclear export signal. Structural analysis of the spike protein (which binds to the host ACE2 receptor) showed an insertion that was present in the high-CFR viruses but absent from the low-CFR viruses. When the genomes of high-CFR human coronaviruses were compared with genomes from the nonhuman hosts prior to zoonotic transmission, insertions were found in the spike protein, in the receptor binding subdomain that binds ACE2.
Though these findings require further investigation, the authors propose that the changes in the nuclear localization signals may increase coronavirus pathogenicity, and that the insertions in the spike protein may increase coronavirus virulence and zoonotic transmission by influencing the viral-host membrane fusion process.