Coronavirus phylomes
Background information
Coronaviruses cause respiratory and intestinal infections in animals and humans. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; taxID: 2697049) is the virus responsible for the COVID-19 pandemic. It is classified within the Sarbecovirus subgenus of the Coronaviridae family and is very closely related to SARS-CoV, the virus that caused an outbreak in 2002-2003. Both viruses likely originated in bats, though recombination may have played an important role in the emergence of new viruses, SARS-CoV-2 in particular.
Phylome reconstruction
In an effort to provide insights into the evolution of SARS-CoV-2 proteins, we collected 60 genomes belonging to the Cornidovirineae suborder. This suborder contains not only the SARS viruses but also the virus responsible for Middle East respiratory syndrome (MERS-CoV). We also selected 11 genomes from the Ronidovirineae and Mesnidovirineae suborders to serve as outgroups. All of these viruses belong to the Nidovirales order.
A phylome is the complete collection of phylogenetic trees, built automatically, for each gene encoded in a genome. It offers a detailed view of evolution at the gene level and allows us to assess how each gene has evolved independently of how the species as a whole has evolved. The coronavirus phylomes were built using the most recent pipeline implemented in PhylomeDB.
This pipeline differs from the previously published pipeline (Huerta-Cepas et al. 2011) in that it now uses IQ-TREE (Nguyen et al. 2015) to build phylogenetic trees. Model selection is done by the IQ-TREE model selector. Normally a preset group of models is used to limit the searches done by IQ-TREE; on this occasion, all models implemented in IQ-TREE were searched to find the best-fitting model.
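The difference between a restricted and an exhaustive model search can be sketched as follows. This is a minimal illustration, not the PhylomeDB pipeline itself: the function name `build_iqtree_command`, the alignment file name and the example model list are hypothetical, and we assume IQ-TREE is invoked from the command line with its `-s` (alignment), `-m MFP` (model selection followed by tree inference) and `-mset` (restrict the candidate models) options. The options actually used by the pipeline are not stated here.

```python
from typing import List, Optional, Sequence


def build_iqtree_command(
    alignment: str,
    model_set: Optional[Sequence[str]] = None,
    executable: str = "iqtree",
) -> List[str]:
    """Return the argument list for one IQ-TREE run on a single gene alignment.

    Hypothetical helper for illustration only; it does not run IQ-TREE.
    """
    # -s gives the input alignment; -m MFP asks IQ-TREE to select the
    # best-fitting model and then infer the tree with it.
    command = [executable, "-s", alignment, "-m", "MFP"]
    if model_set:
        # -mset limits model selection to a preset group of models.
        # Leaving it out lets IQ-TREE consider its default candidate list
        # (assumed here to correspond to the "all models" search).
        command += ["-mset", ",".join(model_set)]
    return command


# Restricted search over an example preset group (model names are illustrative).
restricted = build_iqtree_command("S.aln", model_set=["LG", "WAG", "JTT"])

# Unrestricted search, as described for the coronavirus phylomes.
exhaustive = build_iqtree_command("S.aln")

print(" ".join(restricted))
print(" ".join(exhaustive))
```

Either list could be passed to `subprocess.run` once per gene alignment; keeping the command construction separate makes it easy to inspect before launching many runs.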
Three phylomes have been built so far for this collection, one for each of the three main viruses that cause human illness and recently shifted host from other animal species: SARS-CoV-2, SARS-CoV and MERS-CoV.
Name | Seed species | Description | Updated |
---|---|---|---|
SARS-CoV-2 phylome | Severe acute respiratory syndrome coronavirus 2 | Phylome reconstructed for the virus causing COVID-19 | 2020-04-21 |
SARS-CoV phylome | Severe acute respiratory syndrome-related coronavirus | Phylome reconstructed for the virus causing SARS | 2020-04-21 |
MERS-CoV phylome | Middle East respiratory syndrome-related coronavirus | Phylome reconstructed for the virus causing MERS | 2020-04-21 |
Gene | Tree | Protein | Description |
---|---|---|---|
Gene 1ab | | YP_009724389.1 | Spans two-thirds of the coronavirus genome; encodes a polyprotein that forms the non-structural machinery of the virus* |
Gene 1a | | YP_009725295.1 | Spans two-thirds of the coronavirus genome; encodes a polyprotein that forms the non-structural machinery of the virus* |
Gene S | | YP_009724390.1 | Encodes the structural spike glycoprotein |
Gene N | | YP_009724397.2 | Encodes the nucleocapsid protein |
Gene E | | YP_009724392.1 | Encodes the envelope protein |
Gene M | | YP_009724393.1 | Encodes the membrane glycoprotein |
ORF3 | | YP_009724391.1 | Additional ORF |
ORF6 | | YP_009724394.1 | Additional ORF |
ORF7a | | YP_009724395.1 | Additional ORF |
ORF7b | | YP_009725296.1 | Additional ORF |
ORF8 | No tree** | YP_009724396.1 | Additional ORF |
ORF10 | No tree** | YP_009725255.1 | Additional ORF |
* Gene 1 translates into two overlapping ORFs: 1a and 1ab. ORF 1ab results from a ribosomal frameshifting event at the ORF 1a / 1b junction. Annotations therefore differ among coronaviruses, as most genomes annotate only gene 1ab. Note that this directly affects the phylogenetic trees for genes 1ab and 1a.
** No trees are available as these genes did not have homologs in the other coronaviruses, according to our thresholds.
Citation
If you use this information, please cite the following PhylomeDB reference in your work:
Huerta-Cepas J, Capella-Gutiérrez S, Pryszcz LP, Marcet-Houben M, Gabaldón T. PhylomeDB v4: zooming into the plurality of evolutionary histories of a genome. Nucleic Acids Res. 2014;42(Database issue):D897–D902. doi:10.1093/nar/gkt1177