Database searches: For each protein, a Smith-Waterman search was performed against the proteome database to retrieve a set of proteins with significant similarity (e-value < 1e-05). Only sequences that aligned over a continuous region longer than 50% of the query sequence were selected, and at most 150 sequences were retained per query. An artificial database size of 1,000,000 sequences was set so that results are comparable with other phylomes in the database.
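The three selection criteria above (e-value, query coverage, and the cap of 150 sequences) can be illustrated with a minimal Python sketch. The `Hit` record, its field names, and the function `select_homologs` are hypothetical and are not part of the pipeline. The text does not state how the 150 sequences are chosen when more pass the filters; ranking by ascending e-value is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one search hit (field names are assumptions)."""
    target_id: str
    evalue: float
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive


def select_homologs(hits, query_length, max_evalue=1e-5,
                    min_coverage=0.5, max_hits=150):
    """Keep hits passing the e-value and query-coverage thresholds."""
    passing = []
    for hit in hits:
        aligned = hit.query_end - hit.query_start + 1
        coverage = aligned / query_length
        # Both comparisons are strict, matching "< 1e-05" and "longer than 50%".
        if hit.evalue < max_evalue and coverage > min_coverage:
            passing.append(hit)
    # Assumption: the best-scoring hits are kept when more than max_hits pass.
    passing.sort(key=lambda h: h.evalue)
    return passing[:max_hits]
```

For example, with a query of length 100, a hit covering positions 1 to 40 would be rejected on coverage regardless of its e-value.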
Multiple sequence alignment: Sets of homologous protein sequences were aligned using three different programs: MUSCLE v3.8.31, MAFFT v6.814b and KALIGN. Alignments were performed in forward and reverse directions, and the six resulting alignments were combined using M-COFFEE (T-Coffee v8.80). The combined alignment was then trimmed with trimAl v1.3, applying a consistency cutoff of 0.1667 and a gap score cutoff of 0.1.
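The effect of a gap score cutoff can be sketched in Python as follows. This is a simplified illustration and not trimAl's implementation: it assumes that the gap score of a column is the fraction of sequences carrying a residue (not a gap) in that column, and that columns scoring below the cutoff are removed. The function name `trim_gappy_columns` is hypothetical, and the consistency-based filtering is not modelled.

```python
def trim_gappy_columns(alignment, gap_score_cutoff=0.1):
    """Drop alignment columns whose fraction of non-gap residues is below the cutoff.

    `alignment` is a list of equal-length strings using '-' for gaps.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    kept_columns = []
    for col in range(length):
        non_gaps = sum(1 for seq in alignment if seq[col] != "-")
        # Assumed definition of the gap score: share of sequences without a gap.
        if non_gaps / n_seqs >= gap_score_cutoff:
            kept_columns.append(col)

    return ["".join(seq[col] for col in kept_columns) for seq in alignment]
```

Under this assumed definition, a cutoff of 0.1 is permissive: a column is removed only when fewer than 10% of the sequences have a residue there.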
Phylogenetic reconstructions: Phylogenetic trees were reconstructed using a Neighbour Joining approach as implemented in BioNJ. The likelihood of this topology was computed, allowing branch-length optimisation, under eight different models (JTT, WAG, MtREV, VT, LG, Blosum62, CpREV and DCMut), as implemented in PhyML 3.0. The evolutionary model best fitting the data was determined by comparing the likelihoods of the tested models according to the Akaike Information Criterion (AIC). A maximum likelihood tree was then derived using the selected model. In all cases a discrete gamma-distribution model with four rate categories plus invariant positions was used; the gamma parameter and the fraction of invariant positions were estimated from the data.
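The model selection step can be sketched in Python using the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the log-likelihood; the model with the lowest AIC is preferred. The function names and the log-likelihood values in the usage example are hypothetical and are not results from this phylome.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_model(fits):
    """Return the model name with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, n_params) tuple.
    """
    if not fits:
        raise ValueError("no model fits supplied")
    return min(fits, key=lambda name: aic(*fits[name]))


# Illustrative, made-up values for three of the eight models.
example_fits = {
    "JTT": (-1050.0, 25),
    "WAG": (-1042.5, 25),
    "LG": (-1038.2, 25),
}
best = select_model(example_fits)
```

When all candidate models have the same number of free parameters, as in this made-up example, ranking by AIC is equivalent to ranking by likelihood.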
Seed species: Lichtheimia corymbifera
Associated publication: Schwartze et al. 2014. Gene Expansion Shapes Genome Architecture in the Human Pathogen Lichtheimia corymbifera: An Evolutionary Genomics Analysis in the Ancient Terrestrial Mucorales (Mucoromycotina). PLOS Genetics 10(8).