dN/dS calculation of Thermus virus NrdJm evolution
Dataset posted on 29.01.2019 by Daniel Lundin, Christoph Loderer, Karin Holmfeldt
A likelihood ratio test for significant overrepresentation of non-synonymous relative to synonymous (silent) substitutions (dN/dS) was performed with codeml from PAML (Yang 1997, https://doi.org/10.1093/bioinformatics/13.5.555 ; Yang 2007, https://doi.org/10.1093/molbev/msm088). The program was run twice, once with a fixed and once with a free dN/dS, with the branch leading to the two Thermus virus sequences (marked with “#1” in reftree.newick) designated as the “foreground” branch in the free dN/dS run. The analysis was performed on the Thermus virus/Firmicutes subtree in https://doi.org/10.17045/sthlmuni.7117430.v2, using the same alignment as for the full tree, reverse translated into nucleotides. We could not find correct gene sequences for seven taxa (RefSeq accession numbers: WP_102410887, WP_033167051, WP_054955013, WP_065068364, WP_088370373, WP_087372021, WP_093315575), so they were left out of the analysis.
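For readers who want to redo the comparison of the two runs, the following is a minimal sketch of a likelihood ratio test in Python. It assumes that the fixed and free dN/dS models are nested and differ by a single free parameter (the foreground dN/dS), so that the test statistic is compared against a chi-square distribution with one degree of freedom. The function name and the log-likelihood values are hypothetical and are not taken from this dataset.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Return (statistic, p_value) for two nested models differing by 1 parameter.

    lnl_null: log-likelihood of the simpler model (fixed dN/dS)
    lnl_alt:  log-likelihood of the richer model (free dN/dS on the foreground branch)
    """
    # Twice the log-likelihood difference; clamp tiny negative values
    # that can arise from incomplete optimisation.
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of the chi-square distribution with 1 degree of
    # freedom, expressed with the complementary error function.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-998.0)
print(f"2*delta(lnL) = {statistic:.3f}, p = {p_value:.4f}")
```

If the two models differ by more than one parameter, the one-degree-of-freedom shortcut above no longer applies and a general chi-square survival function is needed instead.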
The codeml scripts are in the files ending with .codeml; their output is in the corresponding .codeml.out files.
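The log-likelihood of each run can be read from the output files. The sketch below assumes that the output contains a line of the form `lnL(ntime: N  np: M):  value  +0.000000`, which is the layout codeml commonly writes; check this against the actual .codeml.out files before relying on it. The function name and the sample line are hypothetical.

```python
import re

# Assumed layout of the codeml log-likelihood line (verify against the real output).
LNL_PATTERN = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s*(-?\d+\.\d+)")


def parse_lnl(text):
    """Return (log_likelihood, number_of_parameters) from codeml output text."""
    match = LNL_PATTERN.search(text)
    if match is None:
        raise ValueError("no lnL line found in the given text")
    return float(match.group(3)), int(match.group(2))


# Hypothetical sample line, for illustration only.
sample = "lnL(ntime: 10  np: 13):  -1234.567890      +0.000000\n"
lnl, n_params = parse_lnl(sample)
print(lnl, n_params)
```

To apply this to a file, read its contents with `open(path).read()` and pass the resulting string to `parse_lnl`. The difference in `np` between the two runs gives the degrees of freedom for the likelihood ratio test.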