LysM phylogeny including Datisca glomerata and Ceanothus thyrsiflorus
Dataset posted on 29.05.2018 by Daniel Lundin, Marco Guedes Salgado
Phylogeny of LysM paralogs, with sequences harvested from newly sequenced transcriptomes of Datisca glomerata and Ceanothus thyrsiflorus plus the following reference species: Arabidopsis thaliana, Cicer arietinum, Cucurbita pepo, Fragaria vesca, Glycine max, Lotus japonicus, Manihot esculenta, Medicago truncatula, Morus notabilis, Oryza sativa, Prunus persica, Ricinus communis, Solanum lycopersicum, Sorghum bicolor, Theobroma cacao, Vigna radiata and Ziziphus jujuba.
Sequences were collected with blastp searches against the RefSeq database, restricted to the species listed above, using presumed D. glomerata and C. thyrsiflorus orthologs as queries.
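The species restriction described above could be expressed roughly as follows. This is only a sketch: the description does not state how the RefSeq subset was made, so the use of a remote search with an Entrez organism filter, the database name refseq_protein, the tabular output format and the file names queries.fasta and hits.tsv are all assumptions chosen for illustration.

```python
import shlex

# Reference species as listed in the dataset description.
REFERENCE_SPECIES = [
    "Arabidopsis thaliana", "Cicer arietinum", "Cucurbita pepo",
    "Fragaria vesca", "Glycine max", "Lotus japonicus",
    "Manihot esculenta", "Medicago truncatula", "Morus notabilis",
    "Oryza sativa", "Prunus persica", "Ricinus communis",
    "Solanum lycopersicum", "Sorghum bicolor", "Theobroma cacao",
    "Vigna radiata", "Ziziphus jujuba",
]


def organism_filter(species):
    """Build an Entrez query that matches any of the given organisms."""
    return " OR ".join(f'"{name}"[Organism]' for name in species)


def blastp_command(query_fasta, out_tsv, species):
    """Return a blastp argument list (assumed settings, not the authors')."""
    return [
        "blastp",
        "-remote",                      # -entrez_query requires a remote search
        "-db", "refseq_protein",        # assumed database name
        "-query", query_fasta,          # hypothetical file of query orthologs
        "-entrez_query", organism_filter(species),
        "-outfmt", "6",                 # tabular output, assumed
        "-out", out_tsv,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(blastp_command("queries.fasta", "hits.tsv", REFERENCE_SPECIES)))
```

Printing the command rather than executing it keeps the sketch free of network access; the E-value cutoff and hit limits that a real search would need are deliberately left out because the description does not give them.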
Sequences were aligned with Clustal Omega (http://www.clustal.org/omega/; Sievers et al. 2014) and reliable alignment positions were selected with the BMGE algorithm (Criscuolo and Gribaldo 2010) using the BLOSUM62 substitution matrix. Sequences that were identical after BMGE were discarded.
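The removal of identical sequences after position selection can be sketched as below. The description does not say whether one representative of each identical group was retained; this sketch assumes the first one is kept and the later copies are dropped. The function name drop_identical and the record layout (name, aligned sequence) are hypothetical.

```python
def drop_identical(records):
    """Keep the first of each group of identical aligned sequences.

    records: list of (name, aligned_sequence) tuples, e.g. read from the
    trimmed alignment. Returns (kept, dropped), where dropped maps each
    removed name to the name of the retained sequence it duplicates.
    """
    first_seen = {}   # sequence string -> name of the first record with it
    kept = []
    dropped = []
    for name, seq in records:
        key = seq.upper()             # compare case-insensitively
        if key in first_seen:
            dropped.append((name, first_seen[key]))
        else:
            first_seen[key] = name
            kept.append((name, seq))
    return kept, dropped


if __name__ == "__main__":
    example = [("seq_a", "MK-LV"), ("seq_b", "MK-LV"), ("seq_c", "MKALV")]
    kept, dropped = drop_identical(example)
    print(kept)      # seq_a and seq_c remain
    print(dropped)   # seq_b is recorded as a duplicate of seq_a
```

Recording which retained sequence each dropped one duplicates makes it possible to map the removed identifiers back onto the tree tips later.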
The tree was estimated with RAxML v. 8.2.4 (https://sco.h-its.org/exelixis/web/software/raxml/index.html; Stamatakis 2014) using the PROTGAMMAAUTO model and automatic bootstopping.
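A RAxML 8 invocation consistent with this description might look like the sketch below. The exact command line is not given in the description, so the following are assumptions: the executable name raxmlHPC, the combined rapid-bootstrap and ML search mode (-f a), the autoMRE bootstopping criterion, the random seeds and the run name.

```python
import shlex


def raxml_command(alignment, run_name, seed=12345):
    """Return a RAxML 8 argument list (assumed settings, not the authors')."""
    return [
        "raxmlHPC",              # assumed binary name; builds differ by suffix
        "-f", "a",               # rapid bootstrapping plus best-tree search
        "-m", "PROTGAMMAAUTO",   # model named in the description
        "-p", str(seed),         # parsimony random seed (arbitrary here)
        "-x", str(seed),         # rapid-bootstrap random seed (arbitrary here)
        "-N", "autoMRE",         # automatic bootstopping; criterion assumed
        "-s", alignment,         # input alignment in PHYLIP format
        "-n", run_name,          # suffix used for the RAxML_* output files
    ]


if __name__ == "__main__":
    alignment = "lysm.refseq_harvest_plus_selected.co.BLOSUM62.bmge.rx.red.phylip"
    # Print the command for review rather than launching a long-running job.
    print(shlex.join(raxml_command(alignment, "lysm")))
```

With these options RAxML writes the best-scoring tree with support values to a file named RAxML_bipartitions followed by the run name; the files in this dataset appear to have been renamed afterwards.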
Files provided are:
1. The full alignment after BMGE selection of positions and removal of identical sequences: lysm.refseq_harvest_plus_selected.co.BLOSUM62.bmge.rx.red.phylip
2. The maximum likelihood tree labelled with bootstrap values in:
a. newick format: lysm.refseq_harvest_plus_selected.co.BLOSUM62.bmge.rx.red.PROTGAMMAAUTO.raxml.besttree.newick
b. Dendroscope (http://dendroscope.org/; Huson et al. 2007): lysm.refseq_harvest_plus_selected.co.BLOSUM62.bmge.rx.red.PROTGAMMAAUTO.raxml.bipartitions.nexml
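For a quick look at the newick file without a phylogenetics library, the tip labels and bootstrap values can be pulled out with the standard library, as in this minimal sketch. It assumes unquoted tip labels and bootstrap values stored as integer internal-node labels directly after a closing parenthesis; the function names are hypothetical. A dedicated tree parser is the safer choice for anything beyond inspection.

```python
import re


def tip_labels(newick):
    """Return tip labels: names that directly follow '(' or ','."""
    return re.findall(r"[(,]\s*([^\s():,;]+)", newick)


def support_values(newick):
    """Return integer node labels that directly follow ')'."""
    return [int(value) for value in re.findall(r"\)(\d+)", newick)]


if __name__ == "__main__":
    # Toy tree for illustration; replace with the contents of the newick file.
    toy = "((A:0.1,B:0.2)95:0.3,(C:0.1,D:0.2)100:0.3,E:0.4);"
    print(tip_labels(toy))
    print(support_values(toy))
```

Comparing the tip labels against the sequence names in the phylip alignment is a simple check that the tree and the alignment describe the same set of sequences.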