TreeREx

Overview

TreeREx is an algorithm for assigning rearrangements to the edges of a given phylogenetic tree and reconstructing ancestral gene orders at the inner nodes as described in:

M. Bernt, D. Merkle, and M. Middendorf
An Algorithm for Inferring Mitochondrial Genome Rearrangements in a Phylogenetic Tree Comparative Genomics International Workshop, RECOMB-CG 2008, 5267 Lecture Notes in Bioinformatics (LNBI), 143-157, 2008 link

Downloads

GitHub link

TreeREx 1.85 linux 32-bit
TreeREx 1.85 linux 64-bit
TreeREx test data

Contact

Send questions, comments, or requests to:

README

Running TreeREx

trex -h

gives an overview of available options

The recommended way to start TreeREx is with options -s -w -W this starts TreeREx as described as in the paper.

-d allows to specify a file for dot output

-o adds alternative scenarios consisting of transpositions and inversions determined from pattern in breakpoints.

-m is per default set to 2. A value of 0 sets it to unlimited. Be careful with this option as it might lead to an exponentially increased run time.

Examples

./trex -f echinodermata_circal.fas -t echinodermata_circal.phy -d echi.dot -s -w -W > echi.out

./trex -f teleost.fas -d teleost.dot -s -w -W > teleost.out

Output can be found in the dot and out files.

Interpreting TreeREx text output

Each local reconstruction starts with a block like

# +++++++++++++++++++++++++++++++++++++++++++++++++
# trex( A2 )
# +++++++++++++++++++++++++++++++++++++++++++++++++

This says that TreeREx is now considering the subtree rooted at A2 in order to resolve the unresolved children of A2 (e.g. A1 and A0). For the resolution TreeREx uses the highest resolved nodes that occur in the subtree.

Then there comes a line like

# 'leaf' set: 4 nodes

The number tells how many resolved nodes have been used. This number can be either 3 or 4.

Next comes a list of the the actual gene orders of the used resolved nodes and all pairwise rearrangement scenarios for them. The following examples explain the notation:

T(a b c, e f, ) is a transposition exchanging the gene sets abc and ef. That is ef is transposed on the other side of abc (or equivalently: abc is transposed on the other side of ef). The order of the sets does not reflect the order of the elements in the gene orders.
Similarly rT(a b c, e f ) is an inverse transpoistion of the elements abc.
R(a b c ) is an inversion of the elements abc.
TDL(abc, de) is a TDRL where the gene set abc is kept in the first copy and gene set de is kept in the second copy.

The rearrangements are included in unordered or ordered, i.e. CREx knows that the order of the rearrangements is irrelevant (unordered) or if CREx does not know or the order matters (ordered).

Next comes a description of the the actual reconstruction attempts that TreeREx made for the subtree. The description starts with the type of method that has been used (one of the following lines)

# --------------- 0-consistency ------------------- 
#---------------- k-consistency -------------------
# --------------- k-consistency (parsimonious) ---- 
# --------------- fallback (ohh ohh) ---------------

For 0-consistency the intersections of the pairwise scenarios are printed (“cut”) and the putative ancestral states that would be implied if the cut would be correct. Then it is written if only one or both of the nodes are reconstructed. In case that both nodes are reconstructed TreeREx is finished with the subtree. If only one node is reconstructed then TreeREx restarts for this node trying to reconstruct the other node.

For k-consistency the considered subset of nodes that is used for the reconstruction is printed. Then the intersections to this subset and the putative ancestral states are listed. This is done for each possible subset. At the end all possible putative ancestral states are printed. Then it is printed if a decision could be made (i.e., if a majority agrees on one the putative ancestral state).

The output for the parsimony k-consistency method looks the same as for k-consistency, but in addition the scores are printed at the end (i.e., for each combination of putative ancestral states: the number of rearrangements from the ancestral states to its children and among them).

For fallback the tested pairs of putative ancestral states are printed and the necessary score, i.e. the minimum number of rearrangements between the two gene orders (Note that the rearrangements to the children are not counted here).

At the end of the file the reconstructed ancestral states are printed and the rearrangements on the edges.

Interpreting TreeREx dot output

The dot file can be transformed in a pdf via dot -Tpdf xyz.dot > xyz.pdf.

Node colors:

green: consistent
yellow: k-consistent
red: inconsistent (fallback)

Node labels:

node name
C consistency value: The fraction of leafs not considered.
For binary trees it is 0.33 for the case of 3 leaf subtrees, and 0.5 for the case of 4 leaf subtrees.
P parsimony value, i.e. how much better the solution was then a possible alternative only given if the solution is based on parsimony.

Usage of TreeREx

D Zhang et al
Three new Diplozoidae mitogenomes expose unusual compositional biases within the Monogenea class: implications for phylogenetic studies
BMC Evolutionary Biology, 2018
Link

BY Zheng, LJ Cao, P Tang, K van Achterberg, AA Hoffmann, HY Chen, XX Chen, SJ Wei
Gene arrangement and sequence of mitochondrial genomes yield insights into the phylogeny and evolution of bees and sphecid wasps (Hymenoptera: Apoidea)
Molecular Phylogenetics and Evolution, 2018
Link

K Yoshizawa, KP Johnson, AD Sweet, I Yao, RL Ferreira, SLCameron
Mitochondrial phylogenomics and genome rearrangements in the barklice (Insecta: Psocodea)
Molecular Phylogenetics and Evolution, 2017
Link

A. Basso, M. Babbucci, M. Pauletto, E. Riginella, T. Patarnello, E. Negrisolo
The highly rearranged mitochondrial genomes of the crabs Maja crispata and Maja squinado (Majidae) and gene order evolution in Brachyura
Sci Rep., 7: 4096, 2017.
Link

B Egger, L Bachmann, B Fromm
Atp8 is in the ground pattern of flatworm mitochondrial genomes
BMC Genomics, 18:414, 2017
Link

OV Popova, KV Mikhailov, MA Nikitin, MD Logacheva, AA Penin, MS Muntyan, OS Kedrova, NB Petrov, YV Panchin, VV Aleoshin
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals
PloS one, 11(10): e0165072, 2016
Link

M. Babbucci, A. Basso, A. Scupola, T. Patarnello, E. Negrisolo
Is it an ant or a butterfly? Convergent evolution in the mitochondrial gene order of Hymenoptera and Lepidoptera
Genome Biology and Evolution, 6(12): 3326–3343, 2014
Link

A Duò, R Bruggmann, S Zoller, M Bernt, CR Grünig
Mitochondrial genome evolution in species belonging to the Phialocephala fortinii sl-Acephala applanata species complex
BMC Genomics, 13 (1):166, 2012
Link