TreeREx Overview

TreeREx is an algorithm for assigning rearrangements to the edges of a given phylogenetic tree and reconstructing ancestral gene orders at the inner nodes as described in:

M. Bernt, D. Merkle, and M. Middendorf
An Algorithm for Inferring Mitochondrial Genome Rearrangements in a Phylogenetic Tree
Comparative Genomics International Workshop, RECOMB-CG 2008, 5267 Lecture Notes in Bioinformatics (LNBI), 143-157, 2008


Datei Geändert am Größe
TreeREx 1.85 linux 32-bit 2013-01-07 11:24:46 am 865735 bytes
TreeREx 1.85 linux 64-bit 2013-01-07 11:24:30 am 911917 bytes
TreeREx test data 2013-01-07 11:22:20 am 1234 bytes


Send questions, comments, or requests to:


Running TreeREx

trex -h

gives an overview of available options

The recommended way to start TreeREx is with options -s -w -W this starts TreeREx as described as in the paper.

-d allows to specify a file for dot output

-o adds alternative scenarios consisting of transpositions and inversions determined from pattern in breakpoints.

-m is per default set to 2. A value of 0 sets it to unlimited. Be careful with this option as it might lead to an exponentially increased run time.


./trex -f echinodermata_circal.fas -t echinodermata_circal.phy -d -s -w -W > echi.out

./trex -f teleost.fas -d -s -w -W > teleost.out

Output can be found in the dot and out files.

Interpreting TreeREx text output

Each local reconstruction starts with a block like

# +++++++++++++++++++++++++++++++++++++++++++++++++
# trex( A2 )
# +++++++++++++++++++++++++++++++++++++++++++++++++

This says that TreeREx is now considering the subtree rooted at A2 in order to
resolve the unresolved children of A2 (e.g. A1 and A0). For the resolution TreeREx uses the highest resolved nodes that occur in the subtree.

Then there comes a line like

# 'leaf' set: 4 nodes

The number tells how many resolved nodes have been used. This number can be either 3 or 4.

Next comes a list of the the actual gene orders of the used resolved nodes and all pairwise rearrangement scenarios for them. The following examples explain the notation:

  1. T(a b c, e f, ) is a transposition exchanging the gene sets abc and ef. That is ef is transposed on the other side of abc (or equivalently: abc is transposed on the other side of ef). The order of the sets does not reflect the order of the elements in the gene orders.
  2. Similarly rT(a b c, e f ) is an inverse transpoistion of the elements abc.
  3. R(a b c ) is an inversion of the elements abc.
  4. TDL(abc, de) is a TDRL where the gene set abc is kept in the first copy and gene set de is kept in the second copy.

The rearrangements are included in unordered or ordered, i.e. CREx knows that the order of the rearrangements is irrelevant (unordered) or if CREx does not know or the order matters (ordered).

Next comes a description of the the actual reconstruction attempts that TreeREx made for the subtree. The description starts with the type of method that has been used (one of the following lines)

# --------------- 0-consistency ------------------- 
#---------------- k-consistency -------------------
# --------------- k-consistency (parsimonious) ---- 
# --------------- fallback (ohh ohh) ---------------

For 0-consistency the intersections of the pairwise scenarios are printed (“cut”) and the putative ancestral states that would be implied if the cut would be correct. Then it is written if only one or both of the nodes are reconstructed. In case that both nodes are reconstructed TreeREx is finished with the subtree. If only one node is reconstructed then TreeREx restarts for this node trying to reconstruct the other node.

For k-consistency the considered subset of nodes that is used for the reconstruction is printed. Then the intersections to this subset and the putative ancestral states are listed. This is done for each possible subset. At the end all possible putative ancestral states are printed. Then it is printed if a decision could be made (i.e., if a majority agrees on one the putative ancestral state).

The output for the parsimony k-consistency method looks the same as for k-consistency, but in addition the scores are printed at the end (i.e., for each combination of putative ancestral states: the number of rearrangements from the ancestral states to its children and among them).

For fallback the tested pairs of putative ancestral states are printed and the necessary score, i.e. the minimum number of rearrangements between the two gene orders (Note that the rearrangements to the children are not counted here).

At the end of the file the reconstructed ancestral states are printed and the rearrangements on the edges.

Interpreting TreeREx dot output

The dot file can be transformed in a pdf via dot -Tpdf > xyz.pdf.

Node colors:

  • green: consistent
  • yellow: k-consistent
  • red: inconsistent (fallback)

Node labels:

  • node name
  • C consistency value: The fraction of leafs not considered.
    For binary trees it is 0.33 for the case of 3 leaf subtrees, and 0.5 for the case of 4 leaf subtrees.
  • P parsimony value, i.e. how much better the solution was then a possible alternative only given if the solution is based on parsimony.

Usage of TreeREx

OV Popova, KV Mikhailov, MA Nikitin, MD Logacheva, AA Penin, MS Muntyan, OS Kedrova, NB Petrov, YV Panchin, VV Aleoshin
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals
PloS one, 2016

M. Babbucci, A. Basso, A. Scupola, T. Patarnello, E. Negrisolo
Is it an ant or a butterfly? Convergent evolution in the mitochondrial gene order of Hymenoptera and Lepidoptera
Genome Biology and Evolution, 2014

A Duò, R Bruggmann, S Zoller, M Bernt, CR Grünig
Mitochondrial genome evolution in species belonging to the Phialocephala fortinii sl-Acephala applanata species complex
BMC Genomics 2012 13 (1), 166