UNIVERSITÄT LEIPZIG
Faculty of Mathematics and Computer Science, Institute of Computer Science
Parallel Computing and Complex Systems


Tarzan: Phylogeny Software for Determining Cophylogenies

Example applications:
Cophylogeny of host and parasite trees
Cophylogeny of gene and species trees

Further information in:
D. Merkle, M. Middendorf: Reconstruction of the Cophylogenetic History of Related Phylogenetic Trees with Divergence Timing Information
Theory in Biosciences, 123(4): 277-299, 2005

Download: Tarzan-v0.9.jar
Start Tarzan with java -cp Tarzan-v0.9.jar jungle/Tarzan
The preferred Java version is JDK 1.3.

An example of a tree pair that can be used with Tarzan can be downloaded here.
Some corresponding explanations can be found here.

Tarzan development team: Steffen Junick, Daniel Merkle, Martin Middendorf
(Thanks go to Roman Legat for his work on a first version of Tarzan.)

The name Tarzan is inspired by the fact that Tarzan finds reconstructions as subtrees of a data structure that contains associations of nodes in the parasite tree with nodes or edges in the host tree (a similar data structure is used by the program TreeMap and was called Jungle by M.A. Charleston [Math. Biosciences, 149 (1998)]).


Short description:
Tarzan uses an event-based method to find cost-minimal reconstructions or reconstructions that have a minimal (or maximal) number of certain evolutionary events.

Five different types of evolutionary events are considered: cospeciation, duplication, sorting, switching, and extinction. For host-parasite systems, cospeciation events refer to simultaneous host and parasite speciation, duplication events are independent parasite speciations, sorting events correspond to lineage sorting, and switches correspond to host shifts.


Screenshots:
Screenshot showing the four main windows of Tarzan.

Shown is a cost minimal reconstruction for a small gopher lice example.

Screenshot showing the shifting of a switch due to the ranks of the nodes. Shifted switches are drawn pink. The landing site is shifted, such that parasite 5 lands before node 25. Landing on the edge between node 25 and 27 is not allowed.

Tarzan has a graphical user interface that consists of the following four main windows.

1. Tree editor window: used to define and edit the phylogenetic trees interactively. Nodes of the trees can be labelled, e.g., with the corresponding species names. Divergence times can be defined by a time zone labelling for one tree and a time interval labelling for the other tree. The mapping function Phi, which defines the current relations between the leaves of one tree and the nodes of the other tree, can be defined simply by drawing lines between the related nodes. Alternatively, the trees, their names, the divergence time information, and the mapping function can be defined by modifying a corresponding text file.

2. Association triple viewer: shows the candidate data structure containing the association triples (can be calculated after the phylogenetic trees and the mapping function have been defined).

3. Reconstruction table window: shows the calculated reconstructions with the number of events of each type and the resulting costs (after the event costs have been set); can show all reconstructions or only the cheapest reconstructions.

4. Reconstruction viewer window: double-clicking a row in the reconstruction table window depicts the corresponding reconstruction in this window (moreover, all association triples used for the reconstruction are marked in the association triple viewer). The listed reconstructions can be sorted by cost or by the number of events of each type (by clicking on the corresponding column head).

Note: Switches can lead to timing incompatibilities within a reconstruction. Therefore, Tarzan automatically checks every reconstruction for switch incompatibilities and tries to resolve them by pulling back the landing site of switches so that only a minimal number of sortings have to be introduced. Because the corresponding problem is NP-complete and Tarzan is meant to be a fast tool, it is not guaranteed that Tarzan can resolve all incompatibilities (see the paper for more details). Tarzan lists the incompatibilities between switches that have been resolved, together with the corresponding possible move-back operations.

Additional features of Tarzan are:
i) Tarzan can compute not only any cheapest reconstruction but also reconstructions that are optimal with respect to other criteria. Moreover, a hierarchy of criteria can be defined. In that case, optimization is done first with respect to the most important criterion; then, among all optimal solutions found, optimization is done with respect to the second most important criterion, and so forth. Possible criteria are (in each case either a minimum or a maximum is possible): cost, number of cospeciations, number of duplications, number of sortings, number of host switches, number of extinctions.
ii) The maximal number of cheapest reconstructions that are computed by Tarzan can be set by the user.
iii) Tarzan can also list all possible reconstructions, which could be interesting for cases where not too many reconstructions exist.
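
The hierarchy of criteria described in i) amounts to a lexicographic filter. The following is a minimal Python sketch of that idea, not Tarzan's actual implementation: the function name `select_by_hierarchy`, the dictionary keys, and the four sample reconstructions are all hypothetical and chosen only for illustration.

```python
def select_by_hierarchy(reconstructions, criteria):
    """Filter reconstructions criterion by criterion.

    criteria is a list of (key, mode) pairs, most important first;
    mode is "min" or "max". Only reconstructions that are optimal for
    one criterion are passed on to the next one.
    """
    candidates = list(reconstructions)
    for key, mode in criteria:
        if not candidates:
            break
        pick = min if mode == "min" else max
        best = pick(r[key] for r in candidates)
        candidates = [r for r in candidates if r[key] == best]
    return candidates


# Hypothetical reconstructions (made-up numbers, not Tarzan output).
recs = [
    {"id": "A", "cost": 4, "cospeciations": 2, "switches": 1},
    {"id": "B", "cost": 4, "cospeciations": 3, "switches": 2},
    {"id": "C", "cost": 5, "cospeciations": 4, "switches": 0},
    {"id": "D", "cost": 4, "cospeciations": 3, "switches": 1},
]

# Minimize cost first, then maximize cospeciations, then minimize switches.
best = select_by_hierarchy(
    recs, [("cost", "min"), ("cospeciations", "max"), ("switches", "min")]
)
print([r["id"] for r in best])  # ['D']
```

Reconstruction C has the most cospeciations and the fewest switches, but it is discarded in the first step because its cost is not minimal. This is the sense in which earlier criteria dominate later ones.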

Steps to use Tarzan:

1. Start Tarzan with java -cp Tarzan-v0.9.jar jungle/Tarzan
2. Create or load new trees in the tree editor window
3. Choose build data structure under Build (this opens a Reconstructions Viewer)
4. Check or change the cost table and minimization criteria under Options in the new Reconstructions Viewer
5. List all or only the cost-minimal reconstructions under View in the Reconstructions Viewer (this opens a reconstruction table)
6. Double-click the row of a reconstruction you are interested in (this opens a Reconstruction Viewer)

More information:

-> Steps to use Tarzan in more detail

-> Functions available in Tarzan windows

Note that the cost assignments used in Tarzan and TreeMap differ. For example, the following cost assignment is often used in the literature: in Tarzan, cospeciation costs are -2, duplication costs are 2, sorting costs are 1, and switch costs are 2. This corresponds to the cost assignment c=-1, d=1, s=1, h=1 (switch) of TreeMap.
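
As a rough illustration of how per-event costs combine into the cost of a reconstruction, the Python sketch below assumes that the total cost is the sum, over event types, of the number of events times the cost per event. This linear model is an assumption made here for illustration. The cost values are the Tarzan assignment quoted above, while the function name `reconstruction_cost` and the event counts are hypothetical. How Tarzan and TreeMap count events is not covered here; see the page on reconstructions and the cost model for that.

```python
# Per-event costs in the Tarzan assignment quoted above.
TARZAN_COSTS = {"cospeciation": -2, "duplication": 2, "sorting": 1, "switch": 2}


def reconstruction_cost(event_counts, costs=TARZAN_COSTS):
    """Assumed cost model: sum of (number of events) * (cost per event)."""
    return sum(costs[event] * n for event, n in event_counts.items())


# Hypothetical event counts for one reconstruction (not taken from Tarzan output).
example = {"cospeciation": 3, "duplication": 1, "sorting": 2, "switch": 1}
total = reconstruction_cost(example)
print(total)  # 3*(-2) + 1*2 + 2*1 + 1*2 = 0
```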

More information on how reconstructions are computed and how costs are defined in Tarzan can be found in:
->Tarzan: Reconstructions and cost model

-> Some problems and hints when using Tarzan

Some other webpages that mention Tarzan:

Phylogeny Programs list that is maintained by Joe Felsenstein

Genamics list of phylogeny analysis software

BiologyBrowser list of sources on evolution and phylogeny

Review of "Tangled Trees" by David A. Morrison, Newsletter 128 of the Australian Botany Society, page 32, 2006.


Last change 4.1.2007