Tarzan: Phylogeny software for the reconstruction of cophylogenies
Example applications:
Cophylogeny of host trees and related parasite trees
Cophylogeny of gene trees and related species trees
More information in
D. Merkle, M. Middendorf: Reconstruction of the Cophylogenetic History
of Related Phylogenetic Trees with Divergence Timing Information
Theory in Biosciences, 123(4): 277-299, 2005
Download: Tarzan-v0.9.jar
Start Tarzan with java -cp Tarzan-v0.9.jar jungle/Tarzan
The preferred Java version is JDK 1.3.
An example of a tree pair that can be used with Tarzan can be downloaded here.
Some corresponding explanations can be found here.
Tarzan development team: Steffen Junick, Daniel Merkle, Martin Middendorf
(We thank Roman Legat for his work on the first version of Tarzan.)
The name Tarzan stems from the fact that the program finds reconstructions as subtrees within a data structure that consists of connected triples of associations between nodes in the parasite tree and nodes or edges in the host tree (a similar data structure used by the program TreeMap has been called Jungle by M.A. Charleston [Math. Biosciences, 149 (1998)]).
Five different types of evolutionary events are considered: cospeciation, duplication, sorting, switching, and extinction. For host-parasite systems, cospeciation events refer to simultaneous host and parasite speciation, duplication events are independent parasite speciations, sorting events correspond to lineage sorting, and switches correspond to host shifts.
Tarzan has a graphical user interface that consists of the following four main windows.
1. Tree editor window: to define and edit the phylogenetic trees interactively; nodes of the trees can be labelled, e.g., with the corresponding species names; divergence times can be defined by a time zone labelling for one tree and a time interval labelling for the other tree; the mapping function Phi, which defines the current relations between the leaves of one tree and nodes of the other tree, can be defined simply by drawing lines between the related nodes; alternatively, the trees, their names, the divergence time information, and the mapping function can also be defined by modifying a corresponding text file.
2. Association triple viewer: shows the candidate data structure containing the association triples (can be calculated after the phylogenetic trees and the mapping function have been defined).
3. Reconstruction table window: shows the calculated reconstructions with the number of events of each type and the resulting costs (after the event costs have been set); can show all reconstructions or only the cheapest reconstructions.
4. Reconstruction viewer window: double-clicking a row in the reconstruction table window depicts the corresponding reconstruction in this window (moreover, all association triples used for the reconstruction are marked in the association triple viewer). The listed reconstructions can be sorted with respect to costs or the number of the different events (by clicking on the corresponding column head).
Note: Switches can lead to timing incompatibilities within a reconstruction. Therefore, Tarzan automatically checks every reconstruction for switch incompatibilities and tries to resolve them by pulling back the landing site of switches so that only a minimal number of sortings have to be introduced. However, because the corresponding problem is NP-complete and Tarzan is meant to be a fast tool, it is not guaranteed that Tarzan can resolve all incompatibilities (see the paper for more details). Incompatibilities between switches that have been resolved and the corresponding possible move-back operations are listed by Tarzan.
Additional features of Tarzan are:
i) Tarzan can compute not only cheapest reconstructions but also reconstructions that are
optimal with respect to other criteria. Moreover, a hierarchy of criteria can be defined (in that case
optimization is done first with respect to the most important criterion, then within all optimal solutions found
optimization is done with respect to the second most important criterion, and so forth).
Possible criteria are (in each case a minimum or a maximum is possible): cost, number of cospeciations,
number of duplications, number of sortings, number of host switches, number of extinctions.
ii) The maximal number of cheapest
reconstructions that are computed by Tarzan can be set by the user.
iii) Tarzan can
also list all possible reconstructions, which could be interesting for cases where not
too many reconstructions exist.
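The hierarchy of criteria described in i) amounts to a lexicographic selection. The following Java sketch illustrates the idea only; it is not taken from Tarzan's source. The class CriteriaHierarchySketch, the Reconstruction fields, and the sample values are hypothetical, and the sketch uses Java 8 features rather than the JDK 1.3 that Tarzan itself prefers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of a hierarchy of criteria; not Tarzan's actual code. */
class CriteriaHierarchySketch {

    /** One reconstruction, reduced to the values needed here (field names are assumptions). */
    static final class Reconstruction {
        final String name;
        final int cost;
        final int cospeciations;

        Reconstruction(String name, int cost, int cospeciations) {
            this.name = name;
            this.cost = cost;
            this.cospeciations = cospeciations;
        }
    }

    /**
     * Keeps the reconstructions that are optimal for the first criterion, then,
     * among those, the ones optimal for the second criterion, and so forth.
     * Each comparator orders "better" reconstructions first.
     */
    static List<Reconstruction> select(List<Reconstruction> all,
                                       List<Comparator<Reconstruction>> hierarchy) {
        List<Reconstruction> current = new ArrayList<>(all);
        for (Comparator<Reconstruction> criterion : hierarchy) {
            if (current.isEmpty()) {
                break;
            }
            Reconstruction best = current.get(0);
            for (Reconstruction r : current) {
                if (criterion.compare(r, best) < 0) {
                    best = r;
                }
            }
            List<Reconstruction> kept = new ArrayList<>();
            for (Reconstruction r : current) {
                if (criterion.compare(r, best) == 0) {
                    kept.add(r); // ties survive to the next criterion
                }
            }
            current = kept;
        }
        return current;
    }

    /** Made-up sample data: minimize cost first, then maximize cospeciations. */
    static List<Reconstruction> demo() {
        List<Reconstruction> all = Arrays.asList(
            new Reconstruction("A", 4, 2),
            new Reconstruction("B", 4, 3),
            new Reconstruction("C", 6, 4));
        Comparator<Reconstruction> minCost = Comparator.comparingInt(r -> r.cost);
        Comparator<Reconstruction> minCospeciations = Comparator.comparingInt(r -> r.cospeciations);
        Comparator<Reconstruction> maxCospeciations = minCospeciations.reversed();
        return select(all, Arrays.asList(minCost, maxCospeciations));
    }

    public static void main(String[] args) {
        for (Reconstruction r : demo()) {
            System.out.println(r.name); // prints B
        }
    }
}
```

In the sample data, A and B tie on the minimal cost of 4, and B is kept because it has more cospeciations. Reversing a comparator turns a minimization criterion into a maximization criterion, which mirrors the "minimum or maximum" choice offered for each criterion.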
Steps to use Tarzan:
1. Start Tarzan with java -cp Tarzan-v0.9.jar jungle/Tarzan
2. Create or load new trees in the tree editor window
3. Choose build data structure under Build (this opens a Reconstructions
Viewer)
4. Check or change the cost table and minimization criteria under Options in the new Reconstructions
Viewer
5. List all or only the cost-minimal reconstructions under View in the Reconstructions
Viewer (this opens a reconstruction table)
6. Double-click on the row of a reconstruction you are interested in (this opens a Reconstruction Viewer)
More information:
-> Steps to use Tarzan in more detail
-> Functions available in Tarzan windows
Note that the cost assignments used in Tarzan and TreeMap differ. For example, the following cost assignment is often used in the literature: in Tarzan, cospeciation costs are -2, duplication costs are 2, sorting costs are 1, and switch costs are 2. This corresponds to the cost assignment c=-1, d=1, s=1, h=1 (switch) of TreeMap.
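As a rough illustration of how such a cost table could be applied, the Java sketch below stores the Tarzan values quoted above and scores a reconstruction from its event counts. It assumes that the total cost is simply the sum of (event count x event cost); the cost model page linked below describes how Tarzan actually defines costs. The class CostTableSketch, its methods, and the event counts are hypothetical, and extinction is left out because the quoted assignment gives no cost for it.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical illustration of a cost table; not Tarzan's actual code. */
class CostTableSketch {

    /** Event types for which the quoted assignment gives a cost. */
    enum Event { COSPECIATION, DUPLICATION, SORTING, SWITCH }

    /** The Tarzan cost assignment quoted in the text. */
    static Map<Event, Integer> tarzanCosts() {
        Map<Event, Integer> costs = new EnumMap<>(Event.class);
        costs.put(Event.COSPECIATION, -2);
        costs.put(Event.DUPLICATION, 2);
        costs.put(Event.SORTING, 1);
        costs.put(Event.SWITCH, 2);
        return costs;
    }

    /** Assumption: total cost is the sum over event types of count times cost. */
    static int totalCost(Map<Event, Integer> counts, Map<Event, Integer> costs) {
        int total = 0;
        for (Map.Entry<Event, Integer> entry : counts.entrySet()) {
            total += entry.getValue() * costs.get(entry.getKey());
        }
        return total;
    }

    /** Made-up event counts for one reconstruction. */
    static Map<Event, Integer> exampleCounts() {
        Map<Event, Integer> counts = new EnumMap<>(Event.class);
        counts.put(Event.COSPECIATION, 4);
        counts.put(Event.DUPLICATION, 1);
        counts.put(Event.SORTING, 3);
        counts.put(Event.SWITCH, 1);
        return counts;
    }

    public static void main(String[] args) {
        // 4*(-2) + 1*2 + 3*1 + 1*2 = -1
        System.out.println(totalCost(exampleCounts(), tarzanCosts())); // prints -1
    }
}
```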
More information on how reconstructions are computed and how costs are
defined in Tarzan can be found in:
-> Tarzan: Reconstructions and cost model
-> Some problems and hints when using Tarzan
Some other webpages that mention Tarzan:
Phylogeny Programs list that is maintained by Joe Felsenstein
Genamics list of phylogeny analysis software
BiologyBrowser list of sources on evolution and phylogeny
Review of "Tangled Trees" by David A. Morrison, Newsletter 128 of the Australian Botany Society, page 32, 2006.