This tutorial gives a basic introduction for constructing
phylogenetic networks and adding parameters to trees or networx objects
using phangorn
(K. P. Schliep 2011) in R. Splits graphs
or phylogenetic networks are a useful way to display conflicting data or
to summarize different trees. Here, we present two popular networks,
consensus networks (Holland et al. 2004)
and Neighbor-Net (Bryant and Moulton
2004).
Trees or networks are often missing either edge weights or edge support
values. We show how to improve a tree/networx object by adding support
values or estimating edge weights using non-negative Least-Squares
(nnls).
We first load the phangorn package and a few data sets we use in this vignette.
library(phangorn)
data(Laurasiatherian)
data(yeast)
A consensusNet (Holland et al. 2004) is a generalization of a consensus tree. Instead of only representing splits (taxon bipartitions) occurring in at least 50% of the trees in a bootstrap or MCMC sample one can use a lower threshold and explore competing splits. Note that, in its basic implementation used here, the consensusNet edge lengths are proportional to the frequency of the corresponding splits in the provided list of trees.
The input for consensusNet
is a list of trees i.e. an
object of class multiPhylo
.
set.seed(1)
<- bootstrap.phyDat(yeast, FUN = function(x)nj(dist.hamming(x)),
bs bs=100)
<- nj(dist.hamming(yeast))
tree par("mar" = rep(1, 4))
<- plotBS(tree, bs, "phylogram") tree
<- consensusNet(bs, .3)
cnet plot(cnet, show.edge.label=TRUE)
In many cases, consensusNet
will return more than two
incompatible (competing) splits. This cannot be plotted as a planar
(2-dimensional) graph. Such as situation requires a n-dimensional graph,
where the maximum number of dimensions equals the maximum number of
incompatible splits. For example, if we have three alternative
incompatible splits: (A,B)|(C,D) vs. (A,C)|(B,D) vs. (A,D)|(B,C), we
need a 3-dimensional graph to show all three alternatives. A nice way to
get still a good impression of the network is to plot it in 3D.
plot(cnet, "3D")
# rotate 3d plot
play3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)
# create animated gif file
movie3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)
which will result in a spinning graph similar to this
The function neighborNet
implements the popular method
of Bryant and Moulton (2004). The
Neighbor-Net algorithm is essentially a 2D-version of the Neighbor
joining algorithm. The Neighbour-net is computed in two steps: the first
computes a circular ordering of the taxa in the data set; the second
step involves the estimation of edge weights using non-negative
Least-Squares (nnls).
<- dist.hamming(yeast)
dm <- neighborNet(dm)
nnet par("mar" = rep(1, 4))
plot(nnet)
The advantage of Neighbor-Net is that it returns a circular split
system which can be always displayed in a planar (2D) graph. The
rendering of the networx
is done using the the fantastic
igraph package (Csardi and Nepusz
2006).
We can use the generic function addConfidences
to add
(branch) support values from a tree, i.e. an object of class
phylo
to a networx
, splits
or
phylo
object. The Neighbor-Net object we computed above
provides no support values. We can add the support values from the tree
we computed to the splits (edges) shared by both objects.
<- addConfidences(nnet, tree)
nnet par("mar" = rep(1, 4))
plot(nnet, show.edge.label=TRUE)
Analogously, we can also add support values to a tree:
<- rNNI(tree, 2)
tree2 <- addConfidences(tree2, tree)
tree2 # several support values are missing
par("mar" = rep(1, 4))
plot(tree2, show.node.label=TRUE)
Consensus networks, on the other hand, provide primarily information about support values corresponding to a split, but no information about the actual difference between the taxon bipartitions defined by that split. For example, one may be interested how the alternative support values correspond with the actual genetic distance between the involved taxa. Given a distance matrix, we can estimate edge weights using non-negative Least-Squares, and plot them onto the consensusNet splits graph.
<- nnls.networx(cnet, dm)
cnet par("mar" = rep(1, 4))
plot(cnet, show.edge.label=TRUE)
The functions read.nexus.networx
and
write.nexus.networx
can read and write nexus files for or
from SplitsTree (Huson and Bryant 2006).
Check-out the new vignette IntertwiningTreesAndNetworks (K. Schliep et al. 2017) for additional
functions, examples, and advanced application.
sessionInfo()
## R version 4.2.2 Patched (2022-11-10 r83330)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
##
## locale:
## [1] LC_CTYPE=de_AT.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_AT.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=de_AT.UTF-8 LC_MESSAGES=de_AT.UTF-8
## [7] LC_PAPER=de_AT.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] phangorn_2.11.1 ape_5.6-2
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.9 bslib_0.4.2 compiler_4.2.2 jquerylib_0.1.4
## [5] highr_0.10 tools_4.2.2 digest_0.6.31 jsonlite_1.8.4
## [9] evaluate_0.20 lifecycle_1.0.3 nlme_3.1-161 lattice_0.20-45
## [13] pkgconfig_2.0.3 rlang_1.0.6 Matrix_1.5-3 fastmatch_1.1-3
## [17] igraph_1.3.5 cli_3.6.0 rstudioapi_0.14 yaml_2.3.6
## [21] parallel_4.2.2 xfun_0.36 fastmap_1.1.0 stringr_1.5.0
## [25] knitr_1.41 generics_0.1.3 vctrs_0.5.1 sass_0.4.4
## [29] grid_4.2.2 glue_1.6.2 R6_2.5.1 rmarkdown_2.20
## [33] magrittr_2.0.3 codetools_0.2-18 htmltools_0.5.4 quadprog_1.5-8
## [37] stringi_1.7.12 cachem_1.0.6