BioMed Research International

Volume 2016 (2016), Article ID 4236858, 7 pages

http://dx.doi.org/10.1155/2016/4236858

## Constructing Phylogenetic Networks Based on the Isomorphism of Datasets

^{1}School of Computer Science, Inner Mongolia University, Hohhot 010021, China

^{2}Department of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China

Received 31 May 2016; Accepted 30 June 2016

Academic Editor: Yungang Xu

Copyright © 2016 Juan Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. Many methods have been proposed, and the most efficient of them, such as CASS, LNETWORK, and BIMLR, are based on the incompatible graph. This paper studies what the methods based on the incompatible graph have in common, the relationship between the incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. For a given topology, we can find all the simplest datasets and construct a network for each of them. For any dataset, a network can then be computed from the network representing the simplest dataset that is isomorphic to it. This saves time for the algorithms when constructing networks.

#### 1. Introduction

The evolutionary history of species is usually represented as a (rooted) phylogenetic tree, in which each species has only one parent. In practice, evolution involves reticulate events such as hybridization, horizontal gene transfer, and recombination [1–5], so a species may have more than one parent. Phylogenetic trees therefore cannot fully describe the evolutionary history of species. Phylogenetic networks are a generalization of phylogenetic trees that can represent reticulate events. They can also represent conflicting evolutionary information that may come from different datasets or different trees [6–9].

Phylogenetic networks can be classified into unrooted [10–12] and rooted networks [4, 13–19]. An unrooted phylogenetic network is an unrooted graph whose leaves are bijectively labelled by the taxa. A rooted phylogenetic network is a rooted directed acyclic graph (DAG for short) whose leaves are bijectively labelled by the taxa [20–22]. Rooted phylogenetic networks have been studied widely for representing the evolution of taxa, because the evolution of species is inherently directed. This paper studies relevant properties of rooted phylogenetic networks constructed from rooted trees.
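As an illustration of the rooted-DAG definition, the following minimal sketch checks that a list of directed edges has exactly one root and no directed cycle, and returns the leaves. The function name `check_rooted_dag`, the `(parent, child)` edge-list encoding, and the use of leaf names as taxon labels are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def check_rooted_dag(edges):
    """Return (root, sorted leaves) if `edges` form a rooted DAG.

    `edges` is a list of (parent, child) pairs (an assumed encoding).
    Raises ValueError if there is not exactly one root or if the
    graph contains a directed cycle.
    """
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    roots = [v for v in nodes if indegree[v] == 0]
    if len(roots) != 1:
        raise ValueError("expected exactly one root")

    # Kahn's algorithm: repeatedly remove nodes whose parents have all
    # been removed. The graph is acyclic iff every node gets removed.
    remaining = {v: indegree[v] for v in nodes}
    stack = list(roots)
    removed = 0
    while stack:
        v = stack.pop()
        removed += 1
        for c in children.get(v, ()):
            remaining[c] -= 1
            if remaining[c] == 0:
                stack.append(c)
    if removed != len(nodes):
        raise ValueError("graph contains a directed cycle")

    # Leaves are the nodes without outgoing edges; in this sketch their
    # names double as the taxon labels.
    leaves = sorted(v for v in nodes if not children.get(v))
    return roots[0], leaves
```

For instance, a hypothetical network with edges `r→u`, `r→v`, `u→a`, `u→h`, `v→h`, `v→c`, `h→b` passes the check with root `r` and leaves `a`, `b`, `c`; node `h` has two parents, which a tree would not allow.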

Algorithms for constructing rooted phylogenetic networks from rooted phylogenetic trees fall into three main types: the cluster network [17], based on the Hasse diagram; the galled network [16], based on the seed-growing algorithm; and CASS [23], LNETWORK [24], and BIMLR [25], based on the decomposition property of networks. In particular, the third type of methods (CASS, LNETWORK, and BIMLR) can construct more precise networks than the other methods. In the following, unless otherwise specified, we refer to rooted phylogenetic networks as networks.

Let $X$ be a set of taxa. A proper subset of $X$ (excluding both $\emptyset$ and $X$) is called a cluster. A cluster $C$ is trivial if $|C| = 1$; otherwise, it is nontrivial. Let $T$ be a rooted phylogenetic tree on $X$; if there is an edge $e$ in $T$ such that the set of taxa which are descendants of $e$ equals $C$, we say that $T$ represents $C$. Figure 1 shows two rooted phylogenetic trees $T_1$ and $T_2$ and all nontrivial clusters represented by $T_1$ and $T_2$; trivial clusters are not listed. Given a network $N$ and a cluster $C$, suppose that for each reticulate node (i.e., a node with more than one incoming edge) exactly one incoming edge is kept and all other incoming edges are disconnected. If, for some such choice, there is a tree edge $e$ (i.e., an edge whose head is not a reticulate node) in $N$ such that the set of taxa which are descendants of $e$ equals $C$, we say that $N$ represents $C$ in the softwired sense. On the other hand, if there is a tree edge $e$ in $N$ such that the set of taxa which are descendants of $e$ equals $C$ when all incoming edges are retained, we say that $N$ represents $C$ in the hardwired sense.
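The difference between the two senses can be made concrete with a small sketch. It assumes a network given as a list of `(parent, child)` pairs whose leaf names are the taxa; the function names and the example network are hypothetical and are not the trees of Figure 1. The softwired version simply enumerates every choice of one parent per reticulate node, so it is exponential in the number of reticulate nodes and is meant only for small illustrations.

```python
from collections import defaultdict
from itertools import product


def _index(edges):
    """Build child and parent adjacency lists from (parent, child) pairs."""
    children, parents = defaultdict(list), defaultdict(list)
    for p, c in edges:
        children[p].append(c)
        parents[c].append(p)
    return children, parents


def _taxa_below(node, children, leaves):
    """Taxa reachable from `node`; only original leaves count as taxa."""
    if node in leaves:
        return frozenset([node])
    below = frozenset()
    for c in children.get(node, ()):
        below |= _taxa_below(c, children, leaves)
    return below


def _clusters(edges, children, parents, leaves):
    """Clusters of tree edges (head has exactly one parent in the network)."""
    found = set()
    for p, c in edges:
        if len(parents[c]) == 1:
            cluster = _taxa_below(c, children, leaves)
            if 0 < len(cluster) < len(leaves):  # proper, nonempty subset
                found.add(cluster)
    return found


def hardwired_clusters(edges):
    children, parents = _index(edges)
    leaves = {v for v in parents if v not in children}
    return _clusters(edges, children, parents, leaves)


def softwired_clusters(edges):
    children, parents = _index(edges)
    leaves = {v for v in parents if v not in children}
    reticulate = [v for v in parents if len(parents[v]) > 1]
    found = set()
    # Try every way of keeping one incoming edge per reticulate node.
    for choice in product(*(parents[v] for v in reticulate)):
        kept = dict(zip(reticulate, choice))
        switched = defaultdict(list)
        for p, c in edges:
            if c not in kept or kept[c] == p:
                switched[p].append(c)
        found |= _clusters(edges, switched, parents, leaves)
    return found


def nontrivial(clusters):
    return {c for c in clusters if len(c) > 1}
```

On the hypothetical network with edges `r→u`, `r→v`, `u→a`, `u→x`, `x→b`, `x→h`, `v→h`, `v→c`, `h→d`, where `h` is the only reticulate node, the nontrivial hardwired clusters are {a, b, d}, {b, d}, and {c, d}. The softwired sense additionally yields {a, b}, which arises when the edge `x→h` is disconnected.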