Abstract

The maximum weighted clique (MWC) problem is a typical NP-complete problem and is difficult to solve with conventional electronic computer algorithms. The aim is to find a vertex clique with the maximal weight sum in a given undirected graph. It is an important problem in optimal engineering design and control, with numerous practical applications. From a practical point of view, we give a parallel biological algorithm to solve the MWC problem. For a maximum weighted clique problem with $m$ edges and $n$ vertices, we use fixed-length DNA strands to represent the vertices and edges, carry the biochemical reactions through to completion, and find the solution to the MWC problem within a bounded strand-length range in $O(n^2)$ time complexity, compared with the exponential time required by previous computer algorithms. We expand the applied scope of parallel biological computation and reduce the computational complexity of practical engineering problems. We also provide a useful reference for solving other complex problems.

1. Introduction

DNA computing is an interdisciplinary field that uses DNA biotechnology to solve complex practical engineering problems. In 1994, Adleman [1] used DNA molecular operations to solve the Hamiltonian path problem with $n$ vertices in $O(n)$ time complexity and, in doing so, demonstrated the strong parallelism of DNA computing. In 1995, Lipton [2] solved the NP-complete satisfiability problem by building on Adleman’s biochemical experiment. Since then, DNA biological computing has attracted growing interest from scholars in different disciplines. It has three advantages: high parallelism, low energy consumption, and large memory capacity. Many researchers have designed DNA procedures and algorithms that solve various complicated NP-complete problems [3–21], which has promoted the development of DNA computing. To apply DNA computing theory to a broader range of practical engineering sciences, it is worth trying to solve more intractable problems with DNA molecular computing. Furthermore, most previous work on DNA computing focused on path search problems, whose solutions are edge or vertex sets ligated head to tail in sequence, so the candidate solutions can be represented by DNA strands relatively easily. Some practical engineering problems, such as the maximum weighted clique problem, are instead discrete set problems with no sequentially connected path. How to represent discrete data on DNA strands is therefore a key to expanding the applied scope of DNA computing.

The maximum weighted clique problem has a wide range of applications in optimal engineering design and computational mathematics. In this paper, a DNA algorithm based on the research of Adleman [1] and Lipton [2] is used to solve the maximum weighted clique problem. The rest of the paper is organized as follows. Section 2 introduces the parallel biological computing model in detail. Section 3 uses the DNA molecular algorithm to solve the maximum weighted clique problem. Section 4 proves the correctness and feasibility of the DNA algorithm and derives its computational complexity. Section 5 walks through the algorithm on an example, and Section 6 presents the conclusions.

2. The Parallel Biological Computing Model

DNA is the material basis of biological heredity and is a polymer strung together from deoxyribonucleotides. It is built from four kinds of bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The order and combination of the bases store genetic information. An important feature of DNA is that two single strands can form a double strand through complementary base pairing. Moreover, the pairing is highly specific: A can only pair with T, and G can only pair with C. The length of a DNA single strand is counted by the number of bases. For example, a single strand of 5 bases is called a 5-mer.
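The pairing rule above can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and the example strand is an arbitrary 5-mer chosen for illustration, not a sequence from this paper.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a single strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand` in antiparallel orientation."""
    return complement(strand)[::-1]


def can_anneal(a: str, b: str) -> bool:
    """True if two equal-length single strands are fully complementary."""
    return len(a) == len(b) and reverse_complement(a) == b


strand = "ATGCA"                       # a 5-mer: its length is five bases
partner = reverse_complement(strand)   # the single strand it can pair with
print(len(strand), partner, can_anneal(strand, partner))
```

The check is deliberately strict (full complementarity only); real hybridization also tolerates partial matches, which this sketch ignores.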

Based on the research of Adleman [1] and Lipton [2], the DNA biological algorithm operations are described as follows. These biological operations can be used to obtain the solution of the maximum weighted clique problem. In the parallel biological computing model, we can perform the following operations on given tubes, each of which contains a set of DNA strands.
(1) Given a test tube, produce another test tube containing the same strands.
(2) Given two test tubes, pour the strands of both into the first tube and leave the second tube empty.
(3) Given a test tube, generate all feasible double strands in it by annealing. The products and residues remain in the same tube after annealing.
(4) Given a test tube and a set of strands, remove all single strands in the set from the tube and collect the removed strands in another tube.
(5) Given a tube, ligate together the strands in it.
(6) Given a tube, pick out the shortest strands into a second tube and the longest strands into a third tube; the remaining strands are kept in the original tube. This is the “Sort” operation.
(7) Given a test tube, dissociate every double strand in it into a pair of single strands.
(8) Given a tube, describe (read out) each single strand in it.
(9) Given a test tube and a single strand, append the single strand to the end of each strand in the tube. This is the “append” operation.
(10) Given a test tube, discard the strands in it and leave it empty.

Since the above operations are realized through a limited number of biological experimental procedures on DNA strands [18], we can reasonably conclude that each operation takes $O(1)$ time.
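To make the tube operations more concrete, the following minimal sketch models a test tube as a multiset of strings. It is a software analogy only, not the laboratory procedure: every function name is hypothetical, the separation step is assumed to select strands that contain a given probe substring, and the chemical operations (annealing, ligation, dissociation) are left out.

```python
from collections import Counter


def copy_tube(tube):
    """Produce a second tube holding the same strands."""
    return Counter(tube)


def merge(first, second):
    """Pour the second tube into the first and leave the second empty."""
    first.update(second)
    second.clear()


def separate(tube, probes):
    """Move every strand containing any probe substring into a new tube (assumed rule)."""
    removed = Counter()
    for strand in list(tube):
        if any(probe in strand for probe in probes):
            removed[strand] = tube.pop(strand)
    return removed


def append_tail(tube, tail):
    """Append the same single strand to the end of every strand in the tube."""
    extended = Counter({strand + tail: count for strand, count in tube.items()})
    tube.clear()
    tube.update(extended)


def sort_by_length(tube):
    """Move the shortest strands and the longest strands into two new tubes."""
    if not tube:
        return Counter(), Counter()
    lengths = [len(strand) for strand in tube]
    lo, hi = min(lengths), max(lengths)
    shortest = Counter({s: c for s, c in tube.items() if len(s) == lo})
    longest = Counter({s: c for s, c in tube.items() if len(s) == hi and hi != lo})
    for strand in list(shortest) + list(longest):
        del tube[strand]
    return shortest, longest


def discard(tube):
    """Throw away the contents of a tube."""
    tube.clear()


tube = Counter({"AAC": 2, "GGTT": 1, "ACGTA": 1})
backup = copy_tube(tube)
removed = separate(tube, ["GG"])
append_tail(tube, "TT")
shortest, longest = sort_by_length(tube)
```

Each function touches every strand once in software, whereas in the biological model a single operation acts on all strands in the tube at the same time, which is why it is counted as one step.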

3. Biological Algorithm for the Maximum Weighted Clique Problem

An undirected simple graph consists of a vertex set, in which every vertex carries a positive weight value, and an edge set. A vertex subset is called a clique of the graph if every two distinct vertices in the subset are linked by an edge of the graph, and the weight of the clique is the sum of the weights of its vertices. The maximum weighted clique problem is to find a vertex clique of the graph with the maximal weight sum. For example, the undirected simple graph in Figure 1 defines an instance of the MWC problem.
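The definition above can be checked on small instances with a conventional brute-force search, which also serves as a reference answer for the DNA algorithm. The sketch below is not part of the biological method; the function names are hypothetical, and the 5-vertex graph and its weights are made up for illustration and are not the graph of Figure 1.

```python
from itertools import combinations


def is_clique(subset, edges):
    """A subset is a clique if every pair of its vertices is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def max_weight_clique(weights, edges):
    """Enumerate all non-empty vertex subsets and keep the heaviest clique."""
    vertices = list(weights)
    best, best_weight = (), 0
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_clique(subset, edges):
                total = sum(weights[v] for v in subset)
                if total > best_weight:
                    best, best_weight = subset, total
    return set(best), best_weight


# Hypothetical instance (NOT Figure 1): vertex -> positive weight, plus an edge list.
weights = {1: 4, 2: 2, 3: 5, 4: 3, 5: 6}
edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]}

clique, total = max_weight_clique(weights, edges)
print(clique, total)
```

The search visits all $2^n - 1$ non-empty subsets, which is the exponential cost that the parallel biological approach aims to avoid.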

Next, every symbol used by the algorithm is encoded as a distinct DNA single strand, and all symbol strands have the same length. This length is best chosen as a small integer, which can be determined by the scale of the problem. In the following algorithms, each vertex is represented by two single-strand symbols: one indicates that the vertex is in the vertex subset, and the other indicates that it is not. Two further symbols mark the division between different vertex subsets. The weight value of each vertex is encoded by a DNA single strand whose length depends on that weight. To distinguish whether an edge belongs to the graph or not, we also design DNA strings for the corresponding vertex pairs and place them in a tube.

For an $n$-vertex graph, every vertex subset can be expressed by an $n$-bit binary value. The $i$th bit set to 1 means that vertex $i$ is in the subset; the $i$th bit set to 0 means that vertex $i$ is not in the subset. Taking Figure 1 as an example, a vertex subset can be expressed by a binary value such as 01101. In the same way, we can represent the vertex subsets of an $n$-vertex simple graph as a series of $n$-bit binary numbers.

(1) We generate all possible vertex subsets of the graph by six successive manipulations. After these six manipulations, the single strands in the tube represent all possible vertex subsets. For example, in Figure 1, one of the single strands denotes the vertex subset corresponding to the binary value 11011. This step is executed in $O(1)$ time complexity, since it consists of six operations and every operation can be finished in $O(1)$.

(2) Every strand in the tube denotes one vertex subset. A solution of the maximum weighted clique problem is a vertex subset in which every two vertices are connected by an edge of the graph. Therefore, we check whether each vertex subset in the tube satisfies this condition. If two vertices are not joined by an edge of the graph, we discard the strands indicating that both vertices are in the same subset. For example, in Figure 1, the single strands representing a vertex subset that contains two non-adjacent vertices are discarded, because the graph has no edge connecting those vertices. In this way we select all possible vertex cliques of the graph. The procedure consists of two nested “For” loops with six operations, of which operations (4) to (6) are executed only when the “If” condition of operation (3) holds. After these operations, all the single strands in the tube represent vertex cliques. Because the procedure includes two nested “For” clauses and each operation can be finished in $O(1)$, this step is executed in $O(n^2)$ time complexity.

(3) The solution of the maximum weighted clique problem is a vertex clique of maximal weight, in which every two vertices are linked by an edge of the graph. So we select the maximal-weight subset from all the vertex cliques. If a vertex is included in the vertex subset, we append an additional weight strand to the end of the subset strand, so that the optimal solution strand can be identified. For a single strand representing a given vertex subset, the weight strands of the vertices in that subset are appended to the end of the strand. This step is carried out by one “For” loop with four operations per iteration; thus it can be finished in $O(n)$ time complexity.

(4) We select the single strands with the longest length from the tube; these strands represent the solutions of the maximum weighted clique problem. This step uses two operations. For example, in Figure 1, the strands with the largest length give the solution of the maximum weighted clique problem, which is a vertex subset with weight sum 15.
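The four steps above can be mimicked in software as a sanity check. The sketch below is an in-silico analogy under stated assumptions, not the biochemical protocol: bit strings stand in for the subset strands, the leftmost bit is assumed to correspond to vertex 1, each weight tail is assumed to have a length equal to the vertex weight, and the 5-vertex instance is hypothetical (it is not the graph of Figure 1).

```python
from itertools import product


def solve_mwc_by_strands(weights, edges, unit="A"):
    """Simulate the four steps on strings; vertices are numbered 1..n."""
    n = len(weights)

    # Step 1: every n-bit string is one candidate subset strand.
    tube = ["".join(bits) for bits in product("01", repeat=n)]

    # Step 2: for every non-adjacent vertex pair, discard strands holding both.
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i + 1, j + 1)) not in edges:
                tube = [s for s in tube if not (s[i] == "1" and s[j] == "1")]

    # Step 3: append one tail per included vertex; tail length = vertex weight (assumed).
    tailed = []
    for strand in tube:
        tail = "".join(unit * weights[i + 1] for i in range(n) if strand[i] == "1")
        tailed.append((strand, strand + tail))

    # Step 4: the longest strands encode the maximum weighted cliques.
    longest = max(len(full) for _, full in tailed)
    return [(bits, len(full) - n) for bits, full in tailed if len(full) == longest]


# Hypothetical instance (NOT Figure 1).
weights = {1: 4, 2: 2, 3: 5, 4: 3, 5: 6}
edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]}

solutions = solve_mwc_by_strands(weights, edges)
print(solutions)
```

In this simulation the list holds up to $2^n$ strings and each filter pass scans all of them, so the software version remains exponential; the polynomial bound in the paper relies on each tube operation acting on all strands in parallel.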

4. The Feasibility and Computational Complexity of the Parallel Biological Computing Algorithm

Theorem 1. The maximum weighted clique problem for an $n$-vertex graph can be solved by the biological computing algorithm.

Proof. First, after Step 1 we obtain all possible vertex subsets in the test tube. For the maximum weighted clique problem, if two vertices are not joined by an edge of the graph, they must not be in the same subset. Therefore, the basic biological manipulations of Step 2 remove the illegal combinations from the solution space strands and keep the legal ones. At Step 3, we append a series of “tails” to the end of the strands, one for each vertex included in the vertex subset. Because the length of each appended tail reflects the weight of its vertex, the longest strands in the pool represent the solutions of the maximum weighted clique problem. Finally, we search for and obtain the solution at the last step.

Theorem 2. The maximum weighted clique problem for an $n$-vertex graph can be solved in $O(n^2)$ time complexity using DNA molecular computing.

Proof. The parallel biological computing algorithm can be executed entirely in finite time: Steps 1 and 4 in $O(1)$, Step 2 in $O(n^2)$, and Step 3 in $O(n)$ time complexity. The total complexity of the algorithm is therefore $O(1) + O(n^2) + O(n) + O(1) = O(n^2)$. In conclusion, we can obtain the solutions of the maximum weighted clique problem with $n$ vertices in $O(n^2)$ time complexity.
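The growth rate in the proof can be illustrated by counting tube operations as a function of the number of vertices. The per-step counts below are assumptions made for this sketch: six operations in Step 1, up to six per vertex pair in Step 2 (assuming the nested loops range over all unordered pairs), four per vertex in Step 3, and two in Step 4. The function name is hypothetical.

```python
def count_operations(n, step1=6, per_pair=6, per_vertex=4, step4=2):
    """Upper bound on the number of tube operations for an n-vertex graph (assumed counts)."""
    pairs = n * (n - 1) // 2          # iterations of the two nested loops in Step 2
    return step1 + per_pair * pairs + per_vertex * n + step4


for n in (5, 10, 20, 40):
    print(n, count_operations(n))
```

Doubling the number of vertices roughly quadruples the count for large instances, which is the quadratic behaviour stated in Theorem 2.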

Theorem 3. The solution strands of the maximum weighted clique problem with $n$ vertices can be found within a finite length range.

Proof. After Step 1, the single strands in the tube denote all possible vertex subsets. At Step 2, the single strands that remain represent all possible vertex cliques. Because the symbol strands are designed with fixed lengths, the length of the strands collected after Step 2 is determined by the design. Whether a weight strand is appended depends on whether the corresponding vertex information is present on the strand. Since the number of vertices in a subset is between 0 and $n$, the strands after the “append” operation also lie in a finite length range. For the maximum weighted clique problem, the length of the solution strands is therefore bounded, and we can obtain the solution in an appropriate length range at Step 4.

5. The Detailed Approach and Walkthrough of the Biological Computing Algorithm

Taking Figure 1 as an example, we describe the result of each step. Because the biological computing algorithm depends on basic biochemical reactions of DNA molecules, which may introduce errors, it is important to make biological computing more reliable through DNA molecular sequence design. To achieve better performance in hybridization reactions, we follow [22] for the sequence design. For the problem of Figure 1, the program generates 3-base random sequences to represent the symbols. If a generated DNA sequence fails any of the constraints, the program regenerates a new DNA sequence. If the constraints are satisfied, the new DNA sequence is accepted. Once all the DNA strands satisfy the constraints, the program has succeeded and these sequences are the outputs. The corresponding vertex symbol sequences are shown in Table 1. In accordance with the above design, we obtain the symbol representations of all vertex subsets in Table 2 after Step 1. Step 2 discards the inappropriate vertex combination sequences and retains the vertex clique sequences in Table 3. At Step 3, we append the corresponding weight sequences, which are shown in Table 4. Through the “Sort” operation at Step 4, we find the optimal solution to the maximum weighted clique problem of Figure 1 in Table 5.
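The generate-and-test loop described above can be sketched as follows. The actual constraints are those of [22] and are not reproduced here; the two constraints used in this sketch (a GC-content window and a minimum Hamming distance to every accepted sequence and to its reverse complement) are stand-ins chosen for illustration, and the function names, sequence length, and thresholds are likewise hypothetical.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(PAIR[base] for base in reversed(seq))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def passes_constraints(candidate, accepted, min_distance=3):
    """Stand-in constraints (assumed, not those of [22])."""
    gc = (candidate.count("G") + candidate.count("C")) / len(candidate)
    if not 0.3 <= gc <= 0.7:
        return False
    for other in accepted:
        if hamming(candidate, other) < min_distance:
            return False
        if hamming(candidate, reverse_complement(other)) < min_distance:
            return False
    return True


def design_sequences(count, length, seed=0, max_tries=100_000):
    """Generate random sequences, rejecting and regenerating until all pass."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == count:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if passes_constraints(candidate, accepted):
            accepted.append(candidate)
    if len(accepted) < count:
        raise RuntimeError("could not satisfy the constraints; relax them or increase the length")
    return accepted


symbols = design_sequences(count=6, length=10)
print(symbols)
```

A fixed seed keeps the run reproducible; stricter constraints or shorter sequences make rejection more frequent, so the retry limit guards against an endless loop.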

6. Conclusions

In this paper, we present a parallel computing algorithm based on biological operations to solve the maximum weighted clique problem. Because DNA biological computing offers high parallelism, low energy consumption, and large memory capacity, in contrast to the lower speed and limited memory of electronic computers, DNA computing has attracted more and more attention. Compared with previous algorithms, our proposed algorithm has the following features: we use fixed-length DNA strands to generate the solution strands of the problem, so the algorithm has a lower error rate in hybridization operations; the length of the solution strands grows in linear proportion to the size of the instance, and the time cost grows only quadratically. For an undirected simple $n$-vertex graph, the parallel biological computing algorithm solves the maximum weighted clique problem in $O(n^2)$ time complexity, which is lower than the exponential complexity of previous algorithms. Although the operations in this paper rest on a theoretical model, the capacity to execute complicated operations in the algorithm could help us understand more about the nature of computing and promote better and faster development of biocomputing, which in turn helps us solve complex practical engineering problems.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by the Open Research Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research (Grant no. 2014ZY05), and “12th Five-Year Plan” to Support Science and Technology Project (Grant no. 2012BAB04B02). The project was also supported by CNSF (Grant nos. 61272098, 51409050, and 51108376), SSF (Grant no. 2014JQ7231), and Doctor fund of Shanghai Ocean University (A-2400-12-0000351).