Abstract

The Chinese postman problem is a classic resource allocation and scheduling problem that is widely used in practice. Because it is regarded as a classical nondeterministic polynomial problem, finding an efficient algorithm for it has long been a research goal. In this paper, a new bioinspired algorithm based on molecular computation is proposed to solve the Chinese postman problem; molecular computation offers high computational efficiency, large storage capacity, and strong parallel computing ability. In the computation, DNA chains are used to represent the vertices, edges, and corresponding weights, and all possible path combinations are then generated through biochemical reactions. The feasible solution space is obtained by deleting the infeasible solution chains, and the optimal solution is then extracted from it. The computational complexity and feasibility of the DNA algorithm are then proved. A comparison shows that the computational complexity of the DNA algorithm is significantly better than that of previous algorithms. The correctness of the algorithm is verified by simulation experiments. As biological operation technology matures, this algorithm has broad application prospects in solving large-scale combinatorial optimization problems.

1. Background

The Chinese postman problem (CPP) was first raised by Guan Meigu [1] in the 1960s. It is based on the question: “How should a postman choose a route so that he walks every street on which he is responsible for delivering letters while traveling the shortest total distance?” It arises in applications in many fields, such as municipal solid waste collection [2], winter gritting [3], material distribution and transportation [4], and intelligent transportation [5]. Over the past half century, the study of the Chinese postman problem has developed many branches, such as the directed Chinese postman problem (DCPP) [6], the mixed Chinese postman problem (MCPP) [7], the windy postman problem (WPP) [8], and the hierarchical Chinese postman problem (HCPP) [9]. Because of its wide application, the Chinese postman problem has been studied extensively, including the design and improvement of solution methods, variants of the problem, and applications. In 1965, Edmonds [10] gave the definition of the Chinese postman problem and proposed an algorithm that finds the shortest route by first computing a minimum-weight matching of the graph in order to construct an Euler graph. Later, in 2002, Yin et al. [11] used molecular biology technology to solve the Chinese postman problem. In 2003, Jiang and Li [12] solved the problem on an undirected graph with an evolutionary algorithm and achieved good results; their algorithm can also be extended to the directed Chinese postman problem. In 2005, Fei and Cui [13] proposed a new search algorithm, CPDPA, based on the idea of the dynamic programming decision-making process, which realized dynamic programming for the Chinese postman problem for the first time and effectively solved the optimal path problem of the NK model.
In 2009, Li and Wang [14] proposed a new solution algorithm that used polymerase chain reaction technology to eliminate non-solutions, combined the surface-based DNA computing method with fluorescence labeling technology, and finally extracted the optimal solution from all feasible solutions. In 2011, Zoraida [15] used the thermodynamic properties of DNA and chemical reactions with other biomolecules to improve the accuracy of the experiment and, following the specified operation steps, obtained the optimal solution of the Chinese postman problem. In 2016, Lee [16] presented an Euler circuit algorithm for the Chinese postman problem. The algorithm selects the edge with the least weight among the edges adjacent to each vertex to obtain a minimum spanning tree and finally obtains the shortest path edges of the odd-degree vertices.

In addition, arc routing optimization problems similar to the Chinese postman problem have also received wide attention. Nossack et al. [17] introduce the first model of the windy rural postman problem with zigzag option and time windows, discuss two mixed-integer programming formulations for the problem, and suggest exact solution approaches. Meng et al. [18] investigate a class of the colored traveling salesman problem (CTSP), called serial CTSP (S-CTSP), and present a population-based incremental learning (PBIL) approach to it. By adding a powerful local search operation, 2-opt, to the algorithm, they further enhance its search ability. Wang and Lu [19] propose a memetic algorithm with competition (MAC) to solve the capacitated green vehicle routing problem (CGVRP). Carmine et al. [20] develop a two-stage solution approach for the Directed Rural Postman Problem with Turn Penalties (DRPP-TP) and describe an integer linear program that is combined with a local search algorithm. This combination produces high-quality solutions to the DRPP-TP in a reasonable amount of computing time. To solve the Chinese and windy postman problem with variable service costs (CPPVSC), Keskin et al. [8] propose two heuristics for searching for a solution. Xu et al. [21] develop a Delaunay-triangulation-based Variable Neighborhood Search (DVNS) algorithm to solve a general colored traveling salesman problem (GCTSP). Wang et al. [22] first introduce a multiobjective MTPDPTW-MP (MO-MTPDPTWMP) with three objectives to better describe the real-world scenario and then propose a multiobjective iterated local search algorithm with adaptive neighborhood selection (MOILS-ANS) to solve it.

In real life, the Chinese postman problem is a classical optimization problem with wide practical use. Therefore, designing an efficient algorithm to solve it has great practical value. However, as the scale of the problem increases, traditional algorithms are severely constrained by computational efficiency and storage limits, and solving the problem in polynomial time becomes increasingly impractical. As a new intelligent algorithm, DNA computing combines parallel computing with very large storage capacity, which makes it one of the potential tools for solving the problem efficiently.

The Chinese postman problem concerns minimizing the mail delivery distance in a certain area. The postman starts from the post office every day, goes through all the streets in the area, and then returns to the post office. The question is how he should arrange the delivery route so as to minimize the total distance. In the language of graph theory, the Chinese postman problem is to find a generalized Euler tour, with a fixed starting (and ending) vertex, for which the sum of the weights of all traversed edges is minimized. Given an undirected connected graph G = (V, E), V represents the vertex set and E represents the set of edges connecting the vertices in V. A cost (distance) matrix is associated with E. In general, we assume that one designated vertex is both the starting and the ending node of the generalized Euler tour. The mathematical definition of the Chinese postman problem is as follows:

The optimal route of the Chinese postman problem generally has the following limitations:
(1) The route is continuous, starting from and returning to the same designated vertex
(2) Each edge is traversed at least once by the route
(3) The weight sum of the route is the minimum
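For orientation, the three limitations above can be checked against the classical matching-based approach mentioned in Section 1. The following is a minimal sketch, not the DNA algorithm of this paper: it assumes vertices are labelled 0 to n-1 (a hypothetical labelling), that the graph is connected, and that the instance is small enough for an exhaustive pairing of odd-degree vertices. The function name `chinese_postman_cost` is illustrative only.

```python
def chinese_postman_cost(n, edges):
    """Minimum weight of a closed route that covers every edge at least once.

    n     -- number of vertices, labelled 0..n-1 (hypothetical labelling)
    edges -- list of (u, v, w) undirected weighted edges of a connected graph
    """
    inf = float("inf")
    # All-pairs shortest paths (Floyd-Warshall) and vertex degrees.
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    degree = [0] * n
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
        degree[u] += 1
        degree[v] += 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    # Vertices of odd degree force some edges to be walked twice.
    odd = [v for v in range(n) if degree[v] % 2 == 1]

    def best_pairing(vertices):
        # Exhaustive minimum-weight pairing; only suitable for small inputs.
        if not vertices:
            return 0
        first, rest = vertices[0], vertices[1:]
        return min(
            dist[first][partner]
            + best_pairing([x for x in rest if x != partner])
            for partner in rest
        )

    # Every edge once, plus the cheapest repeated paths between odd vertices.
    return sum(w for _, _, w in edges) + best_pairing(odd)
```

On a graph whose vertices all have even degree, the result is simply the sum of the edge weights, because an Euler tour already exists; otherwise the extra term is the cost of the repeated edges.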

We use Figure 1 as an example of the Chinese postman problem. The optimal route starts from the designated vertex, passes through all the edges at least once, and finally returns to that vertex, with the total weight of the route minimized. Through mathematical calculation, the optimal route that meets these requirements can be obtained as follows:

In this paper, we will use the DNA algorithm of the Adleman–Lipton model to solve the Chinese postman problem. The feasibility and computational complexity of the algorithm are analyzed, which proves that the algorithm has higher computational efficiency, and the correctness of the algorithm is also verified by simulation experiments.

The structure of this paper is as follows. Section 2 introduces the research status of DNA computing and the basic knowledge of the Adleman–Lipton model. Section 3 uses a parallel DNA computing algorithm to solve the Chinese postman problem by different operations. The feasibility and computational complexity of the algorithm are analyzed in Section 4. Section 5 gives the simulation results of the algorithm. Finally, in Section 6, we make a summary evaluation of the algorithm and put forward the future research direction.

2. Biocomputing and Modeling

With the rise and development of molecular biology, the structure and function of biomacromolecules have gradually been clarified. It has been found that the interactions of biomacromolecules, which follow the laws of biology, chemistry, and physics, form a computing process similar to information transfer and processing, which gave rise to biomacromolecule computing. The concept of biocomputing was first demonstrated on a route-finding problem related to the famous “salesman problem”: researchers solved a seven-node instance with DNA in test tubes, opening a new era of DNA computing research. In 1994, Adleman [23] used the DNA computing method to solve a well-known computational problem, the directed Hamiltonian path problem. As the number of variables increases, the problem becomes increasingly difficult for conventional methods, whereas DNA computing can explore the candidate solutions in parallel. Adleman’s experiments demonstrate the feasibility of DNA for special-purpose computing and verify the advantages of molecular computing in parallelism and high-density storage of solutions. Its novelty is that it set a precedent for solving mathematical problems by molecular-level computing and encouraged the application of DNA computing in broader fields.

Subsequently, DNA computing has been used by many scientists to solve various nondeterministic polynomial (NP) mathematical problems, and many typical biological computing models have been established at the same time. For example, in 1995, Lipton [24] addressed the satisfiability (SAT) problem with DNA computing, arguing that DNA-based computing offers a large efficiency improvement over traditional electronic computers. In 1997, Ouyang et al. [25] proposed a method to solve the maximal clique problem by using molecular biology technology. They map all possible cliques onto a set of binary numbers to form a data pool and then filter and sort it. The effective solution of this problem provides strong evidence that DNA computing can solve NP-complete problems in the future. At present, DNA computing has become a hot topic in the research of new computing models. Through different operation and control technologies for DNA molecules, new algorithms can be formed to provide strong technical support for many problems without effective solutions [26–39]. For the Chinese postman problem, as the data volume and scale increase, traditional algorithms find it increasingly difficult to solve. Therefore, high-performance computing algorithms need to be proposed and tried; the DNA algorithm is one of them.

DNA is a substance that stores genetic information in living cells. It is a very thin and long polymer compound composed of a series of deoxynucleotide chains, which in turn are composed of deoxyribose, phosphate, and nitrogen-containing bases. The DNA algorithm mainly uses these bases, of which there are four kinds, namely, adenine (A), guanine (G), cytosine (C), and thymine (T). The four types of bases have Watson–Crick complementarity, that is, two opposing bases meet the complementary characteristics, where A is complementary to T and G is complementary to C. Over billions of years of evolution, cells have created and perfected enzymes with special functions that can copy information from DNA molecules and pass it on to other DNA molecules. The Adleman–Lipton model uses basic biological experimental operations and uses strings of the four bases A, G, C, and T in DNA as the means of expressing information, with different information represented by different codes. This provides storage capacity for large amounts of information. Similarly, different enzymes are used for different operations to achieve a given goal. Although the fields are different, the principles are the same. In computation, the length of a single-stranded DNA is the number of nucleotides that make up the strand. Therefore, if a single-stranded DNA contains 30 nucleotides, we consider the strand to have length 30 and call it a 30-mer.
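The Watson–Crick pairing described above can be written down directly. The sketch below is illustrative and not part of the paper's program; the names `complement`, `reverse_complement`, and `can_anneal` are hypothetical, and strands are assumed to be plain strings over A, G, C, and T read in the 5' to 3' direction.

```python
# Watson-Crick pairing: A pairs with T, and G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement of a strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Complementary strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


def can_anneal(strand_a, strand_b):
    """True if the two single strands are perfect Watson-Crick partners."""
    return strand_b == reverse_complement(strand_a)


def length_in_mer(strand):
    """Length of a single strand, counted in nucleotides (mer)."""
    return len(strand)
```

For example, `reverse_complement("AAGC")` gives `"GCTT"`, and a strand of 30 characters corresponds to a 30-mer in the terminology above.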

In the Adleman–Lipton model, for given test tubes (experimental test tubes) that contain a finite number of DNA strands consisting of A, G, C, and T, we can perform the following nine operations:
(1) For two given test tubes, the union of their contents is stored in the first tube and the second tube is left empty.
(2) For a given test tube, all feasible double strands are produced in the tube.
(3) For a given test tube, each double strand in the tube is dissociated into two single strands.
(4) For a given test tube and a given set of strings, all single strands containing a string from the set are removed from the tube, and a second test tube is produced with the removed strands.
(5) For a given test tube, the strands in the tube are discarded.
(6) For a given test tube, a second test tube is produced with the same strands as the first.
(7) Cleavage: given a test tube and specified strings, every strand containing them is cut into three strands.
(8) The shortest strands of a given test tube are moved to a second tube and the longest to a third tube, and the rest remain stored in the original tube.
(9) For a given test tube, a single molecule contained in the tube is described. Even if the tube contains many different molecules, each encoding a different set of bases, the operation gives an explicit description of exactly one of them.
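To make the tube operations concrete, the sketch below models a test tube as a Python list of single-strand strings and implements the operations that do not depend on double-strand chemistry. All function names here are hypothetical labels chosen for illustration, and the sequential list operations only imitate what the laboratory steps do in parallel.

```python
def merge(tube_1, tube_2):
    """Store the union of both tubes in tube_1 and leave tube_2 empty."""
    tube_1.extend(tube_2)
    tube_2.clear()


def separation(tube, patterns):
    """Remove every strand containing any pattern; return the removed strands."""
    removed = [s for s in tube if any(p in s for p in patterns)]
    tube[:] = [s for s in tube if not any(p in s for p in patterns)]
    return removed


def discard(tube):
    """Throw away all strands in the tube."""
    tube.clear()


def copy_tube(tube):
    """Produce a second tube with the same strands."""
    return list(tube)


def sort_by_length(tube):
    """Move the shortest strands to one tube and the longest to another.

    Strands of intermediate length stay in the original tube.
    """
    if not tube:
        return [], []
    low = min(len(s) for s in tube)
    high = max(len(s) for s in tube)
    shortest = [s for s in tube if len(s) == low]
    longest = [s for s in tube if len(s) == high and high != low]
    tube[:] = [s for s in tube if low < len(s) < high]
    return shortest, longest


def read(tube):
    """Describe exactly one strand of the tube, or None if it is empty."""
    return tube[0] if tube else None
```

Because every function touches each strand once, a sequential simulation costs time proportional to the tube size, whereas the model itself treats each operation as a single laboratory step.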

Since the above nine operations can each be realized by a fixed series of operations on the biological chains, it can reasonably be deduced that the complexity of each operation is O(1) [40]. This assumption has been used in previous studies to analyze the complexity of DNA computing [41–46]. We use these operations to implement algorithms that solve the Chinese postman problem.

3. Algorithm Design for the Chinese Postman Problem

In this section, the basic idea of the algorithm is introduced first, and then the symbols in the algorithm are set uniformly. Finally, Figure 1 is taken as an example of the Chinese postman problem to demonstrate the algorithm.

3.1. Preliminary Ideas

The basic idea to solve the Chinese postman problem by DNA computing is as follows: first, the corresponding vertices and edges of the Chinese postman problem are represented by specific symbols; then, all possible DNA chains in the problem are generated by biochemical reactions, and the chains that meet the constraint conditions of the problem are screened. Finally, through search and recognition, the optimal solution corresponding to the biological chain is obtained, and then, the optimal route of the problem is obtained.

Correspondingly, the algorithm is divided into four steps:
Step 1: generate all possible random routes
Step 2: filter out routes that start at a fixed node (post office) and end at the same node
Step 3: select all routes that pass through all edges at least once
Step 4: find the shortest generalized Euler closed loop, get the shortest route, and output the optimal solution
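The four steps can be imitated in software on a very small graph. The sketch below is a sequential, exhaustive stand-in for the parallel biochemical reactions, not the DNA algorithm itself: it assumes a hypothetical bound `max_steps` on the number of edges in a route (twice the number of edges is always enough for an optimal tour), and its pool grows exponentially with that bound.

```python
def solve_by_generate_and_filter(edges, start, max_steps):
    """Brute-force imitation of the four steps on a small undirected graph.

    edges     -- list of (u, v, w) weighted edges
    start     -- designated start/end vertex (the post office)
    max_steps -- assumed upper bound on the number of edges in a route
    """
    adjacency = {}
    for idx, (u, v, w) in enumerate(edges):
        adjacency.setdefault(u, []).append((v, idx, w))
        adjacency.setdefault(v, []).append((u, idx, w))

    # Step 1: generate every route of 1..max_steps edges from every vertex.
    # Each entry is (vertex sequence, set of edge indices used, total weight).
    frontier = [([v], frozenset(), 0) for v in adjacency]
    pool = []
    for _ in range(max_steps):
        frontier = [
            (route + [nxt], used | {idx}, cost + w)
            for route, used, cost in frontier
            for nxt, idx, w in adjacency[route[-1]]
        ]
        pool.extend(frontier)

    # Step 2: keep routes that start and end at the designated vertex.
    closed = [r for r in pool if r[0][0] == start and r[0][-1] == start]

    # Step 3: keep routes that pass through every edge at least once.
    every_edge = frozenset(range(len(edges)))
    feasible = [r for r in closed if r[1] == every_edge]

    # Step 4: output the route with the smallest total weight.
    if not feasible:
        return None
    route, _, cost = min(feasible, key=lambda r: r[2])
    return route, cost
```

The exponential size of `pool` is exactly the cost that the DNA approach aims to absorb through massive parallelism; in this simulation it limits the method to toy instances.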

3.2. Graph Theory Expression with DNA Strands

In this paper, in order to clarify and standardise the expression of our algorithm, it is necessary to define and explain the symbols and notations in the algorithm. Therefore, we use the symbols in Table 1 to illustrate.

In DNA computing, the optimal path is mainly determined by the length of the feasible solution strands, so designing the lengths of the strands that encode the problem elements is both necessary and critical. In this paper, these strand lengths are therefore fixed in mer through positive integer parameters. Obviously, the length of the single-stranded DNA in the experiment depends largely on the size of the problem involved. At the same time, we use the complementary strands to synthesize double strands. Generally, we assume that one designated node is both the beginning and the ending point of the route for the Chinese postman problem.

3.3. Algorithm Description

For the Chinese postman problem, we represent the nodes as vertices and the streets as edges, and we assign each edge a weight.

Let us start with the initial tubes , , and :

We use the Chinese postman problem in Figure 1 (beginning and ending nodes are ) as an example to demonstrate the process of the algorithm. Let

In Figure 1, the edge set matrix composed of binary variables is as follows:

After the above seven operations, we generate DNA strands to represent different postman routes. Taking the problem in Figure 1 as an example, the chain represents the route .

For example, in Figure 1, a route that does not pass through every edge should be excluded. After the operations, all DNA single strands in the tube represent routes that start and end at the designated vertex and pass through all edges at least once.

Finally, we can obtain the exact solution strands for the Chinese postman problem.

3.3.1. Generate a Data Pool and Filter Out the Routes That Start and End with the Specified Vertex

The procedure is provided in Algorithm 1.

We can obtain all the possible route strands (from the starting vertex to the ending vertex) of the Chinese postman problem in a test tube.
(1-1) ;
(1-2) ;
(1-3) ;
(1-4) ;
(1-5) ;
(1-6) ;
(1-7) .
3.3.2. Path through All Edges

The procedure is provided in Algorithm 2.

In the Chinese postman problem, each edge should be traversed at least once. In other words, a route cannot be a feasible path if it does not go through all the edges. We judge the information of the edge that the route passes by and screen out the routes that do not meet the restriction.
 For to
  For to
(2-1)
(2-2-1) ;
(2-2-2) ;
(2-2-3) ;
(2-2-4) ;
(2-2-5) ;
(2-2-6) ;
(2-2-7) ;
  Else
(2-3) .
  End for
 End for
3.3.3. Eliminate the Influence of Vertex Chain Length

The procedure is provided in Algorithm 3.

The feasible solution chains contain the information of the chain length passing through the vertex. The optimal solution based on the total chain length will be affected by the number of vertices the route passes through. Therefore, it is necessary to exclude the vertex chain information contained in the original feasible chain.
 For to
(3-1) ;
(3-2) ;
(3-3) ;
(3-4) ;
(3-5) ;
(3-6) ;
(3-7) ;
(3-8) ;
(3-9) ;
(3-10) .
 End for
 In Figure 1, the edge route can be coded as .
3.3.4. Reconfirm That the Route Passes All Edges At Least Once

The procedure is provided in Algorithm 4.

Because the complementary chains of the vertices are used to generate the weight information chains of the edges, some paths that do not pass through all the edges will also be mixed into the tube, so these chains should be excluded. We check again which edges each path passes through and retain only the feasible path chains in which every edge is traversed at least once.
 For to
  For to
(4-1)
(4-2-1) ;
(4-2-2) ;
(4-2-3) ;
(4-2-4) ;
(4-2-5) ;
(4-2-6) ;
(4-2-7) ;
  Else
(4-3) .
  End for
 End for
3.3.5. Get the Solution

The procedure is provided in Algorithm 5.

In many different routes, the optimal solution of the Chinese postman problem has the minimum weight value. We search for the shortest DNA strand in the last test tube to indicate the result of the Chinese postman problem.
(5-1) ;
(5-2) .
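The idea behind this final step, that the shortest remaining strand corresponds to the route of minimum weight, can be illustrated with a simple length encoding. The sketch below rests on an assumption that is not taken from the paper: each edge weight w is encoded by a strand of w times `unit` nucleotides, where `unit` is a hypothetical parameter. The function names are illustrative.

```python
import random


def weight_strand(weight, unit, rng):
    """Hypothetical encoding: a random strand of weight * unit nucleotides."""
    return "".join(rng.choice("ACGT") for _ in range(weight * unit))


def encode_route(weights, unit=2, seed=0):
    """Concatenate the weight strands of the edges along a route."""
    rng = random.Random(seed)
    return "".join(weight_strand(w, unit, rng) for w in weights)


def shortest_strands(tube):
    """Return all strands of minimum length in the tube."""
    shortest = min(len(s) for s in tube)
    return [s for s in tube if len(s) == shortest]


# Two feasible routes, given by the weights of the edges they traverse.
tube = [encode_route([1, 2, 3]), encode_route([2, 3, 3, 2], seed=1)]
best = shortest_strands(tube)
```

Under this encoding the strand length equals `unit` times the total route weight, so selecting by length and selecting by weight pick the same routes.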

4. Feasibility and Time Complexity of the Proposed DNA Algorithm

We should prove that the proposed DNA algorithm can find the DNA chain corresponding to the optimal solution in a certain length range and solve the Chinese postman problem in a certain time complexity.

Theorem 1. The proposed DNA algorithm can solve the Chinese postman problem.

Proof. We use DNA strands with different symbols to express the corresponding nodes and edges of the problem and use biochemical reactions with complementary strands to generate all possible route strands in the test tube. Therefore, after Algorithm 1, each DNA strand represents a path that starts from the specified fixed vertex and ends at it. Another constraint on the feasible solutions is that each edge must be traversed at least once, so we use Algorithm 2 to retain the chains that pass through all specified edges. The chains after Algorithm 2 can be described as follows:

In order to select the optimal solution among the feasible chains, the weights of the feasible chains need to be compared. Because the original chains contain the information of the vertices they pass through, we filter out the vertex information in the chains through Algorithm 3. Algorithm 4 excludes the infeasible solution chains that are mixed in by the synthesis of complementary chains in Algorithm 3. At this point, the chains that pass through all edges at least once and contain only edge weight information are retained in the test tube. After this step, the chains can be expressed as follows:

In addition, the strand lengths are set as in Section 3.2. The length of a chain is given by

Therefore, the length of a chain changes in the same direction as the total edge weight of the path it represents. In Algorithm 5, we obtain the optimal solution to the problem by comparing the lengths of the DNA chains.

Theorem 2. The Chinese postman problem can be solved at the time complexity level.

Proof. Since each biochemical operation is completed in O(1) time complexity [41–46], according to the steps we designed, the Chinese postman problem can be solved in a finite number of steps. The time complexity of the DNA algorithm is as follows:

Therefore, according to the analysis, the algorithm can solve the Chinese postman problem at this time complexity level.

5. Simulation Experiments of DNA Algorithms

To verify the feasibility of the algorithm, simulation is necessary. DNA computation depends on the accuracy of the biochemical molecular operations; inaccurate operations lead to the accumulation and amplification of errors in biochemical reactions. At the same time, the design should exclude DNA sequences that may promote unintended hybridization within the probe library. Therefore, designing appropriate DNA sequences is the necessary basis for ensuring the accuracy of DNA computation. To achieve these objectives, the experiments comply with the seven constraints on DNA sequences proposed in [47]. These constraints include the following: no library sequence has more than 7 matching bases in any 8-base window with itself or with any other library sequence; no run of 5 or more consecutive identical nucleotides appears in any library or probe sequence; no probe sequence has more than 7 matches with any 8-base window of any library sequence; a probe binds only weakly where it is not intended to bind; and so on. Similarly, the experiments use the sequence design method in references [40, 47–52].

In this paper, the computational molecular biology tool Biopython is used as the system development platform, and Braich’s program is used to generate “appropriate DNA sequences” suitable for the biological computing algorithm, which can be used to solve the CPP. Braich’s program and the other simulations run on a Windows 7 machine with an Intel Core-XP CPU and 8 GB of main memory, using Python 3.6. Our modified program constructs a random sequence of the 4 bases for each bit of the library and checks whether the library chain meets the seven constraints on DNA sequences [47]. When a DNA chain is generated, the first step is to determine whether the chain satisfies the restrictions specified in references [40, 47–52]. If the generated DNA sequence does not meet the restrictions, the program discards it and generates a new sequence. When the restrictions are met, the sequence is accepted for the subsequent biochemical reactions. Using this method, we can obtain “appropriate DNA sequences” that meet the restrictions and thereby improve the accuracy of the biochemical reactions.
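The generate-and-check loop described above can be sketched as follows. This is not Braich's program or the authors' modified version; it is a simplified illustration that enforces only two of the listed constraints (no run of 5 identical nucleotides, and no 8-base window shared verbatim with an accepted sequence), and the function names, the seed, and the `max_tries` cap are assumptions.

```python
import random


def has_long_run(seq, limit=5):
    """True if seq contains `limit` or more consecutive identical bases."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run >= limit:
            return True
    return False


def shares_window(seq, library, window=8):
    """True if any `window`-base piece of seq occurs in an accepted sequence."""
    for i in range(len(seq) - window + 1):
        piece = seq[i:i + window]
        if any(piece in other for other in library):
            return True
    return False


def generate_library(count, length, seed=0, max_tries=100000):
    """Rejection sampling: keep drawing random strands until `count` pass."""
    rng = random.Random(seed)
    library = []
    for _ in range(max_tries):
        if len(library) == count:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if has_long_run(candidate) or shares_window(candidate, library):
            continue  # violates a constraint, so generate another sequence
        library.append(candidate)
    return library
```

A fuller checker would also compare candidates against reverse complements and apply the thermodynamic constraints; those parts are omitted here to keep the sketch short.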

Therefore, taking the Chinese postman problem in Figure 1 as an example, the program generates random sequences to form the required strands, with the lengths set in Section 3.2, as shown in Table 2. Table 3 lists the DNA node sequences produced by Braich’s method. In Table 4, the enthalpy, entropy, and free energy of the binding of each probe to the corresponding region in the chains are calculated, together with their average deviation and standard deviation. The DNA solution strands for the Chinese postman problem are then found and are listed in Table 5.
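As a reading aid for the thermodynamic quantities, enthalpy, entropy, and free energy are linked by the standard Gibbs relation. The relation below is general thermodynamics; that Table 4 was computed with exactly this convention, and at which temperature T, is an assumption here.

```latex
\[
  \Delta G = \Delta H - T\,\Delta S
\]
% \Delta G: free energy of probe binding, \Delta H: enthalpy,
% \Delta S: entropy, T: absolute temperature in kelvin (assumed symbol).
```

A more negative free energy indicates more stable binding, so probes with similar values across the library are expected to hybridize with comparable strength.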

6. Conclusions

Traditional silicon-based electronic computers can handle simple types of computing problems, but they cannot handle complex big data problems efficiently because of the limits of chip processing ability and other factors. DNA molecules not only have the natural characteristics of specific hybridization and miniaturization but also offer high-density information storage and powerful parallel computing, which gives DNA computing natural advantages in dealing with difficult problems. In this paper, an algorithm based on the Adleman–Lipton model is proposed to solve the Chinese postman problem. First, the DNA algorithm has the advantages of simple interpretation, simple coding, and high precision. Second, the computational complexity of the algorithm increases linearly with the scale of the problem. Owing to the good parallelism of DNA computing, this method solves the Chinese postman problem at the time complexity level established in Section 4, which is a considerable improvement over previous algorithms. For example, Edmonds [10] and Johnson et al. [53] give a minimum-weight perfect matching algorithm for the CPP. Holmberg [54] proposed a heuristic idea and improved it with the Frederickson method; the method has the same time complexity. Kundeti et al. [55] showed that the cyclic CPP on bidirected graphs can be solved without reducing it to bidirected flow. Gutin et al. [56] prove that the MCPP parameterized by the number of arcs is also fixed-parameter tractable and give a running-time bound in terms of the number of vertices. Nilofer and Rizwanullah [57] propose a combination of a minimum matching algorithm, which creates an Euler walk, and Dijkstra's algorithm, which calculates shortest paths; the computational complexity of the combination lies between two bounds. The time complexities of these methods and the comparison results are shown in Table 6.
The comparison shows that, when solving large-scale Chinese postman problems, differences in computational complexity between algorithms have a great impact on computational efficiency. The difference becomes clearer as the problem size increases. Therefore, improving computational efficiency clearly benefits the solution of large-scale problems.

Moreover, our algorithm uses DNA as the basic element and unit of calculation, which gives it low energy consumption and large storage capacity compared with other algorithms. As DNA molecules are very tiny, a cubic meter of DNA solution can store nearly one trillion binary data. In addition, the energy consumption required for the reaction is very low: to complete the same computing operation, the energy consumption is one billionth of that of supercomputers. It is conceivable that, as the scale of the problem expands, there will still be enough molecules to represent the element information in the problem. At the same time, the parallel chemical reaction of molecules has obvious advantages over the serial operation of computers. The number of parallel operations of DNA strands can reach 10 to the 14th power per second. Of course, these advantages depend on the further development of DNA molecular manipulation technology. As a result, although DNA computing has an advantage in the computing time of the molecular reactions, the preparation before running the algorithm and the preparation between experiments are very time-consuming. From the perspective of development, however, as DNA operation technology matures, the advantages of high parallelism, huge storage capacity, and low energy consumption of DNA computing will be brought into play. At present, DNA computing algorithms are mainly used to solve NP problems with exponentially many feasible solutions. Especially in the coming era of big data, the scale of the problems to be solved, the amount of data, and the number of feasible solutions will be huge. In such cases, ordinary computing algorithms are inadequate, whereas DNA computing can bring its own strengths to bear. In addition, the effective solution of this problem has reference significance and application value for resource planning.

However, our algorithm also has some shortcomings. It is compared only with classical algorithms, so the conclusions of the comparison are relatively limited. The reason is that the main purpose of this paper is to show the characteristics and advantages of the algorithm in reducing computational complexity, so we compare it only with previous studies that explicitly state their computational complexity. The vast majority of studies mainly report the efficiency of their algorithms, while fewer give a clear analysis of algorithm complexity. Therefore, the set of studies available for comparison is limited. On the other hand, a DNA computing algorithm produces its output mainly through chemical reactions of biological DNA molecules. Because computer programs are executed sequentially, they cannot realize the parallel chemical reactions of molecules in the DNA algorithm. Therefore, the Python program designed for our simulation analysis can only reproduce the results of the biological experiment we designed. Compared with other algorithms (ant colony algorithm [33], particle swarm optimization algorithm [34], collaborative optimization algorithm [35], and genetic algorithm [36]), the simulated DNA algorithm does not perform ideally because of its different computing mechanism. For this reason, most past papers on DNA computing explained the feasibility of the algorithm from a theoretical point of view [40, 43–45, 47, 48]. These limitations are also directions for our future research, and we hope that research in this area can make a breakthrough.

Because the various DNA operations used in the algorithm model can already be performed in the laboratory, the method can be put into practical application. DNA computing has broadened people’s vision, accelerated the development of bioinformatics, raised corresponding problems, and posed challenges to many fields. We believe that more efficient DNA algorithms can be used to solve more problems in future research.

Data Availability

All data included in this study are available upon request by contact with the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The project was supported by the National Natural Science Foundation of China (Grant no. 11701363). It was also funded by the Research Start-up Project of Wenzhou Business School (RC202002).