Mathematical Problems in Engineering

Volume 2017, Article ID 9053238, 12 pages

https://doi.org/10.1155/2017/9053238

## Social Network Community Detection for DMA Creation: Criteria Analysis through Multilevel Optimization

^{1}Laboratory of Computational Hydraulics (LHC), Unicamp, Av. Albert Einstein 951, Campinas, SP, Brazil

^{2}Berliner Wasserbetriebe and Amtsgericht Charlottenburg, HRA 30951 B, 10864 Berlin, Germany

^{3}FluIng, Universitat Politècnica de València, Camino de Vera, S/N (Edificio 5C-Bajo), 46022 Valencia, Spain

Correspondence should be addressed to Bruno M. Brentan; brunocivil08@gmail.com

Received 18 October 2016; Revised 18 January 2017; Accepted 26 January 2017; Published 20 February 2017

Academic Editor: Tamas Kalmar-Nagy

Copyright © 2017 Bruno M. Brentan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Management of large water distribution systems can be improved by dividing their networks into so-called district metered areas (DMAs). However, such divisions must be based on appropriate technical criteria. Considering the importance of deeply understanding the relationship between DMA creation and these criteria, this work proposes a performance analysis of DMA generation that takes into account such indicators as resilience index, demand similarity, pressure uniformity, water age (and thus water quality), solution implementation costs, and electrical consumption. To cope with the complexity of the problem, suitable mathematical techniques are proposed in this paper. We use a social community detection technique to define the sectors, and then a multilevel particle swarm optimization approach is applied to find the optimal placement and operating point of the necessary devices. The results obtained by implementing the methodology in a real water supply network show its validity and the meaningful influence on the final result of, especially, elevation and pipe length.

#### 1. Introduction

The frequent disorderly spatial expansion of cities, mainly in developing countries, compels water utilities to rethink their management practice, aiming at highly efficient systems. Optimal management of water distribution systems (WDSs) requires accurate decisions to reduce the waste of environmental resources and to supply consumers with high quality water. Usually, these decisions are made under uncertainty scenarios, mainly due to the size of the network. The segregation of large networks into nearly independent water supply zones can reduce the uncertainty of the problem, thus allowing smarter operation to get better service conditions [1, 2].

District metered area (DMA) design was introduced in the UK [3] and has been widely applied to pressure management and leakage control [4–8]. Also, the identification of entrance pipes and consequent measurement of inlet flow allow improving the water balance, helping to identify leakage and nonrevenue water. However, network segmentation can be a hard task due to various network characteristics: size, number of loops, topology changes, and necessary modifications of the hydraulic conditions during operation. DMA creation requires not only a perfect knowledge of the topological information of the network, but also a set of criteria able to generate a consistent network partition. Depending on the criteria adopted and the combination of them, different topologies can be found that can improve (or worsen!) the efficiency of the network.

The use of trial and error methodologies, which do not consider global perspectives of WDSs, can result in nonoptimal solutions, thus reducing the possibilities of performance improvement through DMA creation. Various developments of automatic tools to help the process of water network partition have been proposed. Among others, [9] presents a graph theory approach to water distribution network decomposition; [10] applies a method based on machine learning for DMA design; [11] proposes the use of a multiagent based approach to negotiate boundaries in DMA generation; [12] develops an automatic boundary generation to determine DMAs based on a social structure, a tool from the artificial intelligence field; and [13] presents an insightful comparison among global clustering, community structure, and graph partition methodologies applied to two big cities; in this last case, the authors argue positively about the high performance of community structures for DMA design from the computational and clustering viewpoints; however, hydraulic and quality analyses are left out.

Graph clustering and, more specifically, the use of an unknown number of subdivisions, as in the proposal based on social network theory [14], have proven to be a good approach for the sectorization problem. Physical and hydraulic features such as the shortest distance to the water source, node elevation, and cumulated demand are easily used as criteria to divide the distribution network. However, since the number of DMAs is generally not defined a priori, the performance of each scenario is affected by the DMA design criteria used, which can generate very diverse partitions.

While earlier approaches to DMA design had pressure management and leakage control as the main objectives, nowadays multiple criteria have been adopted to generate resilient and high-performance networks in terms of hydroenergetic issues, as observed in [15], where the authors present a graph-theoretical approach for the resilience assessment of large scale WDSs.

More than grouping the nodes inside their respective DMAs, a complete sectorization process must include the selection of an optimal entry choice, that is to say, an optimal Control Unit (CU) placement, with at least one flowmeter to permanently monitor the inlet flow [14]. If necessary, pressure reducing valves (PRVs) may also be installed at the entries (also in other points) of the DMAs for pressure regulation purposes. The maximization of a reliability indicator [6] and the minimization of costs are the most common approaches to find the optimal PRV placement. Also, [16] develops an approach for optimal DMA definition considering the possibility of generating energy by using turbines; graph theory and clustering techniques are used, jointly with Simulated Annealing, to identify the optimal entry and pipe replacement.

Furthermore, water distribution network segmentation implies the closure of some pipes. The consequent reduction of loops in the network can affect directly the reliability of the network [6]. In this way, a suitable balanced scenario among the benefits associated with DMA creation should be considered (or developed), where costs, reliability, and efficiency are jointly taken into account. The evaluation of DMA scenarios can be an important design tool and can help the decision-maker to choose an optimal, hopefully the best, option of DMA configuration.

Reference [17] presents a set of feasible solutions of DMA design and evaluates the solution in terms of the resilience index [18], the number of closed pipes, and the water age. The authors conclude that sectorization can produce a small decrease of some performance indicators, but it is insignificant when compared with the benefits of DMA implementation.

Taking into account the variability of feasible solutions for different nodal aggregation criteria for the problem of DMA design, social network community detection algorithms, grounded in graph theory, are used in this paper to define a set of scenarios of DMA configurations. This approach allows evaluating the effect of DMA creation considering the total or partial isolation of the network and even considering the presence of cascading DMAs. To evaluate the performance of a DMA configuration, various values for the criteria are applied. At a second level, particle swarm optimization (PSO) is applied to determine the optimal placement and operating points of PRVs by considering all the boundary pipes as possible candidates to become DMA entrances. The results are evaluated in terms of resilience index, pressure uniformity, demand similarity, water age, electrical consumption, and implementation costs.

#### 2. Materials and Methods

##### 2.1. DMA Definition by a Community Detection Algorithm

Virtual social networks can be described as specialized graphs aimed at describing the interaction of elements that form part of a society, holding some degree of interdependence among them. In this context, an individual is an entity that generates a contribution to the society (network). One of the aspects of major interest in social network science corresponds to community detection, which allows understanding the organization and function of individuals in the network. Unlike traditional clustering in graphs, community detection is not solely focused on individuals, or nodes (and their features). It also takes into account the connection between them, so that the resulting communities are formed not only by individuals, but also by their interactions. One of the most widely known community detection algorithms is the Walktrap algorithm, proposed by Pons and Latapy [19]. This algorithm is based on random walks over graphs.

In the field of mathematics and probability theory, random walks (or diffusion processes) are defined as stochastic processes in which the position of a particle (walker) at a given instant depends on its previous position and a random variable that determines the direction taken from that previous position towards one of the neighboring nodes. Random paths in a (globally sparse but locally dense) graph tend to get trapped in densely connected parts, which correspond to communities [19]. A random walk in such a graph is a Markov chain that can be described by the information contained in the so-called transition matrix, $P$. Element $P_{ij}$ of $P$, giving the transition probability from vertex $i$ to vertex $j$, is calculated as the ratio $P_{ij} = A_{ij}/d(i)$, where $A_{ij}$ is the $(i, j)$ element of the adjacency matrix, $A$, of the graph, and $d(i)$ is the degree of vertex $i$, that is to say, the number of its neighbors including itself. The $t$th power of this matrix, $P^t$, with elements noted as $P^t_{ij}$, gives the probabilities of moving from one node $i$ to another node $j$ through a path of length $t$. As shown in [19], these probabilities are enough to gather information on the topology of the network.
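The construction of the transition matrix and of its powers can be illustrated with a minimal sketch in plain Python. The 4-node path graph and the function names below are assumptions made for illustration only; every node carries a self-loop so that it counts itself among its neighbors, as in the definition of the degree given above.

```python
def transition_matrix(adj):
    """Row-normalize an adjacency matrix: P[i][j] = A[i][j] / d(i)."""
    return [[a / sum(row) for a in row] for row in adj]


def mat_mul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(p, t):
    """P^t by repeated multiplication (t >= 1)."""
    result = p
    for _ in range(t - 1):
        result = mat_mul(result, p)
    return result


# Hypothetical path graph 0-1-2-3; the diagonal ones are the self-loops.
A = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
P = transition_matrix(A)
P3 = mat_power(P, 3)  # P3[i][j]: probability of going from i to j in 3 steps
```

Each row of `P3` is a probability distribution over the nodes reached after three steps, which is the information the distances of the algorithm are built on.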

In this algorithm, distances between nodes and between communities (see (1) and (2), resp.) based on matrix $P^t$ are computed by

$$r_{ij} = \sqrt{\sum_{k=1}^{n} \frac{\left(P^t_{ik} - P^t_{jk}\right)^2}{d(k)}}, \tag{1}$$

$$r_{C_1 C_2} = \sqrt{\sum_{k=1}^{n} \frac{\left(P^t_{C_1 k} - P^t_{C_2 k}\right)^2}{d(k)}}, \tag{2}$$

where $i$ and $j$ are two given nodes, $d(k)$ is the degree of node $k$, and $n$ is the total number of nodes of the network. Equation (2) is actually a generalization of (1) to compute the distance between communities ($C_1$ and $C_2$ represent two different communities). For this last calculation, random walks go between randomly chosen nodes in both communities. Note that, in particular, any of those communities may be reduced to just one node, which provides the distance between that node and a community.
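As a hedged illustration of these distances, the sketch below follows the definitions in Pons and Latapy [19]: the distance between two nodes compares their rows of the $t$-step probability matrix, weighting each squared difference by the inverse degree, and the row of a community is taken as the average of the rows of its members (a walk started from a uniformly chosen member). The input matrix, the degrees, and the function names are assumptions chosen for the example.

```python
import math


def node_distance(pt, degree, i, j):
    """Distance between nodes i and j from the rows of the t-step matrix pt."""
    return math.sqrt(sum((pt[i][k] - pt[j][k]) ** 2 / degree[k]
                         for k in range(len(degree))))


def community_row(pt, community):
    """Average row of pt over the members of a community."""
    n = len(pt)
    return [sum(pt[i][k] for i in community) / len(community) for k in range(n)]


def community_distance(pt, degree, c1, c2):
    """Distance between two communities given as lists of node indices."""
    row1, row2 = community_row(pt, c1), community_row(pt, c2)
    return math.sqrt(sum((row1[k] - row2[k]) ** 2 / degree[k]
                         for k in range(len(degree))))


# Hypothetical one-step matrix (t = 1) of a path graph 0-1-2-3 with self-loops.
pt = [
    [1 / 2, 1 / 2, 0, 0],
    [1 / 3, 1 / 3, 1 / 3, 0],
    [0, 1 / 3, 1 / 3, 1 / 3],
    [0, 0, 1 / 2, 1 / 2],
]
degree = [2, 3, 3, 2]

near = node_distance(pt, degree, 0, 1)
far = node_distance(pt, degree, 0, 3)
```

In this toy graph the two end nodes are farther apart than two adjacent nodes, and a community reduced to one node gives back the node distance.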

These distances are compiled in the so-called dissimilarity matrix of the graph, which is used to feed a clustering process. The dissimilarity matrix of a graph is, thus, the square matrix with elements $r_{ij}$ that gives the distance between every pair of nodes $i$ and $j$ of the graph. When using the hierarchical agglomerative approach based on Ward’s method [20], one needs criteria to determine which communities to merge. In this method, the average of the squared distances between each node and its community is defined as an objective function, and the goal is its minimization:

$$\sigma_k = \frac{1}{n} \sum_{C \in \mathcal{P}_k} \sum_{i \in C} r_{iC}^2, \tag{3}$$

where $\mathcal{P}_k$ corresponds to a given partition.

Two aspects are worth noting here. Firstly, this function only depends on each community, and its minimization does not require information on other communities. Secondly, the method follows a “greedy” strategy: for each pair of adjacent sectors that could be merged, the variation of $\sigma$, $\Delta\sigma$, is calculated, and the fusion that leads to the lowest value of $\Delta\sigma$ is selected as the new partition. This produces a hierarchy of partitions at different levels (dendrogram levels). According to [19], from this set of partitions, the one best adapted to some specified requirements can be selected. One of the outcomes of the method is the partition that generates the maximum value of the so-called modularity index, which measures the quality of the subdivision of the network into communities. Such a partition is the one that best reflects the modular structure (blocks of lower/medium diameters separated by bigger diameter pipes) of a WDS.
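The modularity index used to pick a level of the dendrogram can be sketched as follows. The formula is the standard unweighted modularity (fraction of edges inside each community minus the squared fraction of edge ends attached to it); the source does not state which variant is used, so this form, the toy graph, and the function name are assumptions.

```python
def modularity(edges, membership):
    """Unweighted modularity of a partition.

    edges: list of (u, v) pairs; membership[u] is the community label of node u.
    """
    m = len(edges)
    internal = {}    # number of edges with both ends inside each community
    degree_sum = {}  # number of edge ends attached to each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical network: two triangles joined by a single pipe (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_sectors = [0, 0, 0, 1, 1, 1]
one_sector = [0, 0, 0, 0, 0, 0]

q_two = modularity(edges, two_sectors)
q_one = modularity(edges, one_sector)
```

Splitting the toy network at the single connecting pipe scores higher than leaving it as one block, which is the behavior exploited when selecting the partition of maximum modularity.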

Let us observe here that the best partition generated by the algorithm can produce extremely small communities, whose implementation could be economically unfeasible. This is the reason why a recursive merging process (see pseudocode) is proposed in this paper, to ensure that all the sectors comply with a series of preestablished constraints.

In the following pseudocode, this notation is used: $B$ (set of boundary pipes) and $K$ (set of candidate pipes) represent sets of pipes; index $p$ represents a pipe; the end nodes of $p$ are represented by $n_1(p)$, the initial node, and $n_2(p)$, the final node; $C$ represents a community or sector; $L$ refers to the characteristic used as a criterion (sector total length, sector total demand, sector maximum elevation, etc.); and $\oplus$ represents an operation whose arguments are two values of $L$; the operation (sum, maximum, etc.) depends on the meaning of $L$.

*Pseudocode*

Input: a partition of the network with maximized modularity index

1. A value of $L$ is calculated for every sector (community) $C$ in the partition.
2. For every pipe $p$, if $n_1(p) \in C_a$ and $n_2(p) \in C_b$ with $C_a \neq C_b$, then $p$ is added to $B$.
3. From $B$, select the pipes whose end nodes belong to communities that meet a specified constraint for the considered characteristic $L$; build $K$ with those pipes.
4. For every $p \in K$, let $C_a$ and $C_b$ be the sectors such that $n_1(p) \in C_a$ and $n_2(p) \in C_b$; if $L(C_a) \oplus L(C_b)$ meets the constraint, replace $C_a$ and $C_b$ with $C_a \cup C_b$ in the partition.
5. The characteristic $L$ of every $C$ in the new partition is recalculated.
6. Steps 2 to 5 are repeated until no more pipes enter $K$.

Output: sectors satisfying a series of constraints

The essence of the process is in Step 4, which states that two sectors are merged only if their union produces a new sector that meets the constraint of interest on $L$. Note that only one characteristic can be used as a merging criterion.

It is also important to set a lower limit for the characteristic used to define the size of the sectors. So if at the end of the process there are some sectors with a value lower than the limit, they are declared as* minisectors* (with no valves or CUs). This only applies to minisectors that cannot be merged with larger sectors. In the case that a minisector shares at least one connection with a sector that has reached its maximum feature value, the maximum limit is slightly relaxed to allow their fusion. For example, if the characteristic that is used as a criterion is the sector pipe length (e.g., 30 km maximum constraint), the maximum final length that a given sector eventually will have will equal that maximum length (30 km) plus a value that is smaller than the minimum value of any other sector pipe length (e.g., 4 km); in other words, a value between 30 km and 34 km may also be accepted.
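The recursive merging process can be sketched in Python as below. This is a simplified reading of the pseudocode, not the authors' implementation: the candidate filter is reduced to a single upper bound on the combined characteristic, the minisector relaxation described above is left out, and the names `merge_sectors`, `combine`, and `max_value` are hypothetical.

```python
def merge_sectors(membership, pipes, sector_value, combine, max_value):
    """Greedily merge adjacent sectors while the combined value stays within bound.

    membership: node -> sector label; pipes: list of (node_a, node_b);
    sector_value: sector label -> value of the chosen characteristic;
    combine: operation on two values (sum, max, ...); max_value: upper bound.
    """
    membership = dict(membership)
    sector_value = dict(sector_value)
    merged = True
    while merged:  # repeat until no boundary pipe allows a merge
        merged = False
        # Boundary pipes: end nodes lie in different sectors.
        boundary = [(a, b) for a, b in pipes if membership[a] != membership[b]]
        for a, b in boundary:
            ca, cb = membership[a], membership[b]
            if ca == cb:  # already merged earlier in this pass
                continue
            new_value = combine(sector_value[ca], sector_value[cb])
            if new_value <= max_value:
                for node, sector in list(membership.items()):
                    if sector == cb:
                        membership[node] = ca
                sector_value[ca] = new_value  # recalculate the characteristic
                del sector_value[cb]
                merged = True
    return membership, sector_value


# Hypothetical example: three sectors in a row, pipe length (km) as characteristic.
nodes = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
pipes = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
lengths = {"A": 10.0, "B": 15.0, "C": 12.0}

final_nodes, final_lengths = merge_sectors(
    nodes, pipes, lengths, combine=lambda x, y: x + y, max_value=30.0)
```

With a 30 km bound, sectors A and B merge (25 km), while adding C would give 37 km and is rejected, so two sectors remain.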

##### 2.2. Optimization Procedure

###### 2.2.1. Optimization Problem Description

Graph theory is useful to describe various problems related to water distribution networks. Following the formulation in [21], let $G = (V, E)$ be a graph, where the vertices in $V$ represent the nodes of the network and the edges in $E$ represent the pipes and other link elements. Once the number of DMAs is defined and all the nodes are classified, it is possible to identify a set of boundary pipes $B \subseteq E$, whose elements (pipes or links) have different DMAs for their upstream and downstream nodes. Any of those pipes can be the entrance for a DMA or should be closed to generate an effective isolation. Temporarily, let us assume that a valve or control device is installed in each pipe $i \in B$, with diameter $D_i$.

The choice of which pipe will be an entrance and which pipe will be closed can be translated into an optimization process, once the cost of the control device is linked to the diameter of the pipe. The cost associated with the control devices can be written as a function of diameter as

$$C_T = \sum_{i=1}^{N_b} C(D_i), \tag{4}$$

where $D_i$ is the diameter of the control device installed in the boundary pipe $i$, with a cost $C(D_i)$, for a DMA scenario with a number $N_b$ of potential control devices.

The decision variables in this optimization level are the existence or not of control devices in each boundary pipe. This can be stated as a binary optimization problem, where the value 1 represents the existence of a control device, and 0 represents the closure of the boundary pipe.
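The binary encoding can be made concrete with a short sketch: each boundary pipe gets a 0/1 decision, and only the pipes marked 1 contribute a device cost. The linear cost law and the diameters below are placeholders invented for the example; the source does not give the actual cost function.

```python
def placement_cost(decision, diameters, device_cost):
    """Total cost of the control devices kept in a binary placement vector.

    decision[i] == 1: a control device is installed in boundary pipe i;
    decision[i] == 0: boundary pipe i is closed.
    """
    return sum(device_cost(d) for x, d in zip(decision, diameters) if x == 1)


def hypothetical_cost(diameter_mm):
    """Placeholder cost law (monetary units), assumed for illustration."""
    return 10.0 * diameter_mm


diameters = [100, 150, 200, 300]  # boundary pipe diameters in mm (made up)
decision = [1, 0, 0, 1]           # devices in pipes 0 and 3, pipes 1 and 2 closed

cost = placement_cost(decision, diameters, hypothetical_cost)
```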

Furthermore, the use of PRVs as control devices at DMA entrances can improve the pressure management and, consequently, reduce leakage inside the DMA. This is possible because the operating point of a PRV, also called set point, which corresponds to the pressure of the upstream node of the valve, can be defined according to the minimum pressure required in the DMA. In this sense, the choice of the PRV set point can also be written as an optimization problem, trying to operate the water distribution network with as low a pressure as possible. Pressure uniformity can be a useful indicator to find the optimal set point. Reference [22] uses this criterion to achieve network optimal designs. In this paper, as proposed in [23], this indicator is calculated by

$$PU = \sum_{t=1}^{T} \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{P_{i,t} - P^{\mathrm{req}}_{i,t}}{P^{\mathrm{req}}_{i,t}} + \frac{1}{P_{\mathrm{av},t}} \sqrt{\frac{\sum_{i=1}^{N} \left(P_{i,t} - P_{\mathrm{av},t}\right)^2}{N}} \right], \tag{5}$$

where $T$ is the simulation period, $N$ is the number of demand nodes in the network, $P_{i,t}$ is the pressure at node $i$ and time $t$, $P^{\mathrm{req}}_{i,t}$ is the pressure required at node $i$ and time $t$, and $P_{\mathrm{av},t}$ is the network average pressure at time $t$. It should be observed, to avoid confusion, that the PU indicator does not measure “pressure uniformity,” as its name suggests. It is, in fact, a general measure of the lack of uniformity of the distribution network pressures relative to the desired pressure. So the minimization of (5) is the objective function for the set point optimization level, since this improves pressure uniformity, thus allowing the network to operate near the minimum required pressure and without large pressure differences in the network.
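A minimal sketch of how such an indicator could be evaluated from simulated pressures follows. It assumes a commonly used form of the indicator: for each time step, the mean relative excess of pressure over the required pressure plus the standard deviation of the nodal pressures divided by their mean, summed over the simulation period. That form, the function name, and the data layout are assumptions; the pressures are made up.

```python
import math


def pressure_uniformity(pressure, required):
    """Lack-of-uniformity indicator summed over all time steps.

    pressure[t][i] and required[t][i]: simulated and required pressure
    at demand node i and time step t (same units).
    """
    total = 0.0
    for p_t, req_t in zip(pressure, required):
        n = len(p_t)
        average = sum(p_t) / n
        # Mean relative excess over the required pressure.
        excess = sum((p - r) / r for p, r in zip(p_t, req_t)) / n
        # Spread of nodal pressures around the network average.
        spread = math.sqrt(sum((p - average) ** 2 for p in p_t) / n) / average
        total += excess + spread
    return total


# Two time steps, two demand nodes (hypothetical values in m of water column).
required = [[20.0, 20.0], [20.0, 20.0]]
ideal = [[20.0, 20.0], [20.0, 20.0]]
oversupplied = [[30.0, 30.0], [30.0, 30.0]]

pu_ideal = pressure_uniformity(ideal, required)
pu_over = pressure_uniformity(oversupplied, required)
```

A network running exactly at the required pressure scores zero, while a uniform 50% excess at every node adds 0.5 per time step, so lower values are better.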

###### 2.2.2. Optimization Algorithm and Constraints

Reference [21] presents a clear proof that valve placement in water distribution systems is an NP-hard problem when the network is not a line, a loop, or a tree. To solve NP-hard problems, [24] suggests the use of heuristic algorithms, which can help in the treatment of these problems.

The use of bioinspired algorithms to solve water distribution problems has been a common and successful approach [25–29], mainly because of the easy implementation and near-global optimal solution ability exhibited by those algorithms, which do not need the calculation of Jacobian or Hessian matrices.

Particle swarm optimization (PSO) [30] has been widely applied to hydraulic problems such as optimal design of water distribution networks [31, 32], optimal design of wastewater networks [33], calibration of water supply networks [34], and optimal pump operation [35], and is applied in this work to the optimal PRV placement and set point definition. A PSO swarm consists of a set of particles, which have two associated vectors: position and velocity. Usually, each vector starts randomly inside a defined range. The position vector is interpreted as a solution to the problem and allows the objective function evaluation. Position and velocity are iteratively updated, according to the following equations:

$$X_i^{k+1} = X_i^{k} + V_i^{k+1}, \tag{6}$$

$$V_i^{k+1} = \omega V_i^{k} + c_1 r_1 \left(P_i - X_i^{k}\right) + c_2 r_2 \left(G - X_i^{k}\right), \tag{7}$$

where $X_i^{k+1}$ is the position of particle $i$ at iteration $k+1$, updated using the velocity $V_i^{k+1}$.

Velocity updating is a combination of

(i) the last velocity value, weighted by the inertia parameter $\omega$, to avoid excessive particle roaming;

(ii) the difference between the best position of particle $i$, $P_i$, and its current position, weighted by the cognitive parameter $c_1$;

(iii) finally, the difference between the position of the swarm leader, $G$, and the current position of the particle, weighted by the social parameter $c_2$.

The random numbers $r_1$ and $r_2$, working as particle scatters, avoid premature convergence to local optimal points. The last two terms in (7) are responsible for the convergence of the method because they attract particles to the best individual point a particle has visited and the best global point found by the entire swarm.
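The update rules can be turned into a compact, generic PSO for continuous problems, shown below as a sketch. The parameter values (inertia 0.7, cognitive and social parameters 1.5), the swarm size, the seed, and the test function are assumptions for illustration; the source does not report the settings used, nor how the binary placement variables are encoded.

```python
import random


def update_particle(x, v, pbest, gbest, w, c1, c2, r1, r2):
    """One velocity update followed by one position update for a particle."""
    new_v = [w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    new_x = [xi + nv for xi, nv in zip(x, new_v)]
    return new_x, new_v


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over dim variables, starting inside bounds."""
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.1 * rng.uniform(lo - hi, hi - lo) for _ in range(dim)]
          for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [objective(x) for x in xs]
    leader = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[leader][:], pbest_val[leader]
    for _ in range(n_iter):
        for i in range(n_particles):
            xs[i], vs[i] = update_particle(
                xs[i], vs[i], pbest[i], gbest, w, c1, c2,
                rng.random(), rng.random())
            value = objective(xs[i])
            if value < pbest_val[i]:      # new personal best
                pbest[i], pbest_val[i] = xs[i][:], value
                if value < gbest_val:     # new swarm leader
                    gbest, gbest_val = xs[i][:], value
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


best_x, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
step_x, step_v = update_particle([1.0], [0.5], [0.0], [2.0],
                                 0.5, 2.0, 2.0, 0.5, 0.5)
```

No derivative of the objective is needed at any point, which is the property that makes the method convenient when the objective is evaluated through a hydraulic simulation.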

However, most of these algorithms work with unconstrained problems and require penalty functions to treat real constraints. In this case, for valve placement, the minimum pressure is considered as a constraint, while for the valve set point, the constraints are both minimum pressure and tank levels. Two penalty terms are used: the first penalizes the violation of the required pressure; the second is applied when the final tank level is lower than the initial tank level, and is evaluated for each tank in the network. Combining the applicable penalties with the device cost gives the objective function for control device placement (10), and combining them with the PU indicator gives the objective function for set point adjustment.
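One simple way to build such penalized objectives is sketched below. The linear penalty shape, the weight of 1000, the additive combination, and all function names are assumptions made for illustration; the actual penalty expressions are not recoverable from the text.

```python
def pressure_penalty(pressures, required, weight=1000.0):
    """Penalty proportional to the total pressure deficit (assumed form)."""
    return weight * sum(max(0.0, r - p) for p, r in zip(pressures, required))


def tank_penalty(initial_levels, final_levels, weight=1000.0):
    """Penalty for every tank that ends below its initial level (assumed form)."""
    return weight * sum(max(0.0, h0 - hf)
                        for h0, hf in zip(initial_levels, final_levels))


def placement_objective(cost, pressures, required):
    """Device cost plus the minimum pressure penalty."""
    return cost + pressure_penalty(pressures, required)


def set_point_objective(pu, pressures, required, initial_levels, final_levels):
    """Uniformity indicator plus pressure and tank level penalties."""
    return (pu + pressure_penalty(pressures, required)
            + tank_penalty(initial_levels, final_levels))


# Hypothetical values: node 0 is 2 m below the required pressure,
# and the single tank ends 0.5 m below its initial level.
f_placement = placement_objective(500.0, [18.0, 25.0], [20.0, 20.0])
f_set_point = set_point_objective(1.2, [22.0, 25.0], [20.0, 20.0], [3.0], [2.5])
```

A feasible solution keeps its raw objective value, while any violation dominates the result, steering the swarm back towards the feasible region.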

###### 2.2.3. Multilevel Optimization for Valve Placement

Complex, real optimization problems typically need to be described and solved by multiobjective methods. While single objective approaches for real problems must contemplate the presence of constraints, multiobjective problems can treat these constraints as new objectives to be fulfilled. Furthermore, many processes involve a hierarchical decision process, so that the solution of a single optimization can be used to solve, partially, another process or determine some constraints of another process [36].

Although a multiobjective process can also solve multilevel problems, the open final solution given by the Pareto front leaves the final choice to the stakeholder. Multilevel optimization can therefore be useful for real, complex problems, since it eventually produces a single solution.

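A general multilevel optimization problem with decision variables $x_k$, objective functions $f_k$, and constraints $g_k$ can be described as follows.

Find the solution $x_1^*$ for the upper level problem

$$\min_{x_1} f_1\left(x_1, x_2^*, \ldots, x_n^*\right) \quad \text{subject to} \quad g_1\left(x_1, x_2^*, \ldots, x_n^*\right) \le 0,$$

where $x_2^*$ is the solution for the second-level problem

$$\min_{x_2} f_2\left(x_1, x_2, x_3^*, \ldots, x_n^*\right) \quad \text{subject to} \quad g_2\left(x_1, x_2, x_3^*, \ldots, x_n^*\right) \le 0,$$

and so on, until the lower level problem $n$, with solution $x_n^*$.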

In the case of optimal valve placement, an adaptation of the multilevel optimization concept is applied to reduce the complexity of the optimization process. Reference [35] presents two groups of objectives that are minimized in the DMA creation process based on a multiobjective approach. The first group corresponds to structural costs, related to the installation of valves, while the second group is related to hydraulic aspects, such as minimum pressure and maximum resilience.

In this line, the first level corresponds to the optimal boundary pipe selection, by minimizing (10). The solution of this level reduces the number of valves for the set points determined in the second level of the optimization. In this way, the treatment of the problem in two stages can lead to a faster solution, mainly for large networks.
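The two-stage idea can be sketched on a toy problem. Exhaustive search stands in for PSO at both levels here, purely to keep the example short and deterministic, and the feasibility check and the set point objective are hypothetical stand-ins for a hydraulic simulation.

```python
from itertools import product


def two_level_search(diameters, device_cost, is_feasible,
                     set_point_objective, set_point_grid):
    """Level 1 picks the cheapest feasible open/closed pattern of boundary
    pipes; level 2 tunes set points only for the pipes left open."""
    best_pattern, best_cost = None, float("inf")
    for pattern in product((0, 1), repeat=len(diameters)):
        if not is_feasible(pattern):
            continue
        cost = sum(device_cost(d) for x, d in zip(pattern, diameters) if x)
        if cost < best_cost:
            best_pattern, best_cost = pattern, cost
    if best_pattern is None:
        return None  # no feasible placement exists

    open_pipes = [i for i, x in enumerate(best_pattern) if x]
    best_points, best_value = None, float("inf")
    for points in product(set_point_grid, repeat=len(open_pipes)):
        candidate = dict(zip(open_pipes, points))
        value = set_point_objective(candidate)
        if value < best_value:
            best_points, best_value = candidate, value
    return best_pattern, best_cost, best_points, best_value


# Hypothetical toy data: the DMA is supplied only if pipe 1 or pipe 2 stays open,
# and the ideal set point of any open valve is 25 m.
diameters = [100, 150, 200]
result = two_level_search(
    diameters,
    device_cost=lambda d: float(d),
    is_feasible=lambda pattern: pattern[1] == 1 or pattern[2] == 1,
    set_point_objective=lambda sp: sum((p - 25.0) ** 2 for p in sp.values()),
    set_point_grid=[20.0, 25.0, 30.0],
)
```

Because the second level only sees the valves retained by the first, its search space shrinks from three set points to one in this example, which mirrors the reduction in problem size described above.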

Figure 1 presents a flowchart to illustrate the entire process of DMA creation and optimal location of control devices, as well as the optimal set point definition.