Abstract

The Ministry of Electricity and Water (MEW) in Kuwait, like other water authorities around the world, faces a difficult challenge in assessing the physical condition of its distribution systems. Since the majority of the mains are buried, the MEW must rely upon indirect methods, including analysis of repair records. This case study applies the techniques of survival analysis to Kuwait’s water distribution system in order to model pipe break failures. The proportional hazard model has the advantage of being able to separate the effects of component deterioration due to aging from the effects of site-specific causes of failure. Another desirable feature is its ability to analyze censored data. The sensitivity of the model parameters to sample size and percent censoring is examined through random sampling of the database. The results indicate that the proportional hazard model is suitable for describing failure rates of components.

1. Introduction

1.1. General

In the competition for scarce resources to repair and maintain our public works infrastructure, the use of risk and reliability analysis has been proposed for prioritizing maintenance and repair of constructed systems. One of the hindrances to applying the resulting models to real-life systems is the lack of adequate information on the failure rates of the components that comprise the system.

In the case of water distribution systems, plenty of information is available on failed components. File cabinets at hundreds of water utilities are bulging with records of failed pipes, joints, valves, and pumps. Unfortunately, this reservoir of information has not been efficiently tapped to estimate parameters that are useful for maintenance and repair decisions.

This situation may be attributed in part to two facts: (1) relevant data are reported only on the components that have failed, while information on those that are still functioning is neglected; and (2) the effect of component aging on failure is not separated from that caused by other explanatory variables.

Neglecting valuable information on unfailed components introduces a bias into the modeling of failure. Information on unfailed components may be taken into account by using the concepts of censored data analysis. In the second case, because the effect of aging, which is obviously time dependent, is lumped with other causes of pipe failure, which are usually location dependent (i.e., site specific), the failure models obtained have not been very successful. However, some newer models have shown some success in their applications [15]. The approach proposed here overcomes both drawbacks by using the proportional hazard model, originally proposed by Cox and other authors in [6–11]; this model has been used by Andreou et al. [12], Shamsi [13], and also in [14–16].

1.2. Background of Kuwait’s Water Distribution Networks

Kuwait is located on the northwestern shore of the Arabian Gulf. This location is categorised as an arid region, which is marked by very scarce rainfall, high temperatures and evaporation rates, and a lack of perennial surface waters. It covers an area of 18,000 km². The average annual rainfall ranges from 40 to 240 mm, and the total annual evaporation rate ranges from 2,500 mm in the coastal areas to more than 4,500 mm inland. Kuwait relies mainly on desalinated water from the Gulf for potable water. Currently, there are ten desalination plants in Kuwait with a total annual production of 159 × 10⁹ imperial gallons (IG) (723 × 10⁶ m³). In addition to distilled freshwater, Kuwait has a large supply of brackish groundwater. The Ministry of Electricity and Water (MEW) in Kuwait distributes the brackish water to the consumers through a separate network parallel to the freshwater network. This brackish groundwater is mainly used for irrigation, domestic purposes, and blending with distilled water [17, 18]. Fresh and brackish water are stored in reservoirs and elevated towers. For freshwater, the storage capacities are as follows: (a) 2.2 × 10⁹ IG (10 × 10⁶ m³) for reservoirs operated by gravity, (b) 2.1 × 10⁹ IG (9.55 × 10⁶ m³) for reservoirs operated by pumps, and (c) 55.2 × 10⁶ IG (250,944 m³) for elevated towers. For brackish water, the storage capacities are as follows: (a) 498 × 10⁶ IG (2.2 × 10⁶ m³) for reservoirs operated by gravity, (b) 41 × 10⁶ IG (186,400 m³) for reservoirs operated by pumps, and (c) 10 × 10⁶ IG (45,460 m³) for elevated towers. All ground reservoirs are operated by gravity and are built on elevated ground. In addition, freshwater reservoirs are sterilized continuously.

The water distribution network pipes are mainly made of ductile iron, with a few steel-coated pipes. The diameter of the pipes is 3/4, 1, and 2 inches for household dwellings, commercial, and industrial establishments, respectively. The total freshwater consumption has increased from 3 × 10⁶ m³ in the year 1957 to 160 × 10⁹ m³ in the year 2018, which represents a roughly 53,000-fold increase in about 60 years. Tremendous changes have occurred in the freshwater distribution network over the years, from just 112 kilometers in length in 1959 to over 9,800 kilometers in 2018. Similarly, the brackish water distribution network has grown from 90 kilometers in length in 1959 to over 8,100 kilometers in 2018. Moreover, the number of consumers connected to the freshwater network totaled 177,118 and the number connected to the brackish water network totaled 77,257 by the end of 2018. There were 1,416 and 183 pipe breaks in the freshwater and brackish water networks, respectively. Figure 1 shows the development of both the fresh and brackish water distribution networks in Kuwait from 1959 to 2018 [18, 19].

Figure 2 displays the total number of pipe breaks for both the fresh and brackish water networks from 1999 to 2018. The number of freshwater pipe breaks is far higher than the number of brackish water pipe breaks, which is due to the higher daily use and consumption of freshwater. In addition, brackish water is available to customers only once a week.

2. Model Formulation and Discussion of Results

The Ministry of Electricity and Water (MEW), as well as other water authorities all over the world, is facing a difficult challenge in assessing the physical condition of its distribution systems. Since the majority of the mains are buried, the MEW must rely upon indirect methods, including analysis of repair records. The time to failure of a component is represented as a random variable, T > 0, with a probability density function f(t) and a distribution function F(t). The survival function S(t) and the hazard function or failure rate h(t) are defined as [20–22]

$$S(t) = P(T > t) = 1 - F(t), \qquad h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\ln S(t).$$

On integration, one obtains

$$\int_0^t h(u)\,du = -\ln S(t),$$

so that

$$S(t) = \exp\left(-\int_0^t h(u)\,du\right).$$

The failure time distribution of pipes in a water distribution system may therefore be investigated either through its survival function S(t) or its failure rate h(t). It is instructive to work with h(t) because its behavior can easily be examined in terms of commonly known probability density functions. For example, if T has an exponential density function, then

$$f(t) = \lambda e^{-\lambda t}, \qquad t > 0,$$

and it is easily seen that

$$S(t) = e^{-\lambda t}.$$

Thus, the failure rate h(t) = f(t)/S(t) = λ is a constant. Since, intuitively, we know that the failure rate tends to vary with time, we conclude that the exponential model is not a suitable failure density function.
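The constant failure rate of the exponential model can be checked numerically. The following is a minimal sketch, assuming the standard relations h(t) = f(t)/S(t) and S(t) = exp(−∫h(u)du); the rate `lam` and the helper names are hypothetical and serve only as an illustration.

```python
import math

def survival_from_hazard(hazard, t, steps=10_000):
    """Approximate S(t) = exp(-integral_0^t h(u) du) with the trapezoidal rule."""
    if t == 0:
        return 1.0
    du = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * du)
    return math.exp(-total * du)

lam = 0.05  # hypothetical failure rate per year, not taken from the study

def constant_hazard(u):
    """Hazard of the exponential model: the same value at every age."""
    return lam

def exp_density(t):
    """Exponential density f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

for age in (1.0, 10.0, 40.0):
    s_numeric = survival_from_hazard(constant_hazard, age)  # from the hazard
    s_closed = math.exp(-lam * age)                         # closed form
    h_at_age = exp_density(age) / s_closed                  # f(t) / S(t)
    print(age, round(s_numeric, 6), round(s_closed, 6), round(h_at_age, 6))
```

At every age the ratio f(t)/S(t) returns the same value `lam`, which is why the exponential model cannot represent a failure rate that changes as a pipe ages.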

We would also expect the failure rate to depend on the age of the component as well as other structural and environmental factors. For water mains, experience tells us that some of these factors include age, pipe size and material, internal pressure, and environmental conditions such as cover conditions and soil corrosivity. Studies have been made to relate the failure rate to these explanatory variables. For example, Shamir and Howard [23] used regression analysis to express the failure time as a function of other variables. Their results have been quite unsatisfactory with coefficients of determination (i.e., R2) usually less than 0.5.

One shortcoming of this type of analysis is that, by its nature, the data used in regressing failure time on the other explanatory variables cannot include information on those components that have not yet failed. This suggests that survival analysis, which includes censored data, may be a more productive technique for analysis. The proportional hazard model is examined in this paper. This model has a hazard function of the form

$$h(t; z) = h_0(t)\,\exp(zB),$$

where z is a row vector of explanatory variables, B is a column vector of regression parameters, and h0(t) is an unspecified baseline hazard function.

This model is attractive for the present problem because the effect of aging is separated from the effects of site-specific causes of failure. This model is proportional because of the multiplicative relation between h0 and the log-linear function of the explanatory variables. Because of this proportionality assumption, the ratio of the failure rates of two components with different sets of covariates does not depend on time.
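The time independence of the hazard ratio can be illustrated with a short sketch. It assumes the usual form of the proportional hazard model, h(t; z) = h0(t) exp(zB); the baseline function, the coefficients, and the two covariate vectors below are hypothetical values chosen only for illustration.

```python
import math

def baseline_hazard(t):
    """Hypothetical positive baseline h0(t); any positive function would do."""
    return 0.02 + 0.00005 * (t - 20.0) ** 2

def hazard(t, z, b):
    """Proportional hazard: h(t; z) = h0(t) * exp(z . b)."""
    return baseline_hazard(t) * math.exp(sum(zi * bi for zi, bi in zip(z, b)))

b = [0.4, -0.8]      # hypothetical regression coefficients
pipe_a = [1.5, 0.0]  # hypothetical covariates of one pipe
pipe_b = [0.5, 1.0]  # hypothetical covariates of another pipe

# Ratio of the two failure rates at several ages
ratios = [hazard(t, pipe_a, b) / hazard(t, pipe_b, b) for t in (1.0, 10.0, 30.0)]

# The baseline cancels, leaving exp((z_a - z_b) . b)
expected = math.exp(sum((za - zb) * bi for za, zb, bi in zip(pipe_a, pipe_b, b)))

print(ratios, expected)
```

Because h0(t) cancels in the ratio, the same value is obtained at every age, whatever shape the baseline takes.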

The proportional hazard model has been successfully used as a tool for survival analysis in the biomedical field and industrial life testing. Software for survival analysis using the model is available as part of the BMDP statistical package [24]. With an accessible database, the software permits the identification of the significant variables that should be included in the covariate vector. The package also permits the stratification of data in cases where the subsets of the database may follow different patterns.

A desirable feature of the analysis is that the estimation procedure can include information on both failed and unfailed components. It thus makes full use of the information. Survival analysis is in fact distinguished from other fields of statistics through this ability to analyze censored data.

The data used in this study were supplied by the Ministry of Electricity and Water (MEW) and consisted of information on 1,275 pipes in the water distribution system of Kuwait City. For each pipe, the information included year of installation, length of pipe, diameter, soil corrosivity, material, number and age at break events, internal pressure, cover index, and soil stability. Only 21 percent of the pipes had experienced one or more breaks. Of these, 14.5 percent had 2 or more breaks and 1.7 percent had been broken 3 or more times. The data are thus censored, since 79 percent of the pipes have not failed.
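One way to picture how such records become censored survival data is the following sketch. The `PipeRecord` type, the function name, the sample pipes, and the study end year of 2018 are all assumptions made for illustration; they are not taken from the MEW database.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PipeRecord:
    """Hypothetical minimal pipe record: installation year and ages at breaks."""
    install_year: int
    break_ages: List[float] = field(default_factory=list)

def first_break_observation(pipe: PipeRecord, study_end_year: int) -> Tuple[float, int]:
    """Return (duration, event).

    event = 1: a break was observed, duration is the age at the first break.
    event = 0: no break yet, duration is the age at the end of the study
               (a right-censored observation).
    """
    if pipe.break_ages:
        return min(pipe.break_ages), 1
    return float(study_end_year - pipe.install_year), 0

pipes = [
    PipeRecord(install_year=1965, break_ages=[12.0, 30.5]),
    PipeRecord(install_year=1980),
    PipeRecord(install_year=1995, break_ages=[8.0]),
    PipeRecord(install_year=2005),
]

observations = [first_break_observation(p, study_end_year=2018) for p in pipes]
censored_share = sum(1 for _, event in observations if event == 0) / len(observations)

print(observations)
print(censored_share)
```

A pipe that has never broken still contributes its age as a lower bound on its failure time, which is the information a regression on failed pipes alone would discard.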

Prior to determining the proportional hazard model for the data set, several analyses were made to determine which explanatory variables are sufficiently significant to be included in the model and to determine if subsets of the data exhibit different failure patterns from other subsets. Procedures for these are described in the BMDP User’s Manual. The technique is also discussed by the authors of [6, 25, 26], who performed similar preprocessing procedures earlier. The results of the prior analysis identified the following variables as significant:

LNLEN: the natural logarithm of the pipe length;

LOWPCT (in decimals): percentage of the pipe in low land use;

PRESS: pipe pressure in kPa if the pipe has not been broken before, or 0 otherwise;

PREBRK: previous break =  if the previous break equals 2, or 0 otherwise;

BEF70: a dummy variable = 1 if the pipe was installed during the years 1950–70, or 0 otherwise; and

AFT70: a dummy variable = 1 if the pipe was installed after 1970, or 0 otherwise.
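The variable definitions above can be sketched as a small covariate builder. This is only an illustration: the function and parameter names are hypothetical, the length unit is not stated in the source, and PREBRK is left out because its value is not recoverable from the text.

```python
import math

def build_covariates(length, low_land_use_fraction, pressure_kpa,
                     n_previous_breaks, install_year):
    """Build part of the covariate vector z for one pipe.

    PREBRK is intentionally omitted: its definition is incomplete in the text.
    """
    return {
        # natural logarithm of the pipe length
        "LNLEN": math.log(length),
        # share of the pipe in low land use, as a decimal
        "LOWPCT": low_land_use_fraction,
        # pressure counts only while the pipe has never been broken
        "PRESS": pressure_kpa if n_previous_breaks == 0 else 0.0,
        # installation-period dummies
        "BEF70": 1 if 1950 <= install_year <= 1970 else 0,
        "AFT70": 1 if install_year > 1970 else 0,
    }

# Hypothetical pipe, before and after its first break
unbroken = build_covariates(100.0, 0.25, 400.0, n_previous_breaks=0, install_year=1965)
broken = build_covariates(100.0, 0.25, 400.0, n_previous_breaks=1, install_year=1965)

print(unbroken)
print(broken)
```

Setting PRESS to 0 once a pipe has broken mirrors the definition given in the list, under which pressure enters the model only for pipes with no previous break.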

PRESS was defined as above because it was found that the effect of pressure was not significant after the first break. Similarly, BRKRAT was included as defined because prior analysis suggested that a pipe that breaks often early in its life is more prone to break again. BEF70 and AFT70 were included because stratification analysis indicated peculiar break patterns for these installation periods. This may be due to construction practices during these periods.

After the significant variables were identified, fitting the model entailed preparing the input file in the proper format to run the software. A summary of the results includes the value of the coefficient vector B and the baseline function:

These are tabulated in the last columns of Tables 1 and 2. The baseline function exhibits the well-known “bathtub” shape typical of many failure patterns.
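For readers without access to the software, the estimation step can be sketched in a few lines. The example below assumes that the coefficients are obtained by maximizing Cox's partial likelihood, which is the usual approach for the proportional hazard model; the source does not describe the internals of the BMDP routine, so this is a stand-in rather than a reproduction of it. The toy data, the single covariate, and the grid search are hypothetical.

```python
import math

def partial_log_likelihood(beta, durations, events, covariates):
    """Cox partial log-likelihood with Breslow handling of tied times.

    Each observed failure contributes z_i . beta - log(sum of exp(z_j . beta))
    over the risk set, i.e. all components whose duration is >= the failure time.
    """
    scores = [sum(b * z for b, z in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(durations, events)):
        if d_i == 0:
            continue  # censored components enter only through the risk sets
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(durations) if t_j >= t_i)
        total += scores[i] - math.log(risk)
    return total

# Hypothetical toy data: five components, one covariate
durations = [5.0, 8.0, 12.0, 20.0, 20.0]
events = [1, 1, 0, 1, 0]  # 1 = failure observed, 0 = censored
covariates = [[1.0], [0.0], [1.0], [0.0], [0.0]]

# Crude one-dimensional grid search; real software uses Newton-type iterations
grid = [i / 100.0 for i in range(-300, 301)]
beta_hat = max(grid, key=lambda b: partial_log_likelihood([b], durations, events, covariates))

ll_at_zero = partial_log_likelihood([0.0], durations, events, covariates)
print(beta_hat, ll_at_zero)
```

The baseline hazard never appears in this function, which is why the regression coefficients can be estimated while h0(t) is left unspecified and estimated separately afterwards.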

Another objective of the study was to investigate the effect of censoring on the hazard function h(t). To do this, several data sets were generated by sampling the given database. Through a random sampling procedure, additional databases were developed with values of N = 600, 900, and 1200, each having a different percentage of censored observations. These 9 simulated data sets were then analyzed as before to develop the hazard function and to examine how sensitive the regression coefficients are to sample size and percentage of censored observations. The results are important in deciding when to terminate a sampling program or how large a sample is needed. The results are shown in Tables 1 and 2. Figure 3 displays h0(t) for N = 800 and different percentages of censored data. Figure 4 shows h0(t) for different values of N with the same percentage of censored data. A further advantage of this study is that the most important parameters (regarding pipe failures and breaks) for maintenance and repair decisions are identified.
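The source does not give the details of the random sampling procedure. One plausible way to obtain a data set of a chosen size and censoring percentage is to sample failed and censored records separately, as in the sketch below. The function name, the mock database, and the censoring fractions are assumptions for illustration; only the sample sizes 600, 900, and 1200 and the rough split of 1,275 pipes into 21 percent failed and 79 percent censored come from the text.

```python
import random

def sample_with_censoring(records, n, censored_fraction, seed=0):
    """Draw n records without replacement with a chosen share of censored ones.

    records: list of (duration, event) pairs, event = 1 failed, 0 censored.
    """
    rng = random.Random(seed)
    censored = [r for r in records if r[1] == 0]
    failed = [r for r in records if r[1] == 1]
    n_censored = round(n * censored_fraction)
    n_failed = n - n_censored
    if n_censored > len(censored) or n_failed > len(failed):
        raise ValueError("not enough records of one kind for this design")
    sample = rng.sample(censored, n_censored) + rng.sample(failed, n_failed)
    rng.shuffle(sample)
    return sample

# Mock database of 1,275 pipes: 268 failed (about 21 percent), 1,007 censored
mock_rng = random.Random(42)
database = ([(mock_rng.uniform(1.0, 40.0), 1) for _ in range(268)]
            + [(mock_rng.uniform(1.0, 60.0), 0) for _ in range(1007)])

# Hypothetical 3 x 3 design: three sample sizes, three censoring fractions
designs = [(n, c) for n in (600, 900, 1200) for c in (0.78, 0.80, 0.82)]
datasets = {d: sample_with_censoring(database, d[0], d[1], seed=i)
            for i, d in enumerate(designs)}

for (n, c), data in datasets.items():
    observed = sum(1 for _, e in data if e == 0) / len(data)
    print(n, c, round(observed, 3))
```

Each sampled data set would then be passed through the same fitting procedure as the full database, so that the estimated coefficients and baseline hazard can be compared across sample sizes and censoring percentages.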

3. Conclusions

The Ministry of Electricity and Water (MEW) in Kuwait has accumulated huge amounts of water pipe data over the years. However, most of the historical data kept in files have had very limited application. This study develops in detail the procedures for converting voluminous data in files into useful and reliable information. Pipe failures and breaks in Kuwait were analyzed and modeled through the proportional hazard model. Parameters (regarding pipe failures and breaks) that are useful for maintenance and repair decisions are identified. It is shown that the proportional hazard model is suitable for describing failure rates of components. The hazard function varies with sample size and the percentage of censored data, and this variability must be kept in mind when making inferences. The model has the desirable feature of being able to separate the time-dependent effect of aging from other stress factors, which are often site specific. Another advantage is its suitability for analyzing censored data, so that it can use the information in a database efficiently. The availability of the software and its ease of use is another advantage. Moreover, its ability to stratify nonhomogeneous data and identify the important explanatory variables makes it suitable for failure modeling not only in water distribution networks but also in other constructed systems, such as sewer and gas lines, where it can give valuable insights.

Data Availability

All the data used are available in the manuscript and can also be requested from the authors.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are thankful to the Ministry of Electricity and Water (MEW) for supplying the relevant data.