Abstract

Investment decisions are usually made on the basis of the subjective judgments of experts, who face an information gap during the preliminary stages of a project. As a consequence, errors in risk prediction and decision-making arise, leading to out-of-control investment and project failure. In this paper, variable fuzzy set theory and intelligent algorithms integrated with case-based reasoning are presented. The proposed algorithm manages the numerous fuzzy concepts and variable factors of a project and sets up the decision-making process in accordance with past cases and experiences. Furthermore, it decreases the calculation difficulty and reduces the decision-making reaction time. Three types of risk correlations, combined with different characteristics of engineering projects, are summarized, and each of these correlations is expounded at the project investment decision-making stage. Quantitative and qualitative change theories of variable fuzzy sets are also addressed for investment risk warning. The approach presented in this paper enables risk analysis in a simple and intuitive manner and integrates objective and subjective risk assessments within the decision-makers' risk expectation.

1. Introduction

The purpose of engineering investment is to obtain satisfactory returns; such decisions, however, are affected by considerable uncertainties. Expected revenue largely depends on the analysis and control of these uncertainties. They persist throughout the entire investment process, and investment decision-making plays a fundamental role because it is the starting point of that process. According to expert estimation and case studies of major projects, early decision-making accounts for 70% or more of the influence on an entire project. A large number of projects fail because of errors in initial investment decisions. In the investment decision-making process of large-scale projects, many risk factors can cause decision failure. The most crucial factors are changes in expected investment income, increases in investment opportunity costs, changes in taxes imposed, variations in market supply and demand relationships, deficiencies in construction funding, and outdated technology. For a long time, however, the construction phase has been accorded more attention than the investment decision-making stage despite the latter’s significance in project management. Risk identification, assessment, and management should be initiated early, at the project decision-making stage, to substantially reduce investment risks and provide a scientific basis for improving the success rate.

Research efforts have been devoted to risk management; nevertheless, the risk structure, the risk analysis, and the prevention methods remain in dispute in academia. A gap still exists between the actual and expected effects of risk control. Many decisions are based on intuition, experience, and subjective judgment. Project risk factors are complicated, and clarifying the correlations among them is difficult. Many studies implicitly assume that risk factors are isolated when investigating their comprehensive effects. Other studies focus only on static links, such as qualitative influencing factors and index weights, instead of concentrating on the correlations, especially the dynamic correlations among attributes and the dependence between targets and attributes. Risk correlations are found in a large number of projects. Case-based reasoning (CBR) is an effective data mining application in engineering project studies. It solves new problems by analyzing similar problems that have been encountered and resolved in the past. When faced with a new problem, the management team can determine suitable solutions by searching for recorded cases of a similar nature, in effect reusing past experience. Should the cases found be deemed unsatisfactory, the team can modify them to suit the current situation and then record the result in a case database to serve as a future reference; such an approach is a self-learning technique. It is a useful tool for handling complex problems that are difficult to model in theoretical terms.

Marcous et al. [1] investigated CBR to provide bridge management systems with a deterioration model that eliminates the shortcomings of the Markovian model. Chua et al. [2] described a case-based reasoning bidding system that helps contractors handle dynamic information that varies with the specific features of the job and the new situation. Chua and Goh [3] used CBR to assist safety-planning teams in developing and improving safety plans for construction activities through the reuse of past safety knowledge. Cheng and Melhem [4] combined CBR with fuzzy techniques to predict the future health condition of a bridge deck and recommended the appropriate maintenance, rehabilitation, and replacement actions. Ozorhon et al. [5] constructed a CBR decision support tool to demonstrate how the experiences of competitors can be used by contractors in international markets to support market segment decisions. Ryu et al. [6] presented CBR as a construction planning tool for various types of construction projects. Goh and Chua [7] used the CBR approach to utilize past knowledge, in the form of past hazard identification and incident cases, to improve the efficiency and quality of new hazard identification.

The diversity of cases can provide high reliability for summarizing correlations, but at the same time may interfere in the decision-making process. Therefore, encapsulating risk correlation rules of a certain confidence level using various CBRs and risk management cases is essential. Moreover, understanding risk rules not only reduces risks by shedding light on the essence of uncertainty, but also plays a key role in investment decision-making and risk prediction. We uncover risk correlations that can provide decision support for investment decision-making by combining research trends with the characteristics of engineering projects.

2. Risk Correlation of Project Investment Decision-Making

As a sub-branch of artificial intelligence, the CBR is a mode of reasoning that generates solutions to current problems by studying solutions to past problems of a similar nature stored in a knowledge database [8]. The method reuses past cases and experiences to solve new problems, evaluates new problems, explains abnormal conditions, and understands new conditions. Figure 1 shows the analysis flow in project investment.
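The retrieve, reuse, and retain steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system used in this paper: the `Case` structure, the attribute names, the weights, and the similarity measure (one minus the weighted mean absolute difference of normalized attributes) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Case:
    name: str
    features: Dict[str, float]  # attributes assumed to be normalized to [0, 1]
    solution: str               # recorded risk response of the past project


def similarity(a: Dict[str, float], b: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Assumed measure: 1 minus the weighted mean absolute attribute difference."""
    total = sum(weights.values())
    diff = sum(w * abs(a[k] - b[k]) for k, w in weights.items())
    return 1.0 - diff / total


def retrieve(case_base: List[Case], query: Dict[str, float],
             weights: Dict[str, float], k: int = 1) -> List[Tuple[float, Case]]:
    """Return the k stored cases most similar to the query, best first."""
    scored = [(similarity(c.features, query, weights), c) for c in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def retain(case_base: List[Case], new_case: Case) -> None:
    """Store a completed project so that later queries can reuse it."""
    case_base.append(new_case)


# Hypothetical case base and query, for illustration only.
case_base = [
    Case("highway-A", {"scale": 0.8, "funding_gap": 0.2}, "phase the funding"),
    Case("bridge-B", {"scale": 0.3, "funding_gap": 0.7}, "renegotiate the loan"),
]
weights = {"scale": 0.6, "funding_gap": 0.4}
query = {"scale": 0.75, "funding_gap": 0.25}

best_score, best_case = retrieve(case_base, query, weights)[0]
retain(case_base, Case("new-project", query, best_case.solution))
```

In practice the retrieved solution would be adapted before it is retained; the sketch copies it unchanged only to keep the example short.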

The more cases are stored, the more comprehensive the reference value. The research focus of CBR is mainly on case storage, case retrieval, and similarity algorithms. However, it treats each case as an independent item rather than thoroughly studying the correlations between cases and attributes. Exploring risk-inherent correlations over a large number of cases, at a certain confidence level, not only excludes the particularity of individual cases but also reflects the essential characteristics of a wide range of cases. Currently, the application of CBR in engineering projects is at its initial stage; thus, few studies on risk correlation mining have been conducted. However difficult, the key to exploring risk correlations and the factors relevant to them is to determine the process involved in investment decision-making in engineering projects. On the basis of these insights, we uncover and compile three kinds of risk correlations (see Table 1) by combining the correlation mining methods applied in other domains with the risk factors present in engineering projects. The process is described as follows.

(1) An existing qualitative correlation always appears as an influencing factor and index set, and it is also found in risk identification links. It is accepted knowledge, and this type of correlation is the one most easily identified. For example, investment decision-making is composed of a series of first, second, and even multigrade risk indexes (see Figure 2). In risk prediction, the risk factors affecting target sets are first listed. Subsequently, these factors are divided into two or more grades according to the category they belong to. Subjective scoring methods, such as the Analytic Hierarchy Process and fuzzy comprehensive evaluation, are typically used to calculate the degree of influence of each risk factor, that is, the weight that it carries [9-11]. As far as qualitative correlation is concerned, weights and calculation methods are not considered; many risk factors can be listed from subjective experience rather than by scientific methods. Confidence levels can even reach 100%.

(2) Derivation correlation mainly comprises type derivation, influence degree derivation, causality derivation, optimum derivation, and formula derivation. This paper provides examples of the above-mentioned derivations based on CBR, risk prediction, and risk management at the investment decision-making stage. The types of derivation correlations are discussed as follows.

(a) Type Derivation Uses the Clustering Method to Classify Existing Project Cases
It uncovers the risk events, risk occurrence probabilities, and risk solutions of each type and summarizes these to serve as individual category markers. For a new project, cases can be searched and similarity can be calculated based on this derivation. The new project can also draw on the risk data and risk measures of its category for decision-making. When a new project is completed, its information is stored in the case database for future project risk management. This approach is a process of self-learning and incremental learning.
Take the variable fuzzy clustering iterative model as an example. Suppose the $n$ samples to be clustered compose a set, $m$ is the number of indexes of each sample, and $X = (x_{ij})$ is the eigenvalue matrix given in (2.1) that can be used for sample set clustering, where $x_{ij}$ is the eigenvalue of index $i$ of the clustering sample $j$; $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$.
Each index has a different dimension and magnitude, and there are both positive and negative indicators. Therefore, the original data must be normalized, and the normalized values must lie in the range $[0, 1]$. Different normalization methods can be used according to the specific problem. After normalization, matrix $X$ is transformed into the index eigenvalue normalization matrix $R = (r_{ij})$ in (2.2), where $r_{ij}$ is the index eigenvalue normalization number and $r_j = (r_{1j}, r_{2j}, \ldots, r_{mj})$ is the index eigenvalue vector of sample $j$.
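As one concrete instance of such a normalization, the sketch below applies min-max scaling and reverses the direction for negative (smaller-is-better) indicators. Min-max scaling is an assumption made for illustration; the text leaves the choice of method open, and the function name is hypothetical.

```python
from typing import List


def normalize(values: List[float], positive: bool = True) -> List[float]:
    """Min-max normalize one index to [0, 1].

    positive=True  : larger raw values map to larger normalized values.
    positive=False : larger raw values map to smaller normalized values.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant index carries no discriminating information;
        # returning the midpoint is an arbitrary convention of this sketch.
        return [0.5 for _ in values]
    if positive:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


# Hypothetical raw eigenvalues of one index across three samples.
benefit_index = normalize([10.0, 20.0, 30.0], positive=True)
cost_index = normalize([10.0, 20.0, 30.0], positive=False)
```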
Suppose the sample set is divided into $c$ classes, and $s_h = (s_{1h}, s_{2h}, \ldots, s_{mh})$ is the fuzzy clustering center vector of class $h$ over the $m$ indexes, where $h = 1, 2, \ldots, c$; $p$ is the distance parameter, $\alpha$ is the optimal criteria parameter, and $w = (w_1, w_2, \ldots, w_m)$ is the index weight vector. Equation (2.3) gives the optimal fuzzy clustering matrix $U = (u_{hj})$ and the fuzzy clustering center matrix $S = (s_{ih})$.
In this model, the weights, relative membership degrees, and cluster centers tend to become stable during the dynamic iteration. The advantage of this model is that it considers not only the index weight $w_i$ but also the relative membership degree $u_{hj}$. The degree to which sample $j$ belongs to class $h$ thus acts as another weight, resulting in a more complete weighted distance. On that basis, the accuracy of type derivation can also be improved to some extent.
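The alternating update of memberships and centers can be illustrated with a small sketch. It is a stand-in, not the model of (2.3): it uses standard fuzzy c-means with fuzzifier 2 and a squared Euclidean distance scaled by fixed index weights, whereas the variable fuzzy clustering iterative model also has a distance parameter and an optimal criteria parameter. All names and data are hypothetical.

```python
from typing import List, Tuple


def weighted_sq_dist(r: List[float], s: List[float], w: List[float]) -> float:
    """Squared Euclidean distance with each index difference scaled by its weight."""
    return sum((wi * (ri - si)) ** 2 for ri, si, wi in zip(r, s, w))


def fuzzy_cluster(samples: List[List[float]], centers: List[List[float]],
                  w: List[float], iters: int = 50,
                  eps: float = 1e-12) -> Tuple[List[List[float]], List[List[float]]]:
    """Alternate membership and center updates (fuzzy c-means, fuzzifier 2)."""
    n, c, m = len(samples), len(centers), len(w)
    u = [[0.0] * n for _ in range(c)]
    for _ in range(iters):
        # Membership update: u[h][j] is proportional to 1 / distance^2.
        for j, r in enumerate(samples):
            inv = [1.0 / (weighted_sq_dist(r, s, w) + eps) for s in centers]
            total = sum(inv)
            for h in range(c):
                u[h][j] = inv[h] / total
        # Center update: mean of the samples weighted by squared membership.
        centers = [
            [sum(u[h][j] ** 2 * samples[j][i] for j in range(n))
             / sum(u[h][j] ** 2 for j in range(n))
             for i in range(m)]
            for h in range(c)
        ]
    return u, centers


# Hypothetical normalized samples forming two obvious groups.
samples = [[0.10, 0.10], [0.15, 0.05], [0.90, 0.95], [0.85, 0.90]]
u, centers = fuzzy_cluster(samples, [[0.0, 0.0], [1.0, 1.0]], w=[0.5, 0.5])
```

Each column of `u` sums to one, so `u[h][j]` can be read as the relative membership degree of sample `j` in class `h`.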

(b) Influence Degree Derivation Mainly Aims at Weight and Risk Consequence
Suppose risk event $A$ is more important or has more serious consequences than event $B$, and $B$ is more important than $C$. Then $A$ is certainly more important than $C$. That is, $A > B$ and $B > C$, so $A > C$. Thus, in the risk prediction and risk control of a new project, $A$ should be paid more attention than $B$ and $C$ to avoid risk losses.
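The transitivity argument can be checked mechanically. The sketch below derives every implied "more important than" pair from a set of pairs observed in past cases; the event labels and the function name are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (x, y) means "x is more important than y"


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Two pairwise judgments taken from cases; the third is derived.
observed = {("A", "B"), ("B", "C")}
derived = transitive_closure(observed)
```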

(c) Causality Derivation Is Similar to Influence Degree Derivation
Suppose that in some link, risk event $C$ is directly caused by event $B$, and $B$ is directly caused by $A$. Meanwhile, event $C$ is more serious than $B$, and $B$ is more serious than $A$. Hence, when event $A$ occurs, the transformation conditions from $A$ to $B$ and from $B$ to $C$ should be controlled in a timely manner to prevent the more serious $C$ from occurring. A causal correlation can be revealed from a wide range of existing cases, and this correlation can resolve the risk loss before it escalates.

(d) Optimum Derivation Consists of Project Time Optimization, Cost Optimization, Resource Optimization, Bi-Objective Optimization, and Multi-Objective Optimization
By combining the construction period, cost, and resource allocation of completed projects, the optimal project duration and the optimal cost interval can be summarized from the existing cases in each category. The construction period, cost, and resources of a new project can then be reasonably controlled on the basis of the category data, which effectively reduces certain risks.

(e) Formula Derivation Mainly Uses the Western Economic Principles Associated with Mathematical Statistics Methods
Formula derivation can serve as a reference for improving the risk assessment of investment decision-making by reasonably deducing quantitative risk correlations. Because quantitative risk analysis is difficult, few investigations of this type of correlation exist, and they are limited to macroeconomic and financial risks. At present, this type of correlation covers only the relationship between a single risk factor and the investment target. Future work should study not only the comprehensive effect of risk factors but also the quantitative relationships among them. Because this type involves quantitative analyses, along with some implicit assumptions in the derivation process, it has lower confidence but greater interestingness than the first type. Moreover, it rests on rigorous economic formulas and statistical inference and therefore provides some scientific reference value. Given these difficulties, this paper focuses only on risk measurement derivation. Risk measurement is one of the indicators for determining risk intensity. Risks are understood in different ways, which gives rise to varied risk measurement techniques. Equation (2.4) is one representation of this type of derivation.
If the probability distribution of risk event $X$ is unknown, the empirical distribution of $X$ can be obtained through statistical analysis. Thus, the risk intensity of event $X$ is given by (2.4), where $\bar{x}$ is the mean value of the sample, which is expressed as $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ (2.5), where $n$ is the sample size, $x_i$ is the value of the $i$th sample point, and $s^2$ is the sample variance.
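A risk intensity built from the sample mean and the sample variance can be sketched as follows. The formula used here, the coefficient of variation (sample standard deviation divided by the absolute sample mean), is an assumption made for illustration; it is one common measure that uses exactly these quantities, and it may differ from (2.4).

```python
from statistics import mean, stdev
from typing import Sequence


def risk_intensity(sample: Sequence[float]) -> float:
    """Assumed measure: sample standard deviation over absolute sample mean."""
    m = mean(sample)
    if m == 0:
        raise ValueError("risk intensity is undefined for a zero sample mean")
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    return stdev(sample) / abs(m)


# Hypothetical observations of one risk event, e.g. a unit cost.
observations = [8.0, 10.0, 12.0]
intensity = risk_intensity(observations)
```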
This derivation correlation process combines basic economics and statistics formulas with investment risk factors. Fluctuations in the prices of project resources or in supply volume exert some influence on total investment. According to Western economics, a relationship exists between the resource price and the supply volume, depicted as follows [12]: $E_s = \frac{\Delta Q / Q}{\Delta P / P}$ (2.6), where $E_s$ is the elasticity of supply price, $P$ is the resource price, $Q$ represents the supply volume, $\Delta P$ is the incremental price of the resource, and $\Delta Q$ denotes the incremental volume of supply.
Suppose $x = \Delta P / P$ and $y = \Delta Q / Q$, respectively, represent the rate of change in the resource price and in the supply volume. Combined with (2.6), the relationship is expressed as $y = E_s x$ (2.7).
Suppose $x$ is a random variable and $E_s$ is constant while the market condition is stable; then $y$ is also a random variable, with mean and variance as follows: $E(y) = E_s E(x)$ and $D(y) = E_s^2 D(x)$ (2.8).
Calculating the risk degree of $y$ using (2.4), we obtain (2.9).
From (2.9), the risk degree of $y$ is equal to the risk degree of $x$.
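The equality of the two risk degrees can be checked numerically. The sketch assumes that the risk degree is the coefficient of variation (standard deviation over absolute mean), which is unchanged when a variable is multiplied by a nonzero constant such as the supply elasticity; the sample values and the elasticity are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def risk_degree(sample: Sequence[float]) -> float:
    """Assumed risk degree: sample standard deviation over absolute sample mean."""
    return stdev(sample) / abs(mean(sample))


# Hypothetical rates of change of a resource price.
price_change = [0.02, 0.05, 0.03, 0.06, 0.04]

# With a constant elasticity, the supply change is a scaled copy of the price change.
elasticity = 1.5  # hypothetical value, taken as constant in a stable market
supply_change = [elasticity * x for x in price_change]

price_risk = risk_degree(price_change)
supply_risk = risk_degree(supply_change)
```

Scaling multiplies both the standard deviation and the mean by the same factor, so the ratio is unchanged.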
Because of the market risk, the maximum resource price is as follows:
(i) If the market risk is not considered, the initial value of the resource price is $P_0$.
(ii) If the market risk is taken into account, the mean of the incremental resource price is $P_0 R$, where $R$ is the risk degree of the resource price.
(iii) Because of fixed costs, the increment of total investment caused by risks is equal to that of the resource value consumption. Thus, the average increase rate of total investment induced by risks is $\frac{\sum_{i} q_i P_{0i} R_i}{I_0}$, where $q_i$ is the consumption of the $i$th resource, $P_{0i}$ is the initial value of the $i$th resource price, $R_i$ represents the risk degree of the $i$th resource price, and $I_0$ denotes the initial estimation of total investment.
Correlation studies of this kind based on CBR are still relatively rare. In traditional project management, experiential knowledge is often lost at the end of the project. CBR, therefore, is not only a repository of existing cases but also a platform for case summaries and knowledge mining. Because the derivation of such correlations relies on assumptions and allows fault tolerance, the resulting correlation has lower confidence but is more interesting. Moreover, this correlation is based on CBR and on concrete data from completed projects; thus, it is of scientific reference value.

(3) People are often interested in the potential correlations hidden in data. Therefore, correlation mining through objective and subjective methods attracts the highest interest of the three correlations. Currently, research on such correlations is divided into two types: one focuses mainly on algorithm improvement and on programming the algorithms, and has few applications in engineering; the other employs practical analysis but targets only individual cases rather than generality. However, theory is indispensable for practical use. We therefore uncover six correlations that can provide decision support for investment decision-making in engineering projects. These are investment deviation prediction, schedule deviation prediction, quantitative and qualitative risk change, dynamic correlation of risk factors, risk warning threshold, and incremental and self-adaptive correlation. According to the source of the data, this type of correlation mining is divided into two categories: one is derived from qualitative data, and the other from quantitative data.

(a) Qualitative Data
Qualitative data is obtained mainly from expert scoring, Boolean values, characteristic values, and so forth. For example, index weight is a form of this correlation. It mainly depends on expert scoring and indicates the relative importance of the risk factors. Many studies on weight determination and improvement have been carried out. This paper focuses on studying the relationships among risk factors at a certain confidence level based on variable fuzzy sets, rough sets, Bayesian methods, decision trees, support vector machines, and so forth. A fuzzy set usually varies with time, space, and conditions, particularly in engineering project investment. Because of the uncertainties that characterize a given project and its environment variables in the investment stage, fuzzy theory in engineering project research needs upgraded mathematical theories, models, and methods. We use variable fuzzy sets for a better fit. To yield sound, adaptive, and heuristic investment decisions, as well as to improve forecast quality and reduce reaction time in actual situations, the decision-maker requires intelligent algorithms in addition to CBR, such as rough sets, to mathematically address fuzziness and uncertainties. The decisions or classification rules can be derived through knowledge reduction on the premise that the classification ability remains unchanged.
Eleven practical cases of the same category are analyzed, and their unit prices are mainly constrained by eight influencing factors. The eight factors and their grade distributions are enumerated in Table 2. The attribute classifications of each case are shown in Table 3 [13]. The rough sets method does not require any prior knowledge other than the dataset that requires processing; thus, we adopt this method to reduce the attributes that influence construction unit price based on the practical data shown in Table 3.
In rough sets, $U$ denotes the universe, a finite set of objects. Suppose $R$ is an equivalence relation on $U$, and $U/R$ represents the set of all equivalence classes of $R$. If $P \subseteq \mathbf{R}$ and $P \neq \emptyset$, where $\mathbf{R}$ is a family of equivalence relations on $U$, then $\cap P$, the intersection of all equivalence relations in $P$, is also an equivalence relation. It is the indiscernibility relation on $P$, denoted by $IND(P)$. Therefore, $U/IND(P)$ denotes the knowledge related to the family of equivalence relations $P$; it is usually denoted by $U/P$.
Suppose $C$ is the condition attribute set and $D$ is the target set; their respective equivalence relations are determined by the attribute values in Table 3.
Checking, for each condition attribute, whether the result is unchanged when that attribute is removed yields three core attributes. The non-core attributes that complete a reduction are not unique. The rough sets method decreases the condition attributes to five, which substantially reduces computational complexity and improves decision-making efficiency. In actual forecasts, the more practical cases there are, the better the scientific decision support for the project. However, as the number of cases increases, the dependence among risk factors may change. It is practical, therefore, to study incremental correlation, which allows for certain error rates in the investment decision-making stage.
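The attribute check behind this reduction can be sketched as follows. The sketch assumes the usual decision-table formulation: an attribute belongs to the core if removing it changes the positive region of the target attribute. The four-row table is hypothetical and is not the data of Table 3.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set

Row = Dict[str, int]


def partition(rows: List[Row], attrs: List[str]) -> Set[FrozenSet[int]]:
    """Equivalence classes of row indices that agree on every attribute in attrs."""
    groups = defaultdict(set)
    for idx, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(idx)
    return {frozenset(g) for g in groups.values()}


def positive_region(rows: List[Row], cond: List[str], dec: str) -> Set[int]:
    """Rows whose condition class lies entirely inside one decision class."""
    decision_classes = partition(rows, [dec])
    pos: Set[int] = set()
    for block in partition(rows, cond):
        if any(block <= d for d in decision_classes):
            pos |= block
    return pos


def core(rows: List[Row], cond: List[str], dec: str) -> List[str]:
    """Attributes whose removal shrinks the positive region."""
    full = positive_region(rows, cond, dec)
    return [a for a in cond
            if positive_region(rows, [c for c in cond if c != a], dec) != full]


# Hypothetical decision table: attributes a and c are redundant with each other,
# while b alone determines the decision d.
table = [
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
]
core_attributes = core(table, ["a", "b", "c"], "d")
```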

(b) Quantitative Data
The primary sources of quantitative data are the objective data of each project. There are two approaches to processing these data. In the first scheme, quantitative data are transformed into qualitative form by the triangular fuzzy or trapezoidal fuzzy method, and then the correlations are derived from the qualitative data. In the second scheme, objective data are directly analyzed to explore correlations [14-16]. In accordance with the second processing method, we analyze the comprehensive effects of risk factors on risk monitoring and risk warning via the quantitative and qualitative change theories of the variable fuzzy sets method.
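The first scheme can be illustrated with triangular membership functions that map a quantitative value to a qualitative grade. The grade names and breakpoints below are hypothetical and assume that the value has already been normalized to $[0, 1]$.

```python
from typing import Dict, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical grades on a normalized scale; the outer triangles extend
# beyond [0, 1] so that the end points still receive full membership.
GRADES: Dict[str, Tuple[float, float, float]] = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def to_grade(x: float) -> str:
    """Return the grade with the largest membership degree for x."""
    return max(GRADES, key=lambda g: triangular(x, *GRADES[g]))


grade = to_grade(0.9)
```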
The fuzzy sets concept was proposed by Zadeh in 1965 and was then developed into a new mathematical discipline, fuzzy sets theory. However, fuzzy sets theory is static and cannot describe the dynamic variability of fuzziness, fuzzy events, or fuzzy concepts. Theoretically, using static fuzzy sets theory to study the dynamics of fuzziness is insufficient, and contradictions exist between the theory and the research objectives. Chen [17] proposed the relative membership degree and relative membership function in the 1990s and established engineering fuzzy sets theory [18]. In the early 21st century, Chen [18-21] created the variable fuzzy sets theory, which was a breakthrough relative to the static concepts and theory of fuzzy sets.
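Using variable fuzzy sets with the relative membership function to describe the intermediate transition is a dynamic demonstration of fuzziness in precise mathematical language. Suppose $U$ is a universe and $u$ is an element of $U$, $u \in U$. $A$ and $A^c$ are a pair of opposite fuzzy concepts in $U$. At any point on the continuum number axis of the relative membership function, $\mu_A(u)$ is the relative membership degree of $u$ to $A$, and $\mu_{A^c}(u)$ is the relative membership degree of $u$ to $A^c$, where $A^c$ is the opposite of $A$. They satisfy $\mu_A(u) + \mu_{A^c}(u) = 1$, where $0 \le \mu_A(u) \le 1$ and $0 \le \mu_{A^c}(u) \le 1$. As seen from Figure 3, at the left pole $P_l$: $\mu_A(u) = 1$, $\mu_{A^c}(u) = 0$; at the right pole $P_r$: $\mu_A(u) = 0$, $\mu_{A^c}(u) = 1$; and $P_m$ is the gradual qualitative change point of the continuum, which runs from 1 to 0 for $\mu_A(u)$ and from 0 to 1 for $\mu_{A^c}(u)$, and at which $\mu_A(u) = \mu_{A^c}(u) = 0.5$.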

Suppose $D_A(u)$ is the relative difference degree of $u$ to $A$, defined as the difference between the relative membership degrees of $u$ to $A$ and to its opposite concept $A^c$: $D_A(u) = \mu_A(u) - \mu_{A^c}(u)$. It is seen from Figure 4 that the point where $D_A(u) = 0$ is the point at which dynamic balance with gradual qualitative change is reached. The points where $D_A(u) = 1$ and $D_A(u) = -1$ are the points at which mutational qualitative change is reached. Thus, the two forms of qualitative change, that is, gradual change and mutation, can be completely and clearly expressed by the relative difference degree.
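The relative difference degree is simple to compute once a relative membership degree is available. The sketch assumes the definition commonly used in variable fuzzy sets theory, in which the two opposite membership degrees sum to one and their difference is the relative difference degree; the function name is hypothetical.

```python
def relative_difference(mu_a: float) -> float:
    """Relative difference degree, assuming mu_A + mu_Ac = 1.

    Returns mu_A - mu_Ac, which ranges from -1 to 1.
    """
    if not 0.0 <= mu_a <= 1.0:
        raise ValueError("a relative membership degree must lie in [0, 1]")
    mu_opposite = 1.0 - mu_a
    return mu_a - mu_opposite


left_pole = relative_difference(1.0)      # fully in A
balance_point = relative_difference(0.5)  # gradual qualitative change point
right_pole = relative_difference(0.0)     # fully in the opposite concept
```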

Suppose $C$ is one variable factor set of the variable fuzzy set $V$, and $C = \{C_A, C_B, C_C, C_D, C_E, C_F\}$.

$C_A$ is the variable factor set, $C_B$ represents the variable spatial factor set, $C_C$ denotes the variable condition factor set, $C_D$ is the variable model set, $C_E$ stands for the variable parameter set, and $C_F$ is the other variable factor set.

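The standard models for evaluating quantitative or qualitative change in a variable fuzzy set are as follows, where $D_A(u)$ is the relative difference degree before the change and $D_A(C(u))$ is the relative difference degree after the variable factor set $C$ has acted on $u$:
(i) The criterion for quantitative change is $D_A(u) \cdot D_A(C(u)) > 0$.
(ii) The criterion for gradual qualitative change is $D_A(u) \cdot D_A(C(u)) < 0$.
(iii) Two criteria are assigned to mutational qualitative change:
(a) if the change occurs not through the gradual qualitative change point, $D_A(u) \cdot D_A(C(u)) > 0$ with $|D_A(C(u))| = 1$;
(b) if the change occurs through the gradual qualitative change point, $D_A(u) \cdot D_A(C(u)) < 0$ with $|D_A(C(u))| = 1$.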
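These criteria translate into a short classification routine. The sketch assumes the sign-of-product reading of the criteria: the same sign before and after means quantitative change, a sign flip means gradual qualitative change, and arrival at a pole (a value of 1 or -1) means mutational qualitative change. The labels are hypothetical.

```python
def classify_change(d_before: float, d_after: float) -> str:
    """Classify the change between two relative difference degrees in [-1, 1]."""
    product = d_before * d_after
    if abs(d_after) == 1.0:
        # Arrival at a pole; a sign flip means the balance point was crossed.
        if product < 0:
            return "mutational (through the balance point)"
        return "mutational (not through the balance point)"
    if product > 0:
        return "quantitative"
    if product < 0:
        return "gradual qualitative"
    return "at the balance point"


# Hypothetical relative difference degrees of two consecutive periods.
same_side = classify_change(0.3, 0.6)
crossed = classify_change(0.2, -0.1)
```

Applied month by month, such a routine flags the intervals in which the risk state changes in kind rather than only in degree.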

The data in Tables 4 and 5 were taken from a highway construction project [22]. We analyze the quantitative and qualitative changes in risk factors during the construction period to provide reference for determining the risk threshold value.

The relative difference degree of each index is calculated with the linear equation of relative difference degree [23].

Let $x_1$ be the deviation rate of the total investment cost and $x_2$ the deviation rate of the schedule. This work comprehensively evaluates risks based on these two deviation rates. The relative difference degrees of investment cost and schedule from February 2003 to September 2003 are calculated according to the eigenvalues of $x_1$ and $x_2$ (see Table 6), respectively.

Suppose the weight vector of the two indexes is $w = (w_1, w_2)$, and the risk relative difference degree of each month is $D$. The relative difference degree of comprehensive risk of each month is shown in Table 7.
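One plausible way to combine the two indexes is a weighted sum of their relative difference degrees. The sketch below assumes that form; the weights and monthly values are hypothetical and are not the data of Tables 6 and 7.

```python
from typing import Sequence


def composite_difference(d_values: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Weighted sum of per-index relative difference degrees (weights sum to 1)."""
    if len(d_values) != len(weights):
        raise ValueError("one weight is required per index")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * d for w, d in zip(weights, d_values))


# Hypothetical (cost, schedule) relative difference degrees for three months.
monthly = [(0.4, -0.2), (0.1, -0.3), (-0.2, -0.4)]
weights = (0.5, 0.5)  # hypothetical equal weighting of cost and schedule
composite = [composite_difference(m, weights) for m in monthly]
```

Because the weights sum to one, the composite value stays within $[-1, 1]$, so it can be read on the same scale as the individual relative difference degrees.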

A comprehensive relative difference degree that moves closer to −1 indicates high risk, whereas one that moves closer to 1 indicates low risk. Table 7 shows that the changes occurring from April 2003 to May 2003 are gradual qualitative changes, whereas the other consecutive intervals show quantitative changes. The result in Table 7 is simple and intuitive. In addition, decision-makers can combine the results with their own risk tolerance to determine the risk threshold at which appropriate measures must be implemented. Our proposed method thus combines objective and subjective evaluations.

3. Conclusions

Research on risk correlation remains a bottleneck in current risk management for engineering project investment decision-making. We divide risk correlation into three types and elucidate the third type using actual data. The proposed approach combines data mining and variable fuzzy sets with investment decision-making, yielding simple, intuitive, and easily explainable results. The findings of this study provide reference value because they comprehensively consider risk prediction, risk management, and cost reduction analysis. Dynamic correlation and incremental correlation are the directions for further study.

Acknowledgments

This research work was jointly supported by the Science Fund for Creative Research Groups of the NSFC (Grant no. 51121005), the National Natural Science Foundation of China (Grant no. 51222806), and the Program for New Century Excellent Talents in University (Grant no. NCET-10-0287).