Advances in Civil Engineering


Research Article | Open Access

Volume 2019 | Article ID 7081073 | 15 pages

Implementation of Genetic Algorithm Integrated with the Deep Neural Network for Estimating at Completion Simulation

Academic Editor: Xuemei Liu
Received 10 Oct 2018
Accepted 03 Jan 2019
Published 02 May 2019


In construction project management, several factors influence the final project cost. Among various approaches, estimate at completion (EAC) is an essential approach for final project cost estimation. The main merit of EAC is that it accounts for the probability of project performance and risk. In addition, EAC helps project managers to identify critical problems throughout the project progress and to determine appropriate solutions to these problems. In this research, a relatively new intelligent model called deep neural network (DNN) is proposed to calculate the EAC. The proposed DNN model is validated against one of the predominant intelligent models applied to EAC prediction, namely, the support vector regression (SVR) model. In order to demonstrate the capability of the model in engineering applications, historical project information obtained from fifteen projects in Iraq is examined in this research. The second phase of this research integrates two input selection algorithms with the proposed and the comparable predictive models. These input optimization algorithms are the genetic algorithm (GA) and the brute force (BF) algorithm. The aim of integrating these algorithms is to select the input attributes and to identify the factors that most strongly influence the calculation of EAC. Overall, the motivation of this study is to provide a robust intelligent model that estimates the project cost more accurately than traditional methods. The second aim is to introduce a reliable methodology that can provide efficient and effective project cost control. The proposed GA-DNN is demonstrated to be a reliable and robust intelligent model for EAC calculation.

1. Introduction

1.1. Research Background

The importance of early planning to final project outcomes is widely emphasized in the literature [1, 2]. However, these plans generally cannot be applied entirely, and they are revised throughout the project. Therefore, effective project management requires a constantly reviewed plan that reflects the actual condition of the project, so that the required actions can be taken when the project is going out of control. Otherwise, cost overruns are noticed only towards the end of the contract, and at this stage, remedial approaches may be late and ineffective. However, during the early phase of projects, construction companies usually focus on budget planning and generally ignore other areas such as changes in cost, information updates, and cost management [3]. In addition, construction companies mostly rely on computer systems that are reasonably powerful in the analysis of construction budgets but cannot respond to cost changes at each stage of construction [4].

Project cost control is a crucial concern in construction engineering. However, controlling cost is a time-consuming and difficult process because a large number of factors affect project cost, and the influence of each factor should be considered individually at each stage of the project [3]. Estimate at completion (EAC) is one of the important indicators for cost control [5–7], and the accuracy of the EAC is critical for identifying problems and developing appropriate responses; therefore, different methods for improving the accuracy of EAC have been proposed in the literature. One of the methods most widely used by project managers to calculate EAC is earned value management (EVM) [8–10]. In this methodology, project cost, schedule, and scope metrics are integrated into a single measurement system to measure and analyze the actual project status against its baseline and to estimate the total project cost and duration at completion [11]. In the traditional EVM, three essential components are required for project control: planned value (PV) or budgeted cost of work scheduled (BCWS), earned value (EV) or budgeted cost of work performed (BCWP), and actual cost (AC) or actual cost of work performed (ACWP). The EVM uses these components and different index-based formulas to calculate the EAC; however, index-based methods are criticized because they use only past information and performance indexes in the calculation of the remaining budget [12]. In addition, these models provide unreliable cost forecasts at the early stages of a project due to the limited amount of EVM data [10]. Further, although accurate EAC predictions are achieved with the traditional EVM methods on some special projects, there are obvious errors in most cases. This has led to a situation in the industry in which it is unclear which prediction approach should be selected for a specific project. In addition, the performance indexes must be stable to calculate a reliable EAC, since the index-based approach relies on the cost management data provided to the project owner by the contractor in the form of a monthly project report [13]. Another drawback of EVM is that revisions must be conducted manually as it is applied to each process of a construction project, which makes EVM a complicated and time-consuming method.
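The index-based quantities discussed above can be illustrated with a minimal sketch. It uses the standard EVM definitions (CV = EV − AC, SV = EV − PV, CPI = EV/AC, SPI = EV/PV) and one common index-based forecast, EAC = BAC/CPI. The budget at completion (`bac`), the function name, and the sample figures are assumptions introduced here for illustration; the article does not state which index-based formula it has in mind.

```python
def evm_indices(pv, ev, ac, bac):
    """Return basic EVM indicators for one reporting period.

    pv  -- planned value (BCWS)
    ev  -- earned value (BCWP)
    ac  -- actual cost (ACWP)
    bac -- budget at completion (hypothetical input, not named in the text)
    """
    cv = ev - ac       # cost variance
    sv = ev - pv       # schedule variance
    cpi = ev / ac      # cost performance index
    spi = ev / pv      # schedule performance index
    eac = bac / cpi    # one common index-based EAC forecast
    return {"CV": cv, "SV": sv, "CPI": cpi, "SPI": spi, "EAC": eac}


# Illustrative figures only: the project is over cost and behind schedule.
period = evm_indices(pv=500.0, ev=400.0, ac=500.0, bac=1000.0)
```

Because the forecast depends only on the cumulative CPI observed so far, it inherits the weakness noted above: early in a project, when few periods are available, the index is unstable and so is the resulting EAC.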

1.2. Literature Review

Various regression-based approaches have also been proposed as alternatives to the index-based approach for performing cost estimation [11, 14, 15]. In such approaches, the computation of the EAC usually involves regressing a dependent variable (typically the actual project cost) against an independent variable (a predictor, typically time) using a linear or nonlinear model to establish the relationship between the predictor and the response. Because data-intelligence models learn this relationship from the data, the issues inherent in the index-based techniques can be overcome, and the models can be used in a range of applications. The regression-based method is more sophisticated than the index-based cost-forecasting method, and it yields better predictions at the initial phase of a project [13].

Recently, advanced methods have been proposed to overcome the drawbacks of the traditional methodologies. For instance, Caron et al. [16] developed a Bayesian model integrated with the EVM framework to calculate the EAC. The proposed model was tested on oil and gas projects and proved applicable and effective for modeling the estimate at completion in the investigated case study. Narbaev and De Marco [11] developed a new cost EAC methodology by integrating the ES method and four growth models and concluded that the EAC formula based on the Gompertz model outperforms the other formulas developed in their study. Babar et al. [17] developed a new framework by integrating key performance indicators into the risk performance index to calculate EAC. Although there are advances in the calculation of EAC, systems built on advanced modern AI models are highly necessary to make progress in cost control; however, the use of AI in project performance control is very limited [18]. AI techniques can handle complex and ill-structured problems by simulating the human inference capability. AI techniques have been developed in several areas of science and engineering, such as pattern recognition, computational learning, and the solution of nonlinear and nonstationary problems [19, 20]. Computer systems equipped with machine learning techniques can perform efficiently without explicit programming, can construct data-trained algorithms, and can make data-driven predictions and decisions.

In one of the earlier investigations, Iranmanesh and Zarezadeh [21] used the artificial neural network (ANN) technique to estimate the actual cost of an engineering project in order to improve the EVM system. However, ANN has some disadvantages, such as the need to select the network structure, the presence of local solutions that lead to nonoptimal results, and time-consuming training. Cheng and Wu [22] combined a support vector machine (SVM) and the fast messy genetic algorithm (fmGA) to implement the Evolutionary Support Vector Machine Inference Model for construction management. The proposed model was validated for estimating building costs and showed superior performance in conceptual cost estimation. In another study, Cheng et al. [4] used the same model for the EAC and obtained good and stable predictions compared with common EVM prediction methods. Cheng and Hoang [23] integrated least squares SVM with machine-learning-based interval estimation and differential evolution to develop a new model for calculating the EAC. This model provides interval results with lower and upper prediction limits. However, SVM is criticized because equal weights are given to all training data. That study was followed by the implementation of the evolutionary fuzzy neural inference model for cost estimation [24]. The proposed model accurately predicted the conceptual construction cost during the early stages of the project. Similarly, Cheng et al. [3] conducted a study on conceptual cost estimation using the evolutionary fuzzy hybrid neural network for industrial project construction. The research outcomes provided further encouraging evidence for precise cost estimation at the early stages. Another investigation on EAC determination was conducted by Feylizadeh et al. [25] using the fuzzy neural network. The modeling was based on various qualitative and quantitative factors that influence the EAC value. The results demonstrated a good outcome from the perspective of contractors and managers. Wauters and Vanhoucke [26] developed a method based on SVR for performing EVM, and they concluded that their model outperforms the best available EVM and ES methods when the training set is identical to the test set. Most recently, an automated programming approach based on the ANN model was developed by Golizadeh et al. [27] for predicting the EAC of dam projects.

Although there have been several investigations since 2008 on EAC estimation using soft computing models, the topic is still limited and requires more attention from experts; in particular, there is a limited number of studies on project cost control with AI techniques. Several limitations of AI models have been reported, such as their black-box nature, the requirement of a significant amount of data, overfitting, model interaction, and time consumption [28, 29]. In particular, traditional machine learning techniques such as ANN, SVM, evolutionary computing (EC), and the adaptive neurofuzzy inference system (ANFIS) show several performance limitations compared with deep machine learning techniques. Deep learning algorithms, which are inspired by deep hierarchical structures, were introduced in the late twentieth century [30]. Since Hinton et al. [31] suggested the deep belief network (DBN) in 2006, considerable progress has been achieved in deep learning. Deep learning techniques have developed rapidly over the past decade, with significant progress in several engineering applications. Consequently, in order to overcome the limitations of the traditional machine learning techniques used in the calculation of EAC, a new model is developed using deep learning algorithms. A brief description of the project information data is given in Figure 1.

1.3. Research Significance and Objectives

Considering the mentioned drawbacks and conclusions, it is imperative to design a fast and effective AI-based system for predicting project EAC that addresses the issues of cost control during project execution. This study aims to resolve the identified issues in project cost management by collecting relevant historical data and studies on project cost management in order to identify the factors that significantly affect project cost. The historical data are collected from several construction projects located in Iraq. This project information is used to establish the trend of the project cost flow, and the relationship between project EAC and monthly costs is mapped based on historical knowledge and experience. Based on the historical data, a new intelligent model called the deep neural network (DNN) is developed for the prediction and control of EAC variation during project execution. The suggested DNN model is validated against the support vector regression (SVR) prediction model. The second phase of the current research is devoted to the implementation of hybrid models in which the genetic algorithm (and brute force) is integrated with the deep neural network, denoted GA-DNN (and BF-DNN). The aim of applying the selection phase as a stage prior to the predictive model is to identify the correlated attributes needed to build an accurate predictive model. Again, the hybrid intelligent models are validated against GA-SVR and BF-SVR. This step ensures that potential issues are identified so that effective measures can be implemented in a timely manner.

2. Research Methodology

2.1. Deep Neural Network

Several problems can be solved using neural networks due to their ability to compute any computable function. They are mainly useful for problems that can tolerate some level of error or for problems with abundant historical data that cannot easily be handled by hard and fast rules [32, 33]. The study of ANNs over the past few decades formed the basis for the deep learning concept. Neural networks (NNs) are constructed from several layers of interconnected nodes called neurons. In a typical feedforward neural network, there is at least an input layer, a hidden layer, and an output layer. The number of features or attributes to be fed into a neural network corresponds to the number of nodes in the input layer and is analogous to the covariates or independent variables incorporated in a linear regression model. The number of items to be predicted or classified is represented by the number of nodes in the output layer. The nonlinear transformation of the original input attributes is performed by the hidden layer nodes.

The construction of standard NNs requires neurons that produce real-valued activations, and the NNs can be made to behave as expected by adjusting the weights of the neurons. Depending on the problem to be solved, the training of NNs may involve several chains of computational stages. Since 1980, backpropagation, an efficient gradient descent algorithm, has played a significant role in NNs through its capability of training ANNs via a teacher-based supervised learning method [34]. Although backpropagation achieves a high training accuracy, its performance on test data is not usually satisfactory. One issue with backpropagation is that it is often trapped in local optima because it relies on local gradient information starting from a random initial point. Furthermore, overfitting occurs if the training dataset is not sufficiently large. Because of these issues, several effective machine learning algorithms such as SVM, ANFIS, and genetic programming, which attain the global optimum at lower power consumption, have been used.

The layer-wise greedy learning method proposed by Hinton et al. [31] marked the introduction of deep learning techniques. This learning method is based on the idea that a network should be pretrained via an unsupervised learning process before being subsequently trained layer by layer. The dimension of the data can be reduced by extracting features from the inputs to obtain a compact representation. The samples are then labeled by exporting the features to the next layer, and the labeled data are used to fine-tune the network. Two reasons are attributable to the popularity of deep learning methods: (i) the issue of data overfitting can be addressed by the development of big data analysis techniques and (ii) nonrandom initial values are assigned to the network during the pretraining procedure prior to the unsupervised learning. Therefore, a faster convergence rate and a better local minimum can be achieved after the training process.

Though several types of deep learning model exist, the focus of this discussion is on deep neural networks constructed from multiple hidden layers, often known as backpropagation neural networks. Deep learning is historically based on the use of backpropagation with gradient descent and a large number of nodes and hidden layers. This type of backpropagation neural network is indeed the first deep learning approach that showed a wide range of applications. A typical DNN comprises closely connected input, output, and several hidden layers. The input and hidden layers are directly connected and operate together to weigh the input values and produce a new set of real numbers that is transmitted to the output layer (Figure 2(a)). Finally, the output layer classifies or predicts the outcome of the process based on the transmitted values.

The main merit of the DNN is that the deep multilayer neural network is made up of several levels of nonlinearity, which makes it suitable for representing highly nonlinear and/or highly varying functions. DNNs can identify complicated patterns in data and can be applied to naturally complex problems. As in the single-layer neural network, the connection weights between the layers are updated to bring the output value close to the targeted output.

Figure 2(a) describes the general architecture of the DNN predictive model with more than one hidden layer (minimum two); the input variable layer is denoted as layer 0, and layer $L$ represents the output variable layer. The mathematical procedure can be described as follows:

$$v^{l} = f\left(z^{l}\right) = f\left(W^{l} v^{l-1} + b^{l}\right), \quad 0 < l < L,$$

where $z^{l}$ is the excitation vector and $v^{l}$ the activation vector of layer $l$, $f(\cdot)$ is the activation function, $W^{l}$ is the weight matrix, and $b^{l}$ is the bias. In this study, the activation function applied to the excitation vector is the sigmoid function, owing to its applicability to regression problems:

$$\sigma(z) = \frac{1}{1 + e^{-z}}.$$

Note that the output of $\sigma(z)$ is bounded within (0, 1), which encourages sparsity. However, it is the systematic activation function.
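The layer-by-layer computation described above can be sketched in a few lines of plain Python. This is only an illustration of a sigmoid feedforward pass, not the implementation used in the study (which relied on RapidMiner and NeuroSolutions); the network size and all weight values below are arbitrary assumptions.

```python
import math


def sigmoid(z):
    """Logistic sigmoid; the output lies strictly inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, layers):
    """Propagate input vector x through a list of (W, b) layers.

    Each W is a list of rows (one row per neuron in the layer) and b is
    the matching list of biases; every layer uses the sigmoid activation.
    """
    v = x
    for W, b in layers:
        # Excitation vector: weighted sum of the previous activations plus bias.
        z = [sum(w * a for w, a in zip(row, v)) + b_i
             for row, b_i in zip(W, b)]
        # Activation vector passed on to the next layer.
        v = [sigmoid(z_i) for z_i in z]
    return v


# Hypothetical 3-2-2-1 network (two hidden layers) with arbitrary weights.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.2]),
    ([[0.7, -0.3]], [0.05]),
]
y = forward([0.2, 0.4, 0.6], layers)
```

Because the output neuron is also a sigmoid in this sketch, the prediction stays inside (0, 1), which is one reason a target such as the EAC would need to be scaled to that range before training.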

2.2. Support Vector Regression Model

Vapnik [35] proposed the support vector machine (SVM) as an optimization method that tries to separate a given training set by establishing a hyperplane within the original input space while leaving enough distance between the hyperplane and the nearest instances on both sides. In the regression problem, the SVR model approximates the error between the input and output variables [36]. The errors are bounded by the margin of the SVR learning range, as shown in Figure 2(b), adapted from [37]. The problem investigated in this research is characterized by a nonlinear pattern, for which the SVR model maps the data into a high-dimensional space, also known as the feature space. The SVR model can be expressed as follows. Assume a training dataset $D$:

$$D = \left\{ (x_i, y_i) \right\}_{i=1}^{N},$$

where $x_i$ and $y_i$ are the input and output information. The nonlinear regression function to be solved is [38]

$$f(x) = w^{T} \varphi(x) + b,$$

where $w$ denotes the weight vector and $\varphi(x)$ represents the mapping into the high-dimensional feature space, whereas the last term of the function is the bias ($b$). The main goal of this regression function is to determine the output based on the training dataset $D$, with a certain deviation $\varepsilon$, defined through the loss function, from the actual observations of the whole training dataset. Hence, this can be described by the constrained convex optimization problem [39]:

$$\min_{w,\, b,\, \xi,\, \xi^{*}} \; \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right)$$

$$\text{subject to} \quad y_i - w^{T} \varphi(x_i) - b \le \varepsilon + \xi_i, \quad w^{T} \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \quad \xi_i,\, \xi_i^{*} \ge 0,$$

where $\xi_i$ and $\xi_i^{*}$ are the slack variables and $C$ is the positive regularization parameter. Note that $C$ penalizes the training error through the loss function for the selected error tolerance $\varepsilon$, whereas the regularization term shrinks the weight variables during the optimization process.

The optimization problem of the SVR model is usually solved using Lagrangian multipliers and sequential minimal optimization [40]. The radial basis kernel function is employed for the feature mapping of the training datasets. The internal parameters of the radial basis function are tuned using the grid-search approach.
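Three building blocks mentioned above (the radial basis kernel, the error-tolerance loss, and the parameter grid) can be sketched as follows. The sketch does not solve the SVR optimization problem itself; the parameter names `gamma`, `eps`, and `c`, the candidate values, and the helper functions are assumptions made for illustration, not settings reported in the article.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss that is zero inside the eps-tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def param_grid(c_values, gamma_values, eps_values):
    """Enumerate every (C, gamma, eps) triple a grid search would evaluate."""
    return [(c, g, e)
            for c in c_values
            for g in gamma_values
            for e in eps_values]


# Hypothetical candidate values; each triple would be scored on validation data.
grid = param_grid([1.0, 10.0, 100.0], [0.01, 0.1, 1.0], [0.01, 0.1])
```

A grid search then trains one model per triple and keeps the triple with the lowest validation error, so its cost grows with the product of the three list lengths (18 candidates here).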

2.3. Genetic Algorithm (GA) Optimization

GA is a well-known optimization technique that can be classified as an evolutionary method based on biological processes [41]. Among various input variable selection approaches, GA has proved to be a reliable and robust approach in multiple science and engineering applications [42–44]. The effectiveness of this optimization approach in handling nonlinearity and stochasticity is discussed comprehensively by Goldberg et al. [45]. The main processes involved in the implementation of the heuristic GA are the reproduction of chromosomes, crossover, and mutation. Note that these processes are applied to input variables that are discretized and coded into binary strings [46–48]. The GA processes integrated with the DNN predictive model are presented in Figure 3.

In Figure 3, the evolutionary algorithm (i.e., GA) is used as an optimization approach that mimics the concept of natural evolution. Evolutionary algorithms involve three basic concepts: first, the parents create offspring via crossover; second, the individuals within a generation have a chance of undergoing mutation (changes); and finally, the fitter individuals have a higher chance of survival (natural selection). Attribute subsets can be represented with bit vectors. For a dataset with 10 features, selecting all features corresponds to the vector (1 1 1 1 1 1 1 1 1 1), whereas selecting only the third attribute corresponds to the vector (0 0 1 0 0 0 0 0 0 0).

In the evolutionary algorithm, the first step is the creation of a population of individuals that evolves over time. This initial step is known as the initialization phase of the GA. In the starting population, the individuals are randomly generated and represented as bit vectors, as described earlier. These individuals can be created by tossing a coin for each available attribute; the outcome of the toss determines whether the attribute is included in the individual. There are no rules governing the size of the initial population; however, there must be at least 2 individuals in a GA population to proceed to the crossover phase. A common rule of thumb is to set the size of the initial population to between 5 and 30% of the total number of attributes. After the initial population has been created, several steps are repeated until the stopping criterion is reached.
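The steps described above (coin-toss initialization, crossover, mutation, and survival of the fitter individuals) can be sketched as a small bit-vector GA. This is a generic illustration rather than the configuration used in the study: the fitness function is a hypothetical stand-in for the validation skill of the predictive model, and the population size, mutation rate, tournament selection, and elitism are all assumptions.

```python
import random


def toy_fitness(mask):
    """Hypothetical fitness: reward features 0-2, penalize every extra one.

    In practice this would be replaced by the validation skill of the
    predictive model trained on the selected attributes.
    """
    useful = {0, 1, 2}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in useful)
    return hits - 0.5 * extras


def crossover(a, b, rng):
    """Single-point crossover of two bit vectors."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def select(population, fitness, rng, k=2):
    """Tournament selection: the fitter of k random individuals wins."""
    return max(rng.sample(population, k), key=fitness)


def ga_select(n_features, fitness, pop_size=10, generations=30,
              mutation_rate=0.1, seed=0):
    """Evolve bit vectors and return the fittest one found."""
    rng = random.Random(seed)
    # Initialization: one coin toss per attribute for every individual.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1 = select(population, fitness, rng)
            p2 = select(population, fitness, rng)
            children.append(mutate(crossover(p1, p2, rng), mutation_rate, rng))
        population = children
        best = max(population, key=fitness)
    return best


best_mask = ga_select(n_features=9, fitness=toy_fitness)
```

With elitism, the best fitness never decreases from one generation to the next, which makes the stopping criterion (here simply a fixed number of generations) easy to reason about.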

2.4. Brute Force Input Selection

Brute force (BF) is a systematic selection approach that solves problems by enumerating all the possible candidates [49] and checking whether each candidate satisfies the problem statement [50]. It has been selected in the current research as a benchmark input variable selection approach for the genetic algorithm. For example, a BF search for the divisors of a number n lists all the integers from 1 to n and checks whether each integer divides n without any remainder. Although a BF search is easy to implement and will always find a solution if one exists, its cost is directly related to the number of candidates considered, and this number tends to grow quickly with the size of the problem in many practical situations. BF is therefore applicable where the size of the problem is limited or where no specific heuristic method is available to reduce the number of candidate solutions to a manageable size. The BF approach can also be used as a yardstick for benchmarking the performance of other algorithms. It is considered one of the simplest search approaches. The selection of this search approach for integration with the developed predictive model was inspired by its potential in feature selection problems.
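Applied to input selection, the exhaustive enumeration described above amounts to scoring every attribute subset. The sketch below assumes subsets of at least two attributes and uses a hypothetical scoring function in place of the validation skill of a trained model; the function names and the five-attribute example are illustrative only.

```python
from itertools import combinations


def brute_force_select(feature_names, score, min_size=2):
    """Score every subset with at least `min_size` features; return the best."""
    best_subset, best_score = None, float("-inf")
    for size in range(min_size, len(feature_names) + 1):
        for subset in combinations(feature_names, size):
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return best_subset, best_score


def toy_score(subset):
    """Hypothetical score: rewards CV, SV, and CPI and penalizes other inputs."""
    useful = {"CV", "SV", "CPI"}
    return sum(1.0 if name in useful else -0.5 for name in subset)


def count_subsets(n, min_size=2):
    """Number of subsets of n attributes that contain at least min_size items."""
    return sum(1 for size in range(min_size, n + 1)
               for _ in combinations(range(n), size))


features = ["CV", "SV", "CPI", "SPI", "CCI"]
subset, value = brute_force_select(features, toy_score)
```

The growth in cost is easy to quantify: nine candidate attributes give 2^9 − 9 − 1 = 502 subsets of size two or more, each of which requires training and evaluating a model, and every additional attribute roughly doubles that number.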

2.5. Modeling Development and Prediction Skills Metrics

The current research is conducted on fifteen construction projects executed in Baghdad city, Iraq. Detailed information about these projects is provided in Table 1. The construction duration of the projects ranges from nine to fourteen months. All of the projects are residential.

Project name | Total area (m2) | Underground floors | Ground floors | Buildings | Start date | Finish date | Duration (days) | Contract amount ($) | Prediction periods


The collected information of the projects includes cost variance (CV), schedule variance (SV), cost performance index (CPI), schedule performance index (SPI), subcontractor billed index, owner billed index, climate effect index, change order index, and construction price fluctuation (CCI). The estimate at completion is the target variable to be estimated, and the nine factors are used as predictors to determine the EAC. The 15 projects comprise 174 periods; 75% of the total periods (131 periods) are used for the training phase and 25% (43 periods) for the testing phase of the predictive models. The historical data are normalized linearly to the range between 0 and 1 in order to supply the programming environment with scaled numerical values. The normalization is performed as follows:

$$EAC_{n} = \frac{EAC_{o} - EAC_{\min}}{EAC_{\max} - EAC_{\min}},$$

where $EAC_{n}$ is the normalized value of the calculated EAC, $EAC_{o}$ is the observed EAC, and $EAC_{\max}$ and $EAC_{\min}$ are the maximum and minimum values of the EAC. As a stage prior to the predictive model, the GA and BF optimization algorithms are applied to select the variables most highly correlated with the EAC, starting from two variables. Thereafter, the predictive models DNN and SVR are applied. Two software packages, RapidMiner and NeuroSolutions, are used to conduct the modeling. A simple structure of the proposed hybrid predictive model is illustrated in Figure 4.
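The linear scaling described above, together with the inverse mapping needed to express predictions in the original units, can be sketched as follows. The function names and sample values are hypothetical; the inverse step is an assumption about how scaled predictions would be reported, not something stated in the article.

```python
def min_max_normalize(values):
    """Scale a sequence linearly onto [0, 1]; also return the (min, max) used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def denormalize(scaled, bounds):
    """Map scaled values back to the original units."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]


# Illustrative values only.
scaled, bounds = min_max_normalize([2.0, 4.0, 6.0])
restored = denormalize(scaled, bounds)
```

To avoid leaking information from the test periods, the minimum and maximum would normally be taken from the training data only and then reused for the test data.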

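Following various engineering applications of prediction problems [51–53], the applied predictive models are examined using several numerical indicators that measure the absolute error (the closer to zero, the better) and the goodness of fit (the closer to one, the better). In this way, the optimal model for the best input combination can be better justified. The numerical indicators are root-mean-square error (RMSE), mean absolute error (MAE), mean relative error (MRE), Nash–Sutcliffe coefficient (NSE), scatter index (SI), and Willmott’s index (WI). Mathematically, they can be described as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^{2}},$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|,$$

$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\hat{y}_i - y_i}{y_i},$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}},$$

$$\mathrm{SI} = \frac{\mathrm{RMSE}}{\bar{y}},$$

$$\mathrm{WI} = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}{\sum_{i=1}^{n} \left( \left| \hat{y}_i - \bar{y} \right| + \left| y_i - \bar{y} \right| \right)^{2}},$$

where $y_i$ is the actual observation, $\hat{y}_i$ is the predicted value, $n$ is the number of observations, and $\bar{y}$ and $\bar{\hat{y}}$ are the mean values of the actual and predicted values.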
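The six indicators named above can be computed with a short helper. The sketch uses the commonly cited definitions of each metric; the sign convention chosen for MRE (predicted minus observed, divided by observed) is an assumption, since the article's own formulas did not survive extraction. The sample data are made up.

```python
import math


def skill_metrics(obs, pred):
    """Return RMSE, MAE, MRE, NSE, SI, and WI for paired observations."""
    n = len(obs)
    mean_obs = sum(obs) / n
    errors = [p - o for o, p in zip(obs, pred)]
    sse = sum(e * e for e in errors)              # sum of squared errors
    rmse = math.sqrt(sse / n)
    mae = sum(abs(e) for e in errors) / n
    # Signed relative error (assumed convention); observations must be nonzero.
    mre = sum(e / o for e, o in zip(errors, obs)) / n
    nse = 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)
    si = rmse / mean_obs
    wi_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                 for o, p in zip(obs, pred))
    wi = 1.0 - sse / wi_den
    return {"RMSE": rmse, "MAE": mae, "MRE": mre,
            "NSE": nse, "SI": si, "WI": wi}


# Made-up example: every prediction overshoots the observation by one unit.
metrics = skill_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A perfect prediction gives RMSE = MAE = MRE = SI = 0 and NSE = WI = 1, which matches the reading of the indicators given above.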

3. EAC Estimation Results and Analysis

As a preliminary stage of the prediction process, the cost database of the selected projects was compiled. The data represent the planned and the actual cost values for each month and the computed difference between them. The mathematical relationship between the nine attributes (and the selected input combinations) and the EAC is explored using the learning capability of AI models. The motivation for applying AI models in computing the EAC is to overcome the drawbacks of the classical index-based formulations, since AI models can mimic human intelligence in solving complex real-life problems.

The primary prediction modeling was conducted for the stand-alone proposed DNN and its comparable SVR predictive model. Table 2 tabulates the prediction skill indicators obtained using all nine declared variables. DNN outperformed the SVR model across the prediction skill indicators. In quantitative terms, DNN attained (RMSE‐MAE) and (NSE‐WI) values of (0.130–0.077) and (0.496–0.741), respectively, whereas SVR attained (0.136–0.085) and (0.451–0.693). The proposed DNN thus shows a notable improvement over the predominant data-intelligence SVR predictive model.



The motivation for coupling the input selection approach with the predictive model is to explore the predominant input combination correlated with the EAC magnitude. This is highly significant for recognizing the main variables that, during the project progress, affect the variance of the EAC results. The nature-inspired genetic algorithm was hybridized with the DNN to select the suitable input combination. The brute-force selection procedure is used as a benchmark for the GA comparison.

The input combinations and the prediction skill results of the hybrid model GA-DNN are given in Tables 3 and 4, respectively. According to the results achieved in Table 4, Model 2 provided the best input combination for predicting EAC, with the CV, SV, and CPI variables as inputs for the prediction matrix. The results showed minimum absolute error metrics (e.g., RMSE-MAE) of (0.056–0.444) and the best goodness of fit (e.g., NSE-WI) of (0.905–0.954). The hybrid BF-DNN model behaved differently (Tables 5 and 6); seven input variables, namely CV, SV, CPI, SPI, subcontractor billed index, change order index, and CCI, gave the optimal prediction skills with minimum RMSE ≈ 0.040 and WI ≈ 0.97. Note that BF-DNN surpassed the capability of the GA-DNN model, but it required more features to capture the internal mapping relationship between the predictors and the predicted variable.

Table 3: Input combinations of the GA-DNN model.

| Number of inputs | Model | Input variables | Output |
|---|---|---|---|
| 2 | Model 1 | CV, SV | EAC |
| 3 | Model 2 | CV, SV, CPI | EAC |
| 4 | Model 3 | CV, SV, CPI, subcontractor billed index | EAC |
| 5 | Model 4 | CV, SV, SPI, owner billed index, CCI | EAC |
| 6 | Model 5 | CV, SV, CPI, SPI, change order index, CCI | EAC |
| 7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, CCI | EAC |
| 8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC |


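Table 4: Prediction skill results of the GA-DNN model.

| Model | RMSE | MAE | MRE | NSE | SI | (unlabeled) | WI |
|---|---|---|---|---|---|---|---|
| Model 1 | 0.0843 | 0.0555 | −0.1082 | 0.7890 | 0.5382 | 0.0229 | 0.8710 |
| Model 2 | 0.0566 | 0.4446 | 0.1392 | 0.9050 | 0.3612 | 0.0137 | 0.9545 |
| Model 3 | 0.0910 | 0.0620 | 0.2943 | 0.7541 | 0.5809 | 0.0036 | 0.8722 |
| Model 4 | 0.0919 | 0.0629 | −0.1093 | 0.7493 | 0.5865 | 0.0334 | 0.9106 |
| Model 5 | 0.1010 | 0.0629 | 0.0671 | 0.7049 | 0.6268 | 0.0288 | 0.8561 |
| Model 6 | 0.0777 | 0.0510 | −0.1594 | 0.8253 | 0.4823 | 0.0175 | 0.9182 |
| Model 7 | 0.1002 | 0.0595 | 0.2029 | 0.7020 | 0.6394 | 0.0102 | 0.8404 |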

Table 5: Input combinations of the BF-DNN model.

| Number of inputs | Model | Input variables | Output |
|---|---|---|---|
| 2 | Model 1 | CV, SV | EAC |
| 3 | Model 2 | CV, subcontractor billed index, CCI | EAC |
| 5 | Model 4 | CV, SV, SPI, subcontractor billed index, climate effect index | EAC |
| 6 | Model 5 | CV, SV, CPI, SPI, owner billed index, climate effect index | EAC |
| 7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, change order index, CCI | EAC |
| 8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC |


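Table 6: Prediction skill results of the BF-DNN model.

| Model | RMSE | MAE | MRE | NSE | SI | (unlabeled) | WI |
|---|---|---|---|---|---|---|---|
| Model 1 | 0.0843 | 0.0555 | −0.1082 | 0.7890 | 0.5382 | 0.0229 | 0.8978 |
| Model 2 | 0.0712 | 0.5721 | 0.4154 | 0.8494 | 0.4546 | −0.0086 | 0.9229 |
| Model 3 | 0.0750 | 0.0537 | 0.2552 | 0.8333 | 0.4783 | 0.0020 | 0.9179 |
| Model 4 | 0.0802 | 0.0527 | 0.0250 | 0.8091 | 0.5118 | 0.0155 | 0.9038 |
| Model 5 | 0.0890 | 0.0594 | 0.4141 | 0.7651 | 0.5678 | 0.0034 | 0.8778 |
| Model 6 | 0.0399 | 0.0260 | −0.1873 | 0.9527 | 0.2555 | 0.0143 | 0.9791 |
| Model 7 | 0.1009 | 0.0607 | 0.5762 | 0.6981 | 0.6437 | 0.0168 | 0.8654 |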

The input combinations and prediction skill results of the GA-SVR and BF-SVR models are tabulated in Tables 7–10. In comparison with the GA-SVR model, GA-DNN demonstrated a remarkable enhancement: the error measures (RMSE-MAE) decreased by (26.3–20.1%), whereas NSE-WI increased by (8.8–4.2%). This shows the capability of the GA-DNN model to mimic the actual relationship between the project elements and the EAC.

Number of inputs | Models | Type of input variables | Output
2 | Model 1 | CV, CCI | EAC
3 | Model 2 | CV, subcontractor billed index, change order index | EAC
4 | Model 3 | CV, CPI, owner billed index, CCI | EAC
5 | Model 4 | CV, subcontractor billed index, owner billed index, change order index, CCI | EAC
6 | Model 5 | CV, CPI, SPI, subcontractor billed index, change order index, CCI | EAC
7 | Model 6 | CV, SV, CPI, SPI, change order index, CCI, climate effect index | EAC
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, climate effect index | EAC


Model 1 | 0.0923 | 0.0571 | 0.3860 | 0.7473 | 0.5889 | −0.0064 | 0.8710
Model 2 | 0.0847 | 0.5820 | 0.3972 | 0.7874 | 0.5401 | −0.0016 | 0.8925
Model 3 | 0.0767 | 0.0468 | 0.4033 | 0.8254 | 0.4895 | −0.0064 | 0.9106
Model 4 | 0.1014 | 0.0588 | 0.4669 | 0.6948 | 0.6471 | −0.0001 | 0.8359
Model 5 | 0.1056 | 0.0641 | 0.6324 | 0.6772 | 0.6555 | 0.0074 | 0.8241
Model 6 | 0.1096 | 0.0652 | 0.5087 | 0.6523 | 0.6803 | 0.0244 | 0.8359
Model 7 | 0.1121 | 0.0629 | 0.7884 | 0.6271 | 0.7154 | −0.0229 | 0.8187

Number of inputs | Models | Type of input variables | Output
2 | Model 1 | CV, SV | EAC
3 | Model 2 | CV, SV, CPI | EAC
5 | Model 4 | CV, SV, CPI, SPI, subcontractor billed index | EAC
6 | Model 5 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index | EAC
7 | Model 6 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index | EAC
8 | Model 7 | CV, SV, CPI, SPI, subcontractor billed index, owner billed index, change order index, CCI | EAC


Model 1 | 0.0853 | 0.0497 | 0.2128 | 0.7840 | 0.5445 | 0.0062 | 0.8871
Model 2 | 0.0778 | 0.5431 | 0.3569 | 0.8204 | 0.4964 | −0.0063 | 0.9147
Model 3 | 0.0658 | 0.0427 | 0.2612 | 0.8715 | 0.4199 | −0.0007 | 0.9346
Model 4 | 0.0966 | 0.0576 | 0.3964 | 0.7235 | 0.6160 | −0.0004 | 0.8590
Model 5 | 0.0970 | 0.0501 | 0.4953 | 0.7208 | 0.6190 | 0.0070 | 0.8533
Model 6 | 0.0961 | 0.0593 | 0.4255 | 0.7262 | 0.6148 | 0.0228 | 0.8802
Model 7 | 0.1148 | 0.0638 | 0.5869 | 0.6088 | 0.7327 | 0.0316 | 0.8236

A scatter plot is an effective way to visualize the correlation between the actual observations and the predicted values. Figure 5 presents the deviation of the predictions from the ideal 45° line. The plots show noticeably better agreement for the hybrid DNN models than for the hybrid SVR models.

Figure 6 presents three metrics graphically: standard deviation, correlation, and root-mean-square error. This two-dimensional graph is known as a Taylor diagram. In this diagram, the optimal model input combination can be identified clearly from its distance to the benchmark observed EAC data. Figure 6(a) shows that DNN is more accurate than SVR in terms of correlation and standard deviation. Figure 6(b) indicates that the GA-DNN model with the Model 2 input combination attained the prediction closest to the observed EAC. Model 6 provided the best input variables for the BF-DNN model (Figure 6(c)). Figures 6(d) and 6(e) show that the optimal input combination comprises four variables for both SVR hybrids: CV, CPI, owner billed index, and CCI for GA-SVR, and CV, SV, CPI, and SPI for BF-SVR. Finally, Figure 7 shows the testing phase for all the established predictive models. GA-DNN and BF-DNN matched the actual EAC noticeably well.
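The three quantities summarized by a Taylor diagram are linked by a geometric identity, which is why they fit on one two-dimensional plot: the squared centered RMS difference equals the sum of the two variances minus twice the product of the standard deviations and the correlation. The sketch below computes the statistics for a pair of series; the function name and the sample values are illustrative assumptions, and the series are assumed to be non-constant.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (std_obs, std_pred, correlation, centered_rmse).

    Population statistics (division by n) are used throughout, so the
    law-of-cosines identity of the Taylor diagram holds exactly.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]

    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * std_o * std_p)
    # Centered RMS difference: bias (difference of the means) is removed.
    crmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return std_o, std_p, corr, crmse


# Illustrative values only (not study data).
std_o, std_p, corr, crmse = taylor_statistics(
    [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]
)
# Identity behind the diagram:
# crmse**2 == std_o**2 + std_p**2 - 2 * std_o * std_p * corr
print(crmse ** 2, std_o ** 2 + std_p ** 2 - 2 * std_o * std_p * corr)
```

A model point lies closer to the observed reference point when its standard deviation is close to that of the observations and its correlation is high, which is the sense of "distance" used in the comparison above.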

To conclude this section, the main contribution of this work is the robustness of the hybrid GA-DNN model, which comprises two modeling phases: the GA, a nature-inspired evolutionary algorithm, selects the input features, and the DNN serves as the predictive model. The proposed methodology produced very positive results, which can help construction project managers monitor the cost at completion throughout the progress of the project.
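The two-phase idea can be sketched as a genetic algorithm that evolves a Boolean mask over the candidate inputs and scores each mask with the predictive model. The sketch below is a generic illustration, not the configuration used in this study: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the population size, the number of generations, and the mutation rate are all assumptions, and `toy_fitness` is a hypothetical stand-in for "train the DNN on the selected inputs and return its validation error".

```python
import random

# Candidate inputs as named in the tables of this article.
CANDIDATES = [
    "CV", "SV", "CPI", "SPI", "subcontractor billed index",
    "owner billed index", "change order index", "CCI", "climate effect index",
]


def genetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, mutation_rate=0.1, seed=0):
    """Minimize fitness(mask) over Boolean masks of length n_features."""
    rng = random.Random(seed)

    def repair(mask):
        # Keep at least one input selected.
        if not any(mask):
            mask[rng.randrange(n_features)] = True
        return mask

    population = [
        repair([rng.random() < 0.5 for _ in range(n_features)])
        for _ in range(pop_size)
    ]
    best = min(population, key=fitness)
    history = [fitness(best)]

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(repair(child))
        population = next_pop
        best = min(population, key=fitness)
        history.append(fitness(best))
    return best, history


# Hypothetical fitness: distance to an arbitrary target subset. In practice
# this would be the prediction error of a model trained on the masked inputs.
TARGET = [name in {"CV", "SV", "CPI"} for name in CANDIDATES]


def toy_fitness(mask):
    return sum(a != b for a, b in zip(mask, TARGET))


mask, history = genetic_feature_selection(toy_fitness, len(CANDIDATES), seed=1)
print([name for name, keep in zip(CANDIDATES, mask) if keep], history[-1])
```

Because of the elitism step, the best fitness recorded in `history` never gets worse from one generation to the next; whether the search reaches the global optimum depends on the settings and the random seed.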

4. Discussion

The methodology applied in this research was motivated by the need for a new, reliable approach to modeling EAC in construction projects. The proposed model distinguished itself by its capability to capture the actual relationship between the related variables and the target variable more robustly. This is essential for practical implementation in construction project management. Overall, with the evolutionary optimization algorithm hybridized as a selection procedure, the predictive model (i.e., the deep neural network) attained convincing results from the perspectives of scientific research and of exploring innovative modeling strategies.

Based on the various statistical indicators, the best results showed outstanding performance with respect to the minimal absolute error measures and the best goodness of fit (RMSE and correlation value (R2)), equal to 0.056 and 0.91, respectively, using only three input attributes (i.e., CV, SV, and CPI). These findings demonstrate the capacity of the proposed hybrid model to achieve reliable prediction accuracy with fewer input variables. The BF-DNN model attained better performance indicators (RMSE and R2 equal to 0.039 and 0.95); however, it required seven input variables (i.e., CV, SV, CPI, SPI, subcontractor billed index, change order index, and CCI) to attain this level of accuracy. From the engineering perspective, GA-DNN is the more robust data-intelligence model for actual cases: in construction projects, the full set of information is not always available, so a model that performs well with fewer inputs is more valuable in practice.
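For contrast with the GA, a brute-force input selection simply evaluates every possible combination of inputs. The sketch below enumerates all non-empty subsets and keeps the one with the lowest score. It assumes that the candidate pool is the nine variable names appearing in the tables of this article, and `toy_score` is a hypothetical stand-in for training the predictive model on a subset and returning its error.

```python
from itertools import combinations

# Assumed candidate pool: the nine input names appearing in the tables.
CANDIDATES = [
    "CV", "SV", "CPI", "SPI", "subcontractor billed index",
    "owner billed index", "change order index", "CCI", "climate effect index",
]


def all_subsets(variables):
    """Yield every non-empty combination of the candidate inputs."""
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


def brute_force_select(variables, score):
    """Return the subset with the lowest score (e.g., a validation RMSE)."""
    return min(all_subsets(variables), key=score)


# Hypothetical score: distance to an arbitrary target subset. In practice
# this would be the error of a model trained on the given subset.
def toy_score(subset, target=frozenset({"CV", "SV", "CPI"})):
    return len(set(subset) ^ target)


n_subsets = sum(1 for _ in all_subsets(CANDIDATES))
best = brute_force_select(CANDIDATES, toy_score)
print(n_subsets, best)
```

With nine candidates there are 2^9 − 1 = 511 non-empty subsets, each requiring one model to be trained and evaluated. The count doubles with every additional candidate, which is the practical motivation for a heuristic search such as the GA.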

5. Conclusion

In this research, a new hybrid data-intelligence predictive model called GA-DNN is explored to provide construction managers with a reliable and robust methodology for controlling project cost and accurately estimating the EAC. The methodology is implemented as an automated system in which project activities can be monitored and controlled and defective consequences can be avoided. The intelligence system comprises two phases: (i) the evolutionary phase, in which the genetic algorithm selects the influential input attributes for the prediction matrix, and (ii) the DNN prediction model, which uses the selected variables of each input combination to model the EAC. The BF input selection procedure is used as a benchmark for the GA optimizer, and SVR as a comparable prediction model. The results confirmed the superior predictability of the DNN over the SVR among the stand-alone models. In addition, hybridization with the nature-inspired input selection algorithm boosted the prediction outcomes. For future research, this methodology can be implemented on other construction projects as a real-time application, where its contribution can be recognized in the form of a practical solution. Its advantage would be monitoring the project life cycle more reliably and in accordance with the current status of the project.

Data Availability

The authors are very much thankful to the construction companies for providing the data of the current research.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


  1. D. Dvir, “Transferring projects to their final users: the effect of planning and preparations for commissioning on project success,” International Journal of Project Management, vol. 23, no. 4, pp. 257–265, 2005. View at: Publisher Site | Google Scholar
  2. Y.-R. Wang, C.-Y. Yu, and H.-H. Chan, “Predicting construction cost and schedule success using artificial neural networks ensemble and support vector machines classification models,” International Journal of Project Management, vol. 30, no. 4, pp. 470–478, 2012. View at: Publisher Site | Google Scholar
  3. M.-Y. Cheng, H.-C. Tsai, and E. Sudjono, “Conceptual cost estimates using evolutionary fuzzy hybrid neural network for projects in construction industry,” Expert Systems with Applications, vol. 37, no. 6, pp. 4224–4231, 2010. View at: Publisher Site | Google Scholar
  4. M.-Y. Cheng, H.-S. Peng, Y.-W. Wu, and T.-L. Chen, “Estimate at completion for construction projects using evolutionary support vector machine inference model,” Automation in Construction, vol. 19, no. 5, pp. 619–629, 2010. View at: Publisher Site | Google Scholar
  5. M.-Y. Cheng, N.-D. Hoang, A. F. V. Roy, and Y.-W. Wu, “A novel time-depended evolutionary fuzzy SVM inference model for estimating construction project at completion,” Engineering Applications of Artificial Intelligence, vol. 25, no. 4, pp. 744–752, 2012. View at: Publisher Site | Google Scholar
  6. D. S. Christensen, “Project advocacy and the estimate at completion problem,” Journal of Cost Analysis, vol. 13, no. 1, pp. 35–60, 1996. View at: Publisher Site | Google Scholar
  7. D. S. Christensen, “Using performance indices to evaluate the estimate at completion,” Journal of Cost Analysis, vol. 11, no. 1, pp. 17–23, 1994. View at: Publisher Site | Google Scholar
  8. F. T. Anbari, “Earned value project management method and extensions,” Project Management Journal, vol. 34, no. 4, pp. 12–23, 2003. View at: Publisher Site | Google Scholar
  9. A. De Marco and T. Narbaev, “Earned value–based performance monitoring of facility construction projects,” Journal of Facilities Management, vol. 11, no. 1, pp. 69–80, 2013. View at: Publisher Site | Google Scholar
  10. Q. W. Fleming and J. M. Koppelman, Earned Value Project Management, Project Management Institute, Newton Square, PA, USA, 2016.
  11. T. Narbaev and A. De Marco, “Combination of growth model and earned schedule to forecast project cost at completion,” Journal of Construction Engineering and Management, vol. 140, no. 1, 2014. View at: Google Scholar
  12. B.-C. Kim and K. F. Reinschmidt, “Combination of project cost forecasts in earned value management,” Journal of Construction Engineering and Management, vol. 137, no. 11, pp. 958–966, 2011. View at: Publisher Site | Google Scholar
  13. T. Narbaev and A. De Marco, “An earned schedule-based regression model to improve cost estimate at completion,” International Journal of Project Management, vol. 32, no. 6, pp. 1007–1018, 2014. View at: Publisher Site | Google Scholar
  14. A. A. A. Hammad, S. M. A. Ali, G. J. Sweis, and R. J. Sweis, “Statistical analysis on the cost and duration of public building projects,” Journal of Management in Engineering, vol. 26, no. 2, pp. 105–112, 2010. View at: Publisher Site | Google Scholar
  15. F. Khosrowshahi and A. P. Kaka, “Estimation of project total cost and duration for housing projects in the UK,” Building and Environment, vol. 31, no. 4, pp. 375–383, 1996. View at: Publisher Site | Google Scholar
  16. F. Caron, F. Ruggeri, and A. Merli, “A bayesian approach to improve estimate at completion in earned value management,” Project Management Journal, vol. 44, no. 1, pp. 3–16, 2013. View at: Publisher Site | Google Scholar
  17. S. Babar, M. J. Thaheem, and B. Ayub, “Estimated cost at completion: integrating risk into earned value management,” Journal of Construction Engineering and Management, vol. 143, no. 3, Article ID 04016104, 2017. View at: Publisher Site | Google Scholar
  18. L. L. Willems and M. Vanhoucke, “Classification of articles and journals on project control and earned value management,” International Journal of Project Management, vol. 33, no. 7, pp. 1610–1634, 2015. View at: Publisher Site | Google Scholar
  19. E. S. Brunette, R. C. Flemmer, and C. L. Flemmer, “A review of artificial intelligence,” in Proceedings of 4th International Conference on Autonomous Robots and Agents, pp. 385–392, Wellington, New Zealand, February 2009. View at: Google Scholar
  20. S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Pearson Education Limited, Kuala Lampur, Malaysia, 1995.
  21. S. H. Iranmanesh and M. Zarezadeh, “Application of artificial neural network to forecast actual cost of a project to improve earned value management system,” in Proceedings of World Congress on Science, Engineering and Technology, pp. 240–243, Venice, Italy, October 2008. View at: Google Scholar
  22. M.-Y. Cheng and Y.-W. Wu, “Evolutionary support vector machine inference system for construction management,” Automation in Construction, vol. 18, no. 5, pp. 597–604, 2009. View at: Publisher Site | Google Scholar
  23. M.-Y. Cheng and N.-D. Hoang, “Interval estimation of construction cost at completion using least squares support vector machine,” Journal of Civil Engineering and Management, vol. 20, no. 2, pp. 223–236, 2014. View at: Publisher Site | Google Scholar
  24. M.-Y. Cheng, H.-C. Tsai, and W.-S. Hsieh, “Web-based conceptual cost estimates for construction projects using Evolutionary Fuzzy Neural Inference Model,” Automation in Construction, vol. 18, no. 2, pp. 164–172, 2009. View at: Publisher Site | Google Scholar
  25. M. R. Feylizadeh, A. Hendalianpour, and M. Bagherpour, “A fuzzy neural network to estimate at completion costs of construction projects,” International Journal of Industrial Engineering Computations, vol. 3, no. 3, pp. 477–484, 2012. View at: Publisher Site | Google Scholar
  26. M. Wauters and M. Vanhoucke, “Support vector machine regression for project control forecasting,” Automation in Construction, vol. 47, pp. 92–106, 2014. View at: Publisher Site | Google Scholar
  27. H. Golizadeh, S. Banihashemi, A. N. Sadeghifam, and C. Preece, “Automated estimation of completion time for dam projects,” International Journal of Construction Management, vol. 17, no. 3, pp. 197–209, 2016. View at: Publisher Site | Google Scholar
  28. J. V. Abellan-Nebot and F. Romero Subirón, “A review of machining monitoring systems based on artificial intelligence process models,” International Journal of Advanced Manufacturing Technology, vol. 47, no. 1, pp. 237–257, 2009. View at: Publisher Site | Google Scholar
  29. J. V. Tu, “Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes,” Journal of Clinical Epidemiology, vol. 49, no. 11, pp. 1225–1231, 1996. View at: Publisher Site | Google Scholar
  30. W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing, vol. 234, pp. 11–26, 2017. View at: Publisher Site | Google Scholar
  31. G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006. View at: Publisher Site | Google Scholar
  32. R. Rojas, Neural Networks: Aa Systematic Introduction, Springer-Verlag, New York, NY, USA, 1996.
  33. R. Tadeusiewicz, “Neural networks: a comprehensive foundation,” Control Engineering Practice, vol. 3, no. 5, pp. 746-747, 1995. View at: Publisher Site | Google Scholar
  34. J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554–2558, 1982. View at: Publisher Site | Google Scholar
  35. V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, NY, USA, 1995.
  36. P. Lingras and C. J. Butz, “Rough support vector regression,” European Journal of Operational Research, vol. 206, no. 2, pp. 445–455, 2010. View at: Publisher Site | Google Scholar
  37. H. Drucker, C. J. Burges, L. Kaufman, A. J. Smola, and V. Vapnik, “Support vector regression machines,” in Advances in Neural Information Processing Systems, pp. 155–161, MIT Press, Cambridge, MA, USA, 1997. View at: Google Scholar
  38. Z. M. Yaseen, O. Jaafar, R. C. Deo et al., “Stream-flow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq,” Journal of Hydrology, vol. 542, pp. 603–614, 2016. View at: Publisher Site | Google Scholar
  39. Z. M. Yaseen, M. T. Tran, S. Kim, T. Bakhshpoori, and R. C. Deo, “Shear strength prediction of steel fiber reinforced concrete beam using hybrid intelligence models: a new approach,” Engineering Structures, vol. 177, pp. 244–255, 2018. View at: Publisher Site | Google Scholar
  40. J. Platt, “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods,” Advances in large margin classifiers, vol. 10, no. 3, pp. 61–74, 1999. View at: Publisher Site | Google Scholar
  41. J. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI, USA, 1975.
  42. C.-L. Huang and C.-J. Wang, “A GA-based feature selection and parameters optimization for support vector machines,” Expert Systems with Applications, vol. 31, no. 2, pp. 231–240, 2006. View at: Publisher Site | Google Scholar
  43. A. G. Karegowda, A. S. Manjunath, and M. A. Jayaram, “Application of genetic algorithm optimized neural network connection weights for medical diagnosis of pima Indians diabetes,” International Journal on Soft Computing, vol. 2, no. 2, pp. 15–23, 2011. View at: Publisher Site | Google Scholar
  44. G. Stein, B. Chen, A. S. Wu, and K. A. Hua, “Decision tree classifier for network intrusion detection with GA-based feature selection,” in Proceedings of the 43rd Annual Southeast Regional Conference–Volume 2, pp. 136–141, Kennesaw, GA, USA, 2005. View at: Google Scholar
  45. D. E. Goldberg, B. Korb, and K. Deb, “Messy genetic algorithms: motivation, analysis, and first results,” Engineering, vol. 3, pp. 493–530, 1989. View at: Google Scholar
  46. E. Bruderer and J. V. Singh, “Organizational evolution, learning, and selection: a genetic-algorithm-based model,” Academy of Management Journal, vol. 39, no. 5, pp. 1322–1349, 1996. View at: Publisher Site | Google Scholar
  47. D. Golmohammadi, R. C. Creese, H. Valian, and J. Kolassa, “Supplier selection based on a neural network model using genetic algorithm,” IEEE Transactions on Neural Networks, vol. 20, no. 9, pp. 1504–1519, 2009. View at: Publisher Site | Google Scholar
  48. J. Yang and V. Honavar, “Feature subset selection using a genetic algorithm,” IEEE Intelligent Systems, vol. 13, no. 2, pp. 44–49, 1998. View at: Publisher Site | Google Scholar
  49. A. R. Osborne, “Simple, brute-force computation of theta functions and beyond,” in International Geophysics, A. R. Osborne, Ed., pp. 489–499, Academic Press, Cambridge, MA, USA, 2010. View at: Google Scholar
  50. M. J. H. Heule and O. Kullmann, “The science of brute force,” Communications of the ACM, vol. 60, no. 8, pp. 70–79, 2017. View at: Publisher Site | Google Scholar
  51. A. A. Al-Musawi, A. A. Alwanas, S. Q. Salih, Z. H. Ali, M. T. Tran, and Z. M. Yaseen, “Shear strength of SFRCB without stirrups simulation: implementation of hybrid artificial intelligence model,” Engineering with Computers, pp. 1–11, 2018, In press. View at: Google Scholar
  52. M. Hou, T. Zhang, F. Weng, M. Ali, N. Al-Ansari, and Z. Yaseen, “Global solar radiation prediction using hybrid online sequential extreme learning machine model,” Energies, vol. 11, no. 12, p. 3415, 2018. View at: Publisher Site | Google Scholar
  53. K. Khosravi, L. Mao, O. Kisi, Z. M. Yaseen, and S. Shahid, “Quantifying hourly suspended sediment load using data mining models: case study of a glacierized Andean catchment in Chile,” Journal of Hydrology, vol. 567, pp. 165–179, 2018. View at: Publisher Site | Google Scholar

Copyright © 2019 Karrar Raoof Kareem Kamoona and Cenk Budayan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
