Pavement management systems (PMSs) play a primary role in determining pavement condition monitoring and maintenance strategies. Many researchers have focused on pavement condition evaluation tools, starting with data collection, followed by processing and analysis, and ultimately reaching practical conclusions regarding pavement condition. The analysis step is an essential part of the pavement condition evaluation process, as it focuses on the tools used to obtain the most accurate results. Prediction models are important tools in pavement condition evaluation because they determine the current and future performance of the road pavement. Pavement condition prediction therefore has a significant role in identifying the appropriate maintenance techniques and treatment processes. Moreover, pavement performance indices are commonly used as key indicators to describe the condition of pavement surfaces and the level of pavement degradation. This paper systematically summarizes the existing performance prediction models developed to predict asphalt pavement degradation using the pavement condition index (PCI) and the international roughness index (IRI). These performance indices are commonly used in pavement monitoring to evaluate the health status of pavement. The paper also identifies and summarizes the most influential parameters in road pavement condition prediction models and presents the strengths and weaknesses of each prediction model. The findings show that most previous studies preferred machine learning approaches and artificial neural networks for forecasting and estimating road pavement conditions because of their ability to deal with massive data, their higher accuracy, and their usefulness in solving time-series problems.

1. Introduction

Road infrastructure facilities play an essential role in the advancement of cities and communities. Road infrastructure is considered the most significant factor for the welfare and comfort of people and roadway users. It is also one of the sectors that determine the socioeconomic development of countries [1]. Pavement management systems (PMSs) play an efficient role in monitoring, planning, evaluating, managing, and implementing recommendations that keep road pavements in an acceptable condition [1, 2].

However, in terms of monitoring, high-precision equipment must be used to detect changes and any existing distress or damage on road surfaces. Pavement monitoring plays an essential role in assessing pavement conditions, and the monitoring results and in-field collected data are used in formulating prediction models. After monitoring the pavement condition, pavement assessment strategies should be applied, and field surveys should be conducted for data collection to evaluate pavement infrastructure. Then, a decision is made based on the relevant pavement condition information, and pavement maintenance procedures are carried out based on the condition of the paving surfaces and the expected pavement performance [1–3].

Moreover, PMSs are concerned with the condition of road pavements after maintenance and rehabilitation have been implemented. Therefore, modeling pavement performance is essential to transport agencies and governments at all management levels [4]. Lytton [4] mentioned that the future monitoring of pavement condition is called “prediction” or “forecasting,” which measures the future performance of pavement condition over time. After the prediction stage, recommendations are made regarding the appropriate maintenance and treatment to be implemented [5]. However, the challenge is to build the best prediction model by combining all road pavement and environment parameters and variables. Building any performance model requires a predefined dataset that is divided into three groups: (1) training data, (2) testing data, and (3) validation data [6]. Moreover, the prediction of pavement performance has been studied extensively by many researchers over the last decade, combined with great efforts from transport agencies to find the most accurate evaluation and forecast of pavement performance [7–9].
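The three-way division of the dataset described above can be sketched as follows. This is a minimal illustration, not a procedure taken from the cited studies: the 70/15/15 proportions, the function name split_dataset, and the record fields are all assumptions made for the example.

```python
import random


def split_dataset(records, train=0.70, test=0.15, seed=42):
    """Shuffle records and split them into training, testing, and validation subsets.

    The validation subset receives whatever remains after the first two slices.
    The default proportions are illustrative assumptions only.
    """
    if not (0 < train < 1 and 0 < test < 1 and train + test < 1):
        raise ValueError("train and test must be positive fractions summing to less than 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# Hypothetical pavement sections; the field names and values are invented.
sections = [
    {"section_id": i, "age_years": i % 20, "iri": 1.0 + 0.1 * (i % 20)}
    for i in range(100)
]
train_set, test_set, validation_set = split_dataset(sections)
print(len(train_set), len(test_set), len(validation_set))
```

For time-series condition data, a chronological split (older surveys for training, later surveys for testing) may be preferable to random shuffling, since shuffling can leak future observations of a section into the training subset.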

Pavement performance prediction has been developed using field evaluation and experimental tests. The American Association of State Highway and Transportation Officials (AASHTO) focused on predicting pavement distresses and the future failure of the pavement. In addition to the experiments, prediction models are also required to assess pavement degradation patterns and possible future maintenance plans [4]. Many studies have used different types of prediction models, such as mechanistic models, empirical models, mechanistic-empirical models, machine learning models, and neural network models, to predict the future condition of road pavement [10]. Machine learning models are the most popular prediction models used to estimate the current and future conditions of road pavement degradation. Developing an accurate road pavement performance prediction model depends on two main factors: access to accurate databases and correct identification of the variables that influence road pavement degradation.

Developing accurate prediction models mainly depends on the precision and consistency of the monitoring and evaluation data. Many monitoring techniques have been used to evaluate the pavement condition and collect information on the pavement health status. These techniques include vibration-based methods, vision-based methods, walk-and-look surveys, and scanning techniques. In addition, international standard performance indices are used to inspect and evaluate the pavement condition under different scenarios, such as the pavement condition index, international roughness index, present serviceability rating, and structural index. Each performance index captures pavement health status data in a different way. The pavement condition index (PCI) and international roughness index (IRI) contribute significantly to pavement monitoring and condition estimation. Therefore, many researchers use the outcomes of these indices in building and developing their pavement performance prediction models. PCI is a subjective monitoring index that depends mainly on visual inspection and the inspector’s experience. The PCI rating system uses a scale from 0 to 100, where 0 represents the worst pavement surface and 100 represents excellent pavement condition. IRI indicates the level of surface smoothness and can be measured using a profilometer. A variant of the IRI that depends on vibration-based methods is called IRI (Proxy).

This paper is structured as follows: the subsequent section provides a general layout of the paper. It is followed by a general overview of data sources, while Section 4 reviews the existing pavement performance prediction models, depending on PCI and IRI. Section 5 presents the discussion and limitations of the existing pavement prediction models, followed by the future direction of the pavement performance prediction.

2. Data Source

This review paper presents different prediction models based on the database used. Some past research papers have focused on using the results of pavement performance indices as a database to build their prediction models, while others focused on using field measurements or other intelligent techniques, such as image processing and vibration data, to collect appropriate databases. Several studies have divided the database of pavement degradation models into two categories: an observation database and an online database. In the observation category, the data collection is conducted using visual inspections by equipped modes of transport, e.g., automobile or bicycle [11–16], and intelligent monitoring techniques [17, 18]. In addition, many pavement prediction models used the long-term pavement performance (LTPP) or short-term pavement performance (STPP) dataset to predict future pavement performance [19–21].

Pavement performance indices are commonly used as key indicators to describe the condition of pavement surfaces and the level of pavement degradation. Thus, government and transport agencies use these performance indicators to define the required maintenance and rehabilitation measures. Moreover, since the last decade, many efforts have been made to develop pavement performance assessment procedures to be more accurate, cost-effective, and straightforward [2]. Many studies have been conducted to investigate the status and the level of pavement degradation using pavement performance indexes, including PCI, IRI, pavement serviceability index (PSI), and pavement condition rating (PCR). Moreover, pavement condition indices can be considered time-dependent variables [10]. To achieve the optimal goals of the high-precision rating system, IRI and PCI indices are used as main variables in developing pavement performance prediction models. Figure 1 shows the field data source of the pavement condition.

3. Applied Methodology

Researchers agreed that the optimum method to accurately monitor pavement condition performance is by forecasting with prediction models [22]. These models are able to describe the minimum and maximum changes in road pavement performance. Different types of performance prediction models are used to improve accuracy and precision. The subsequent sections describe the time-series models that are used to predict pavement condition performance, which can be divided into two main categories: probabilistic reasoning and shallow machine learning models. Figure 2 shows the selected time-series modeling.

3.1. Probabilistic Reasoning

Probabilistic reasoning is a form of logical exploration and representation that uses probabilities to handle a series of uncertain events and situations. In performance prediction, probabilistic reasoning algorithms have been widely used to predict pavement condition performance over short- and long-term horizons.

3.1.1. Mechanistic Empirical Models

These models can be used for the prediction of existing and future pavement degradation and maintenance activities. Also, mechanistic-empirical models are able to provide more reliable predictions with the future pavement condition. In addition, these models focus on the properties and qualities of pavement material. PCI and IRI are used to provide valuable information on pavement health status in this model type [23].

In 1989, George et al. [23] developed a mechanistic-empirical model based on PCI values to predict future pavement performance. They used PCI values collected over two years on approximately 2000 miles of road in Mississippi, USA, for three categories of asphalt surface state (flexible pavement with no overlay, flexible pavement with overlay, and composite pavement). Furthermore, the model focused on assessing the different types of degradation and distress of pavement surfaces and how they affect maintenance plans. Their study used six main parameters to develop a road pavement performance model, including traffic volume, pavement age, pavement structural number, material quality, and surface deflection measurements. A performance indicator was developed in their research to describe the interaction between the pavement roughness (PR) data and distress rating (DR) as follows:

The condition performance prediction models built on time-series pavement condition data for pavement with no overlay, pavement with overlay, and composite pavement are given in equations (2) to (4), respectively [23], where a, b, and c are constants, and ESAL, SN, and T are the equivalent single axle loads, the structural number, and the thickness of the last overlay, respectively. George et al. [23] built these models on the most significant and effective variables, such as pavement age.

Sidess et al. [24] proposed a model that combines the empirical-mechanistic and the regressive empirical approaches to predict IRI. Field data from 165 road segments totaling 287.5 km were used in this model. The IRI degradation model was calculated as follows: where K and γ are regression coefficients, which are functions of the subgrade modulus, the structural number at the time of pavement construction, and the asphalt thickness; W0 is the cumulative number of 130 kN equivalent single axle load (ESAL) applications leading to the increase of the IRI from 1.10 m/km to IRIini, where IRIini is the IRI of the section at time tini; and Wt is the cumulative number of 130 kN ESAL applications applied until time t. The predicted and measured data agreed closely (R² > 0.9). Sidess et al. [24] also developed a similar model for predicting PCI degradation. The characteristics of mechanistic models are summarized in Table 1.

3.1.2. Empirical Models

Empirical models mainly depend on the results of experiments or field observations. Empirical models are known as models that relate the causes and effects. These models are more accurate at network-level analysis. In terms of evaluating future pavement performance, many studies have been conducted to predict the performance of pavement conditions. The online data and field observation are suitable for developing empirical models. The specifications of empirical models are summarized in Table 1. Figure 3 shows the categories of empirical models.

(1) Statistical Models. Statistical models use data from experiments or field measurements to make statements about future changes in the experiment outcome. These statistical methods provide real-time solutions to complex problems. Attoh-Okine [25] and Marcelino et al. [26] studied statistical prediction models to measure and evaluate the future performance of pavement conditions. The accuracy of statistical models was also compared with that of artificial neural networks (ANN) [25]. The results show that statistical models are capable of generalizing and providing accurate road pavement performance models. The R² value obtained was approximately 40%, and the standard error of IRI was 1.88.

(2) Recursive Partitioning. Recursive partitioning is a nonparametric statistical modeling method. It is used to identify groups of field measurements with similar parameter values. The method uses a decision tree to split the data on variables such as pavement age, traffic condition, weather condition, and pavement structure details. Inkoom et al. [27, 28] developed a model to predict the cracking condition of pavement surfaces using recursive partitioning and ANN. Approximately 5,814 pavement segments in Florida, USA, were selected together with eleven of their features. These features included the age of pavement, average daily traffic, truck factor, asphalt thickness, maximum posted speed, the functional class of pavement, and the pavement condition ratings of the previous five years. Of the dataset, 70% was used as the training dataset, and the rest was used as the test dataset. Two models were investigated, one with all eleven variables and another without the time series of pavement condition ratings. The first model showed more accurate pavement performance prediction results than the second model. For the regression tree, R² was found to be 89%, and for ANN, R² was found to be 41.4%.
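To make the idea of recursive partitioning concrete, the following is a minimal regression-tree sketch in plain Python. It is not the model of Inkoom et al. [27, 28]: the two features (pavement age and traffic), the rating values, the depth limit, and all function names are assumptions chosen for illustration. Each split is chosen to minimize the summed squared error of the two resulting groups, and each leaf predicts the mean rating of its group.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return (feature_index, threshold) that most reduces the summed SSE, or None."""
    best, best_score = None, sse(targets)
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (j, threshold), score
    return best


def grow_tree(rows, targets, depth=0, max_depth=2, min_rows=4):
    """Recursively partition the data; a leaf is the mean target of its rows."""
    can_split = depth < max_depth and len(rows) >= min_rows
    split = best_split(rows, targets) if can_split else None
    if split is None:
        return sum(targets) / len(targets)
    j, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[j] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": grow_tree([r for r, _ in left], [t for _, t in left],
                          depth + 1, max_depth, min_rows),
        "right": grow_tree([r for r, _ in right], [t for _, t in right],
                           depth + 1, max_depth, min_rows),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Invented data: (pavement age in years, daily traffic in thousands) -> condition rating.
rows = [(1, 5), (2, 20), (3, 8), (4, 25), (10, 6), (11, 22), (12, 7), (14, 30)]
ratings = [9.5, 9.0, 9.2, 8.8, 6.5, 5.5, 6.2, 5.0]
tree = grow_tree(rows, ratings)
print(tree["feature"], tree["threshold"])
```

On this invented data the first partition is made on pavement age, which mirrors the intuition that age separates the ratings more sharply than traffic does; real datasets would need pruning or a minimum leaf size to avoid overfitting.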

(3) Informative Feature for Prediction. Piryonesi and EL-Diraby [29] developed a computational system for predicting PCI using informative features. The results showed that the accuracy of the pavement performance prediction model decreases as more prediction classes and distress levels are used. Using the 7-class scale was less accurate than using the original 5-class scale in predicting PCI. The most accurate prediction model was the three-year model, with an accuracy of approximately 78 ± 4% on the 5-class scale and approximately 76 ± 4% on the 7-class scale. The study concluded that pavement age and climate conditions were the most effective variables in building this prediction model [30].

3.1.3. Fuzzy Logic

Several studies focused on developing an innovative IRI prediction model based on fuzzy-based time-series and particle swarm optimization (PSO) techniques [31, 32]. Li et al. [33] revealed the importance of using PSO techniques to enhance the results of performance models and future IRI prediction models. They used the LTPP database to extract the IRI values for some urban roads in Canada. The methodology of this study focused on dividing the IRI values into granular spaces. More specifically, they divided the IRI data into factors and subfactors. The factors were the average IRI values from the long-term pavement performance database, while the subfactors were the values measured in the left and right wheel paths [34]. Moreover, a second-order fuzzy trend model was used to predict the performance of the IRI factors and subfactors data. Consequently, the fuzzy trend model was defined as follows: where U = {U1, U2, …, Un} is defined as a universe of discourse, A is a fuzzy set, fA is a membership function of the fuzzy set A, and fA (Un) is a membership degree of Un.
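The expression that these symbols refer to is not reproduced in this copy of the text. Assuming the study follows the conventional fuzzy-set notation used in the fuzzy time-series literature (an assumption, not a confirmed reading of Li et al. [33]), a fuzzy set A on the universe of discourse U would be written as:

```latex
A = \frac{f_A(U_1)}{U_1} + \frac{f_A(U_2)}{U_2} + \cdots + \frac{f_A(U_n)}{U_n},
\qquad f_A : U \rightarrow [0, 1]
```

In this notation the fraction bars and plus signs are separators rather than arithmetic operations: each term pairs an element U_i with its membership degree f_A(U_i).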

The innovative IRI prediction method was compared with other modeling approaches, such as polynomial fitting, the autoregressive integrated moving average (ARIMA), and the backpropagation neural network (BPNN). The results showed that the IRI prediction model achieved highly accurate forecasting compared with the other modeling approaches. The prediction error of each model was quantified using root mean square error (RMSE) and relative error (RE) to evaluate its ability to provide accurate performance prediction. The results revealed that the proposed IRI prediction model had the smallest error values among the compared approaches.

3.1.4. Probabilistic Modeling

Liu and Gharaibeh [35] focused on using probabilistic models to describe the change in pavement status and performance with time. They mainly used significant variables, such as average annual daily traffic, pavement layer thickness, layer air voids, layer liquid limits, layer asphalt content, and annual rainfall, to build an accurate prediction model. Abed et al. [9] developed a probabilistic prediction model of flexible pavement, where the thickness and stiffness of the pavement layers were used as variables. The mean values, standard deviations, and probability distribution functions of these two parameters were also considered. For this study, a road section in Nottingham, U.K., was selected as the case study. This road had a four-layer pavement, including a surface course, base course, sub-base course, and a compacted subgrade. The layer thickness and stiffness variations and their probability distributions were collected from previous research. The random thickness values of each layer were generated by the Monte Carlo method. Pavement temperature and traffic volume were calculated for future predictions. In their study, KENLAYER software linked with MATLAB software was used to calculate the bottom-up fatigue cracking, top-down fatigue cracking, and pavement deformation as pavement responses at predefined critical locations of the pavement. The model was simulated for thirty years. It was found that the pavement layer thickness and stiffness played a significant role in pavement performance. The mean values of the predicted performance indicators increased over time, and their standard deviations increased as well.
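The Monte Carlo step mentioned above, in which random layer thicknesses are drawn from assumed probability distributions, can be sketched as follows. The normal distribution, the means and standard deviations, the layer names, and the function name are all hypothetical; the distributions actually used by Abed et al. [9] are not given in this text.

```python
import random
import statistics


def sample_layer_thicknesses(layers, n_samples=10_000, seed=1):
    """Draw random thickness values (mm) for each layer from a normal distribution.

    `layers` maps a layer name to (mean_mm, standard_deviation_mm).
    Non-positive draws are rejected so every sampled thickness is physical.
    """
    rng = random.Random(seed)
    samples = {}
    for name, (mean_mm, sd_mm) in layers.items():
        draws = []
        while len(draws) < n_samples:
            value = rng.gauss(mean_mm, sd_mm)
            if value > 0:
                draws.append(value)
        samples[name] = draws
    return samples


# Hypothetical layer statistics, not values from the cited study.
layers = {
    "surface_course": (40.0, 4.0),
    "base_course": (60.0, 6.0),
    "sub_base_course": (150.0, 15.0),
}
samples = sample_layer_thicknesses(layers)
for name, draws in samples.items():
    print(f"{name}: mean={statistics.mean(draws):.1f} mm, "
          f"sd={statistics.stdev(draws):.1f} mm")
```

In a full analysis, each sampled set of thicknesses would be passed to a pavement response model, and the spread of the resulting responses would give the probability distribution of the predicted performance indicators.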

(1) Markovian Models. A probabilistic method using the Markov chain framework has been developed to characterize pavement conditions and predict pavement performance [36]. The prediction model was formed based on IRI data from the National Department of Transportation in Costa Rica. The IRI data were collected in 2004, 2006, 2008, and 2010, and the model was then developed to predict the IRI-based pavement performance in 2020. The modeling process was divided into three stages, namely data collection and analysis, model development, and model validation. A transition probability matrix (TPM) based on the Markov chain process (MCP) was used to correlate pavement degradation with explanatory variables.

Moreover, the TPMs were used to predict pavement performance in specific subsequent years. Significant variables were used, including the thickness of pavement layers, the structural number, and the number of wheel passes per unit strength of the pavement. The results of the Markov performance prediction model revealed that the probabilistic model is reliable for predicting pavement performance over a specific period. Moreover, the TPM results showed more accurate pavement performance prediction once optimization techniques had been applied to minimize the percentage of errors. One of the main advantages of this probabilistic model is the ease of modeling pavement degradation and its ability to help decision-makers with planning and management (see Table 1).
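How a transition probability matrix propagates a condition distribution through time can be illustrated with a short sketch. The four condition states and every probability below are invented for the example and are not taken from the Costa Rican study [36]; only the mechanism (multiplying the state vector by the TPM once per period) is the point.

```python
def step(state, tpm):
    """Advance the condition-state distribution by one period: s' = s * P."""
    n = len(tpm)
    return [sum(state[i] * tpm[i][j] for i in range(n)) for j in range(n)]


def forecast(state, tpm, periods):
    """Apply the transition matrix repeatedly to project the distribution forward."""
    for _ in range(periods):
        state = step(state, tpm)
    return state


# Hypothetical TPM for the states (good, fair, poor, failed).
# Each row sums to 1; the zero lower triangle means no improvement without maintenance.
tpm = [
    [0.80, 0.15, 0.05, 0.00],
    [0.00, 0.75, 0.20, 0.05],
    [0.00, 0.00, 0.70, 0.30],
    [0.00, 0.00, 0.00, 1.00],
]
initial = [1.0, 0.0, 0.0, 0.0]  # the whole network starts in good condition
after_five = forecast(initial, tpm, periods=5)
print([round(p, 3) for p in after_five])
```

In practice the entries of the TPM are estimated from successive condition surveys, for example by counting how many sections move between IRI bands from one survey year to the next.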

3.1.5. Other Deterministic Models

Chen and Zhang [37] evaluated IRI-based pavement degradation prediction using four different deterministic models, including the Al Omari–Darter model, the Dubai model, and the Transportation Research Board's National Cooperative Highway Research Program (NCHRP) model. The models were compared to identify the most accurate deterministic model for predicting pavement performance based on two main variables, namely pavement age and thickness of pavement layers. Chen and Zhang [37] obtained the IRI data and the other model-related data from the LTPP database for New Mexico. IRI-based pavement degradation prediction models are divided into two main classes: prediction of IRI (the Al Omari–Darter and Dubai models) and prediction of other performance measures based on IRI (the NCHRP model). Among the selected deterministic models, Al-Suleiman and Shiyab [34] developed a new prediction model (the Dubai model) based on pavement age. The IRI data were collected in the left and right wheel paths during vehicle movement, and equation (7) presents the Dubai model. The goodness of fit, R², was 0.801, which is relatively high and provides a good indication of the pavement condition.

Furthermore, Al Omari and Darter [38] developed a prediction model based on IRI values and the rut depth (RD). Later on, they tried to improve the model using the standard deviation (SD) of RD for higher accuracy.

The quality of the fit was measured by the R² value, which was 0.93 for the IRI-RD model and 0.94 for the IRI-SD model. The models are shown in equations (8) and (9), respectively.

Moreover, the NCHRP model was developed using an exponential regression model to predict the pavement serviceability index (PSI). The goodness of fit, R², of this model was relatively low at 0.73. In 2008, the New Mexico Department of Transport [39] reported this model as follows:

Chen and Zhang [37] found that the Dubai and NCHRP models were accurate for pavement performance prediction regarding pavement age and thickness. The Al Omari–Darter model provided less capability to predict the performance of pavement conditions in terms of pavement thickness (Table 1).

Table 1 below presents the previous studies that used probabilistic reasoning to predict pavement performance. The table also shows the technique used with each pavement index to perform the prediction. The data sources are provided together with the standard metrics used to measure the validation and accuracy of each developed model. In addition, the strengths and weaknesses of each model are presented and discussed.

3.2. Shallow Machine Learning

Shallow learning is a branch of machine learning algorithms that depends on expert-based descriptions. The datasets in shallow machine learning need to be preprepared and predefined with all required features. Regarding prediction performance, shallow learning algorithms have been widely used to predict and estimate the condition and performance of pavement health status.

3.2.1. Artificial Neural Network (ANN)

The artificial neural network (ANN) is a complex model developed to simulate the thinking ways of the human brain and its ability to solve problems by offering various alternative solutions. The use of ANN in pavement performance prediction became widely known because of the accurate prediction results. The existing ANN models that are used in the literature for pavement performance prediction are presented in Figure 4.

Alsugair and Al-Qudrah [42] and Serin [31] measured the future performance of pavement conditions using ANN. This technique involves artificial intelligence, and many researchers favor its use in predicting pavement conditions. Besides, some researchers utilized a regression model and ANN to predict the probability of degradation on asphalt pavement and roughness distress level [43, 44]. The characteristics of ANN models are summarized in Table 2.

Moreover, some studies focused on predicting PCI based on different optimization techniques [45]. For instance, Shahnazari et al. [46] used ANN and genetic programming (GP). In their study, PCI data were collected by field observation using an automated car on different urban roads in Iran. The data collection phase focused on measuring PCI values for the most common pavement distresses, including cracking (alligator, longitudinal, edge, and transverse), potholes, patching, and bleeding. The type of pavement distress was used as an effective variable for the pavement performance prediction model. They used 80% of the dataset as a training set and 20% as a test set.

In addition, Shahnazari et al. [46] assessed the accuracy of the previously mentioned models by determining R², RMSE, and mean absolute error (MAE). The results showed that the value of R² for the ANN and GP models was 0.99. Therefore, the results indicate that these models are reliable for predicting pavement performance using PCI values.

Jalal et al. [47] also developed an ANN model to predict PCI based on observed and experimental measurements at different locations in the Texas University campus. They also applied an optimal ANN model to enhance the accuracy of the conventional ANN model. Three types of pavement, including asphalt concrete (AC), hot mixed asphalt (HMA), and Portland cement concrete (PCC), were evaluated during the period 2014 to 2016. Furthermore, two other main variables were used to build the model, namely the annual average daily traffic (AADT) and traffic loads. The study showed that the proposed ANN model was accurate for the selected types of pavement. After applying the optimal ANN, the results revealed that there were improvements and enhancements in model outcomes and limitations in errors.

The international roughness index prediction model is a time-series performance prediction model. Therefore, many effective variables, such as pavement thickness, cracking level, traffic volume, resilient deflection modulus, structural number, and climate condition, must be carefully collected. In 2000, a report from the Highway Development and Management Series [48] stated that these variables are essential for constructing a degradation model, as shown in equation (11) for one year, where ΔIRI represents the total rating change in the IRI values, IRIs is the rating change because of structural deformation, IRIc is the rating change because of cracking, IRIr is the rating change because of rutting, IRI is the rating change because of potholing, and IRIe is the rating change because of the environment during a year.
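Equation (11) itself is not reproduced in this copy of the text. Going only by the term definitions above, one plausible additive reading is sketched below. This is an assumption rather than the equation as published: the subscript p for the potholing term is a placeholder introduced here because the original subscript was lost, and any calibration factors that the source report [48] applies to the individual terms are omitted.

```latex
\Delta \mathrm{IRI} = \mathrm{IRI}_s + \mathrm{IRI}_c + \mathrm{IRI}_r + \mathrm{IRI}_p + \mathrm{IRI}_e
```

Read this way, the annual roughness change is the sum of the contributions from structural deformation, cracking, rutting, potholing, and the environment.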

Ziari et al. [49] developed ANN and group method of data handling (GMDH) models to predict asphalt pavement performance in the short term, for one year and two years. The full pavement life cycle was also predicted as the long-term performance prediction. Ziari et al. [49] used IRI values from PMS datasets in the U.S., and they selected nine effective variables to indicate the performance of pavement conditions. The nine variables were selected carefully to provide clear indications of the condition of pavement surfaces and the influencing factors.

R² and RMSE were used to assess the quality of the models and their ability to provide accurate and validated results. Furthermore, three more error indicators were examined, namely mean absolute percentage error (MAPE), correction factor (CF), and variance accounted for (VAF), to identify errors in the proposed models and to provide optimum correlations for the ANN and group method of data handling (GMDH) models. The benefit of using the GMDH is that it can predict a complex system without the need for assumptions.
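The error indicators named above can be computed with a few lines of plain Python. The sketch below uses the usual textbook definitions of R², RMSE, MAE, MAPE, and VAF, which are assumed here to match the definitions used in the reviewed studies; the correction factor (CF) is omitted because its definition is not given in this text. The observed and predicted IRI values are invented.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(statistics.mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def mae(observed, predicted):
    """Mean absolute error."""
    return statistics.mean(abs(o - p) for o, p in zip(observed, predicted))


def mape(observed, predicted):
    """Mean absolute percentage error (%); observed values must be non-zero."""
    return 100.0 * statistics.mean(abs((o - p) / o) for o, p in zip(observed, predicted))


def vaf(observed, predicted):
    """Variance accounted for (%): 100 * (1 - var(residuals) / var(observed))."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return 100.0 * (1.0 - statistics.pvariance(residuals) / statistics.pvariance(observed))


# Invented IRI values (m/km) for five road sections.
observed = [1.2, 1.5, 1.9, 2.4, 3.0]
predicted = [1.1, 1.6, 2.0, 2.3, 3.1]
for name, metric in [("R2", r_squared), ("RMSE", rmse), ("MAE", mae),
                     ("MAPE", mape), ("VAF", vaf)]:
    print(f"{name}: {metric(observed, predicted):.4f}")
```

RMSE and MAE are expressed in the units of the index being predicted, whereas R², MAPE, and VAF are unitless, which makes the latter easier to compare across PCI-based and IRI-based models.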

ANN models have specific features compared to other models. For instance, they have a high ability to work with and predict complex systems. Moreover, these models are more efficient and provide high-accuracy pavement condition predictions. Ziari et al. [49] mentioned that the ANN models always provide minimum error values compared with other models. Moreover, the modeling results clearly illustrate the effect of each variable and parameter on the performance of pavement conditions, which is one of the many advantages of using ANN models. Consequently, the results showed that the ANN model is accurate in predicting short- and long-term performance, while the group method of data handling (GMDH) model could not be used with the IRI values and the nine significant variables to predict pavement condition performance in either the short term or the long-term pavement life cycle.

(1) Back Propagation Neural Network Model. Lin et al. [50] examined the accuracy of the backpropagation neural network model in pavement performance prediction. The model showed a variation in correlation values, and the best value was approximately 0.94. Moreover, the results indicated that potholes, rutting, and patching presented the highest correlation coefficients, implying a clear correlation with IRI values. However, other types of pavement distress, such as cracking, alligator cracking, and bleeding, showed a low correlation with IRI values, which means the model was less able to relate these distress types to IRI values. As Lin et al. [50] stated, this type of model is easy to implement and can simplify pavement inspection for transport agencies. It also provides clear information on the relationship between the type of distress and IRI values during long-term performance prediction. However, the model was deficient in relating some types of pavement distress to the measured IRI values. The characteristics of the backpropagation neural network model are summarized in Table 2.
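The mechanics of backpropagation can be shown with a very small network written in plain Python. This is a generic sketch, not the architecture of Lin et al. [50]: the single hidden layer of four tanh units, the learning rate, the two scaled distress inputs (rutting and potholes), and the synthetic linear relation used to generate IRI targets are all assumptions made for the example.

```python
import math
import random


def train_bpnn(samples, hidden=4, epochs=500, lr=0.02, seed=0):
    """Train a one-hidden-layer network (tanh hidden units, linear output)
    by per-sample gradient descent. Returns (predict, mse_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(w1[k], x)) + b1[k])
             for k in range(hidden)]
        return h, sum(w2[k] * h[k] for k in range(hidden)) + b2

    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, y in samples:
            h, out = forward(x)
            err = out - y  # derivative of 0.5 * err**2 with respect to the output
            squared_error += err * err
            for k in range(hidden):
                # Propagate the error back through the tanh unit before updating w2[k].
                grad_h = err * w2[k] * (1.0 - h[k] ** 2)
                w2[k] -= lr * err * h[k]
                b1[k] -= lr * grad_h
                for i in range(n_in):
                    w1[k][i] -= lr * grad_h * x[i]
            b2 -= lr * err
        history.append(squared_error / len(samples))

    return (lambda x: forward(x)[1]), history


# Synthetic data: (scaled rutting, scaled potholes) -> IRI, using an invented relation.
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
samples = [((rut, pot), 1.0 + 2.0 * rut + 1.5 * pot) for rut in levels for pot in levels]
predict_iri, history = train_bpnn(samples)
print(f"first-epoch MSE={history[0]:.3f}, last-epoch MSE={history[-1]:.3f}")
```

The recorded per-epoch mean squared error gives a simple check that training is converging; a production model would also hold out test data and scale the inputs from measured distress quantities.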

(2) Radial Basis Function Neural Network (RBF). Karballaeezadeh et al. [51] proposed a model to predict PCI from falling weight deflectometer (FWD) deflection data. FWD deflection data were collected from 236 selected pavement segments of the Tehran-Qom freeway in Iran. PCI was calculated for each segment by inspection. Data analysis was done using five different methods: multilayer perceptron neural network optimized by Levenberg–Marquardt (MLP-LM), multilayer perceptron neural network optimized by the scaled conjugate gradient (MLP-SCG), radial basis function neural network optimized by genetic algorithm (RBF-GA), radial basis function neural network optimized by the imperialist competitive algorithm (RBF-ICA), and a combination of these four using committee machine intelligent systems (CMIS). Results from these five methods were compared using four statistical parameters: average percent relative error (APRE), average absolute percent relative error (AAPRE), RMSE, and standard error (SE). The approach showed promising results for all five models; however, it depends solely on the accuracy of the FWD data (Table 2).

3.2.2. Machine Learning Algorithms

Machine learning (ML) is an area of artificial intelligence. ML techniques are widely used in pavement performance prediction because of their high-precision results. The ML techniques reviewed here can be divided into two main categories: support vector machine and hybrid machine learning (Figure 5).

Piryonesi and El-Diraby [29] developed a cost-effective prediction model using a machine learning algorithm and the LTPP database. The model focused on estimating pavement condition and surface distress using PCI over 2, 3, 5, and 6 years. Different attributes were used to simplify the proposed model so that transport agencies and governments could use it with minimum operating costs. Moreover, the researchers tried changing the PCI rating scale from a 5-class scale to a 7-class scale to enhance the PCI evaluation procedure. PCI values were measured using the 7-class scale, and a prediction model was used to evaluate the PCI measurements on both class scales over the selected years. The study also used influential variables, such as the age of pavement, type of pavement, AADT, average daily maximum and minimum temperature, climate condition, and functional class of the pavement (Table 2).

(1) Support Vector Machine (SVM). Wang et al. [52] used a combination of grey relation analysis (GRA) and support vector machine regression (SVR) to predict asphalt pavement performance. GRA was conducted to select the major factors affecting pavement performance, and SVR was then applied to those factors to predict pavement performance. Data were collected from Guangyun Expressway. Road temperature, humidity, and wind speed data were collected from an installed weather station, and temperature and humidity sensors were installed inside the pavement layers and on the pavement surface. During the GRA analysis, twelve factors were found to be more influential than others. These were equivalent single axle loads, maintenance funds, pavement structure strength ratio, the mean value of soil moisture, the highest temperature in the middle surface, the highest temperature in the road surface, annual cumulative total radiation, annual average rainfall, the lowest temperature in the middle surface, the highest temperature in the upper surface, the lowest temperature of the upper surface, and the highest temperature in the lower surface. Finally, GRA-SVR, grey method (GM), genetic algorithm-backpropagation (GA-BP), and pavement performance index (PPI) models were applied to predict the rutting depth index (RDI). Compared with the other three, GRA-SVR was found to be highly accurate and time-independent, although the modeling process was complex.
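The factor-screening step can be illustrated with a minimal Python sketch of grey relational analysis. It is a generic textbook formulation, not the exact procedure of Wang et al. [52]: normalising each series by its mean, the distinguishing coefficient rho = 0.5, the function name, and all data values are assumptions made for this example.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Rank candidate factors by their grey relational grade to a reference series.

    reference: list of values of the performance indicator (e.g. yearly RDI).
    factors:   dict mapping a factor name to a list of the same length.
    rho:       distinguishing coefficient (0.5 is a common default, assumed here).
    """
    def normalise(seq):
        # Mean normalisation removes the units of each series (assumed choice).
        mean = sum(seq) / len(seq)
        return [v / mean for v in seq]

    ref = normalise(reference)
    # Absolute deviation of each factor from the reference at every time step.
    deltas = {
        name: [abs(r - f) for r, f in zip(ref, normalise(seq))]
        for name, seq in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor follows the reference exactly.
        return {name: 1.0 for name in deltas}
    # Grey relational coefficient per time step, averaged into one grade per factor.
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
        for name, ds in deltas.items()
    }


# Hypothetical yearly series (illustrative only).
rdi = [90.0, 85.0, 80.0, 75.0]
candidate_factors = {
    "proportional_factor": [45.0, 42.5, 40.0, 37.5],   # same shape as RDI
    "rainfall": [800.0, 1200.0, 700.0, 1000.0],        # unrelated shape
}
grades = grey_relational_grades(rdi, candidate_factors)
ranking = sorted(grades, key=grades.get, reverse=True)
```

Factors with the highest grades would then be passed to the regression stage, which is the role GRA plays ahead of SVR in a GRA-SVR pipeline.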

On the other hand, Ziari et al. [53] developed a support vector machine model to predict pavement performance based on IRI measurements and the LTPP database. A mathematical approach was used at the same time to prepare the existing data, validate the model, and investigate the interaction between the performance model and the model variables. Their dataset combined IRI data with nine variables, and five kernel types of the support vector machine algorithm were tested: the polynomial kernel with degrees 1 to 3, the Pearson VII universal kernel, and the radial basis function. The nine variables used to build the prediction model were pavement layer thickness, equivalent single axle load, annual average daily traffic, average daily traffic, annual average daily truck traffic, environment changes, annual average temperature, pavement age, and annual average precipitation (Table 2).

Three nonlinear kernel equations, equations (12)–(14), were applied to describe the prediction model. These equations represent the polynomial, radial basis function, and Pearson VII universal kernels, respectively [54].
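For readers who want to experiment with these kernels, the following Python sketch implements their commonly used textbook forms. The parameter names (degree, coef0, gamma, sigma, omega), the default values, and the function names are assumptions for illustration, and the parameterisation used in [54] may differ.

```python
import math


def _squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * _squared_distance(x, y))


def puk_kernel(x, y, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel (PUK).

    sigma controls the width and omega the tailing of the peak.
    """
    distance = math.sqrt(_squared_distance(x, y))
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


# Two hypothetical, already scaled feature vectors (illustrative only).
a = [0.0, 0.0]
b = [0.5, 0.0]
similarities = {
    "polynomial": polynomial_kernel(a, b),
    "rbf": rbf_kernel(a, b),
    "puk": puk_kernel(a, b),
}
```

The RBF and PUK kernels both return 1 for identical inputs and decay toward 0 as the inputs move apart. The extra omega parameter lets PUK vary the shape of that decay, which is one possible reason for it fitting a given dataset better.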

The RMSE and the correlation coefficient were examined to find the most accurate performance model. The authors found that the Pearson VII universal kernel was the best-performing kernel of the support vector machine model. Additionally, its predictions matched the IRI measurements and the health status of the pavement.

(2) Hybrid Machine Learning. Hoang [55] introduced a model to identify patches on asphalt pavement. Images were analyzed to extract numerical features, and a hybrid machine learning model then used these features to label each image as nonpatch or patch. A set of one thousand images was collected during a pavement survey in Danang City in Vietnam. The photos were resized to 100 × 100 pixels and labeled as nonpatches or patches by human inspectors for training. A total of thirty-four features were extracted from each image. The least-squares support vector machine (LSSVM) was used for training, with differential flower pollination (DFP) as a fine tuner. The LSSVM model had an accuracy of 95.3% in predicting the road pavement condition. Compared with previous models, it can work on color images, although one limitation was that feature selection algorithms were not established during the model construction phase (Table 2).

3.2.3. Regression Models

Regression modeling measures the relationship between input (independent) variables and output (dependent) variables. It is widely used as a time-series forecasting approach to predict pavement performance conditions. There are various regression models, including random forest regression (RFR), the ordinary least squares (OLS) regression method, the simplified regression model, and the stepwise regression technique (Figure 6). Madanat and Ibrahim [56] and Roberts and Attoh-Okine [57] used the traditional regression technique to evaluate and predict road pavement degradation.

(1) Random Forest Regression (RFR). Gong et al. [58] developed a random forest regression (RFR) model to predict the IRI of asphalt pavement using training and testing sets. Pavement distresses, traffic, environmental data, and structural data were the variables used to estimate IRI. These variables and the IRI measurements were obtained from the LTPP database. The results revealed that the RFR model provided high accuracy and excellent indications of pavement performance for both the training and testing sets. The coefficients of determination (R²) of the proposed model were 0.99 and 0.97 for the training and testing sets, respectively, indicating high efficiency in implementing the RFR model. Furthermore, the results indicated that pavement age and several pavement distresses, such as alligator cracking, transverse cracking, and rutting, significantly influenced IRI measurements. In contrast, others showed a limited impact on IRI measurements, including edge fracture, longitudinal cracking, and drilling.

Along the same line of research, Marcelino et al. [26] applied a random forest algorithm to develop a pavement condition performance model. The study used long-term pavement performance data based on IRI values for five and ten years, as well as other indicator factors, such as traffic volume data, environmental data, and structural data. The data were collected for different urban roads in the U.S. (Indiana and Texas) and Canada (Saskatchewan). The main variables used in this model are annual average precipitation (AAP), annual average temperature (AAT), annual average freeze index (AAFI), pavement thickness, structural number (SN), and cumulative annual average daily truck traffic. As mentioned by Marcelino et al. and other authors, the random forest algorithm can reduce the variance of the prediction model by combining different models, which yields more accurate results. Three categories of measures (quantitative, qualitative, and a composite of qualitative and quantitative) were used to evaluate the prediction models [27, 59–61]. In addition to the mean squared error (MSE), the standard deviation of the mean squared error (SDMSE) and K-fold cross-validation were applied to estimate the error of the prediction models (Table 2).
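The cross-validation procedure can be sketched in a few lines of Python. This is a simplified illustration, not the setup of [26]: a mean-value baseline stands in for the random forest so that the example stays self-contained, SDMSE is interpreted as the standard deviation of the per-fold MSE values, and all names and data are hypothetical.

```python
import statistics


def k_fold_mse(xs, ys, k, fit):
    """Return (mean MSE, standard deviation of MSE) over k folds.

    fit(train_x, train_y) must return a function that predicts y from x.
    """
    n = len(xs)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")
    fold_errors = []
    for fold in range(k):
        # Every k-th sample, starting at index `fold`, is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        squared = [(ys[i] - predict(xs[i])) ** 2 for i in sorted(test_idx)]
        fold_errors.append(sum(squared) / len(squared))
    return statistics.mean(fold_errors), statistics.stdev(fold_errors)


def fit_mean_baseline(train_x, train_y):
    """Stand-in model (assumption): always predicts the mean training target."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


# Hypothetical pavement ages (years) and IRI values (m/km), illustrative only.
ages = [1, 2, 3, 4, 5, 6]
iri = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
mean_mse, sd_mse = k_fold_mse(ages, iri, k=3, fit=fit_mean_baseline)
```

Any model that exposes the same fit-then-predict interface, including a random forest from an external library, could be passed as `fit` in place of the baseline.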

(2) Ordinary Least Squares (OLS) Methods. The development of the prediction model focused mainly on the accuracy of data sources. Arhin and Noel [62] obtained IRI and PCI data from the Department of Transportation for selected roads in the District of Columbia. The ordinary least squares (OLS) regression method was then used to predict PCI from the IRI datasets. Additionally, Arhin and Noel [62] applied a 5% significance level to identify the significance of the proposed regression models. Subsequently, an ANOVA test was used to measure the significance of each regression model for each road classification and pavement type. The goodness of fit (R²) and the F-test were also evaluated for each regression model to estimate the validity of the proposed models. The best general regression model related PCI to IRI through two constants, A and K, and an associated error term ɛ.
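To make the OLS workflow concrete, the following Python sketch fits a straight line that predicts PCI from IRI and reports R² and the F-statistic. The straight-line form, the function name, and the data values are assumptions for illustration; they are not the fitted model or the data of Arhin and Noel [62].

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give the goodness of fit.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    # F-statistic for one predictor: explained variance over residual variance.
    f_stat = float("inf") if ss_res == 0 else (ss_tot - ss_res) / (ss_res / (n - 2))
    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "f_stat": f_stat}


# Hypothetical IRI (m/km) and PCI observations (illustrative only).
iri_values = [1.0, 2.0, 3.0, 4.0]
pci_values = [90.0, 82.0, 68.0, 60.0]
model = ols_fit(iri_values, pci_values)
predicted_pci_at_2_5 = model["intercept"] + model["slope"] * 2.5
```

The F-statistic would be compared with the critical value at the chosen significance level (5% in the study) to decide whether the regression is significant.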

The results showed that this performance prediction method was accurate and could be used with different monitoring techniques. Based on functional classification, the results revealed that freeways provided a smoother ride than arterial roads, which in turn were smoother than collectors and local roads. Based on pavement type, composite pavement was smoother than asphalt pavement, which was smoother than concrete pavement. The R² values of the functional classification models ranged between 0.56 and 0.74, which was relatively low, while the R² values of the pavement type models ranged between 0.72 and 0.74.

(3) Simplified Regression Model. Elhadidy et al. [22] focused on creating a simplified regression model to describe the relationship between the pavement condition index (PCI) and the international roughness index (IRI). The proposed model used data from the LTPP database for the United States and Canada, with variables such as traffic levels, climate conditions, pavement age, type of pavement, and pavement distresses. Moreover, Elhadidy et al. [22] evaluated the accuracy of the proposed model using the coefficient of determination (R²) and RMSE. The study results showed that the proposed model was accurate, with an R² of 0.99, and that it could be used to predict IRI based on PCI for any pavement segment (Table 2).

(4) Stepwise Regression Technique. Ahmed et al. [63] focused on developing a performance prediction model based on PCI using the stepwise regression technique. The study used field observations to measure the PCI values of urban roads in Baghdad. Different types of pavement distress were inspected to determine the PCI values, including fatigue cracking, rutting, potholes, bleeding, depression, slippage cracking, longitudinal cracking, and patching (Table 2). Ahmed et al. [63] developed a prediction model for PCI that depends on the type of pavement distress. Equation (16) illustrates the proposed model.

The aforementioned model includes only three significant types of distress, as these have the strongest impact on PCI values. The coefficient of determination (R²) for the proposed model was 0.80, indicating that the model is adequate and acceptable to transport agencies and researchers. However, the model has limitations, as it can only work within specific ranges of the variables. Moreover, validation based on the mean and standard deviation of the observed and predicted PCI values was applied to achieve a high-precision prediction model. A t-test at the 95% confidence level was also used to determine the accuracy of the proposed model. Equation (17) presents the relationship between expected and observed PCI.

The goodness of fit of the proposed model revealed no significant difference between the observed and predicted PCI values.
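A check of this kind can be sketched in Python as a paired t-test on observed and predicted PCI values. Whether Ahmed et al. [63] used a paired or an unpaired test is not stated, so the paired form is an assumption, as are the function name and the data. The critical value 2.776 is the standard two-sided 95% value of the t-distribution for 4 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(observed, predicted):
    """t-statistic for the mean difference between paired observations."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [o - p for o, p in zip(observed, predicted)]
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences are constant, so t is undefined")
    return statistics.mean(diffs) / (sd_diff / math.sqrt(len(diffs)))


# Hypothetical PCI values for five road sections (illustrative only).
observed_pci = [70.0, 65.0, 80.0, 55.0, 60.0]
predicted_pci = [68.0, 67.0, 79.0, 56.0, 59.0]
t_value = paired_t_statistic(observed_pci, predicted_pci)

# Two-sided 95% critical value for n - 1 = 4 degrees of freedom.
T_CRITICAL_DF4 = 2.776
no_significant_difference = abs(t_value) < T_CRITICAL_DF4
```

When the absolute t-statistic stays below the critical value, the test does not reject the hypothesis that observed and predicted PCI values have the same mean.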

Table 2 lists the previous studies that applied shallow machine learning to predict pavement performance. The table also shows the technique used with each pavement index to perform the prediction. The data sources are provided together with the standard metrics used to measure the validity and accuracy of each developed model. In addition, the strengths and weaknesses of each model are presented and discussed.

4. Discussion and Research Gaps

The limitations of pavement prediction models deserve attention because these performance models are central to estimating the health status of pavement. Once researchers decide to develop a performance model, they must take care to select an appropriate prediction model. They must also have clear information and adequate knowledge of the model inputs, outputs, parameters, and influential variables to be used. Forming the model functions and equations is a significant step in developing any prediction model. Therefore, boundary conditions must govern the equations, depending on the purpose of the performance model.

In pavement prediction, performance models must capture the growth of pavement degradation, distresses, and damages, or of pavement performance indexes such as the roughness index, serviceability index, and pavement condition rating [4]. Finding appropriate variables to use in prediction models is one of the main constraints that many researchers have faced. The accuracy of any model depends largely on the chosen variables, and the selection of key variables depends mainly on the type of prediction model and the forecast condition. Moreover, pavement condition variables are divided into the following main categories based on the conditions affecting pavement surfaces: traffic level, environmental condition, material quality, and paving structure.

Furthermore, many studies claimed that not all variables are available in the LTPP database or easy to obtain, which is a significant problem for model developers [30, 41, 64, 66]. By the end of the preparation stage, all elements, such as physical and mathematical boundaries, dependent and independent variables, and raw and prepared data, must be available for use in the developed prediction models. This review paper presents the most significant time-series prediction models, including mechanistic-empirical, empirical, regression, support vector machine, fuzzy logic, and others.

Figure 7 presents the accuracy of each prediction model based on its coefficient of determination (R²). MEM: mechanistic-empirical models, EM: empirical models, FL: fuzzy logic, PM: probabilistic modeling, DM: deterministic models, ANNs: artificial neural network models, MLAs: machine learning algorithms, and RM: regression models. According to Figure 7, ANNs, MLAs, and RM show high accuracy in predicting, classifying, and detecting pavement damage conditions. DM, especially the NCHRP models, offers low accuracy in predicting the actual health status of pavement.

Mechanistic and mechanistic-empirical performance models can estimate and extrapolate pavement performance data. These models need more data for calibration; at the same time, they have simplification advantages compared with other prediction models, such as empirical models [10]. For empirical models, George [23] claimed that selecting appropriate prediction equations is essential for developing the best performance model. To develop empirical models, researchers should have a large dataset on pavement conditions and should identify the mathematical and physical boundaries of the equations in order to build a clear and accurate model and avoid significant errors [26].

In regression modeling, any equation can be used because the technique is simple to apply. The efficiency of the assumed functions or equations can be evaluated using statistical measures that determine how well the proposed model fits the observed data [58]. Many researchers claimed that the coefficient of determination is a fundamental tool for assessing the adequacy of a prediction model. Still, the goodness of fit can also be evaluated using other statistical measures based on the percentage error [58, 60, 62].

On the other hand, there are some limitations to using neural network models to predict pavement performance, including the availability of data, such as traffic level, climate condition, and other pavement condition indices, in the long-term pavement performance (LTPP) database [42]. Furthermore, one of the most significant limitations is the need for numerical verification and statistical tests to verify the accuracy of artificial neural network and neuro-fuzzy models [49]. Moreover, model developers usually face difficulties in obtaining pavement condition data, especially data related to PCI, and in finding suitable flexible pavements with full service life details. Ziari et al. [49] discussed another issue in collecting the data, which concerns pavement condition indices. Pavement condition index values deteriorate toward the worst rating class with pavement age, as the pavement surface is exposed to climate, traffic, and other external factors that cause damage and distress. However, after maintenance and rehabilitation, the pavement condition indices return to the best rating. Therefore, finding a pavement in the LTPP database with no rehabilitation or treatment during its service life is not easy [67, 68]. Besides, there is no necessity to start evaluating pavement condition from the beginning of pavement service life, since pavement age can be counted from the last overlay.

5. Future Directions

Pavement condition performance is a promising area of pavement management research and is the future of monitoring and maintenance systems. Several studies have been conducted to measure pavement condition performance and the future health states of road surfaces. The door remains open for further investigation and innovation to find competitive ways of predicting future pavement performance and to provide sufficient information for short- and long-term forecasting of pavement condition. The main future direction should be selecting strong variables to develop accurate forecasting models efficiently.

Another future direction is improving the collection of pavement condition data. It is essential to apply advanced methodologies and use highly accurate equipment to monitor pavement condition, and to apply the most suitable prediction model to diagnose pavement performance. More specifically, using dynamic and static pavement monitoring systems to gain accurate assessment results may enhance the outcomes of prediction models. For example, accelerometer sensors with high frequency and sensitivity can provide consistent vibration data for developing prediction models. Also, advanced pavement monitoring instruments, such as probe vehicles with laser scanners and high-quality line scan cameras, can be used to identify pavement damage, and the resulting data can be used to validate the prediction models.

6. Conclusions

The increase in the number of road users results in minor and significant damage and in the degradation of pavement surfaces, which mainly affects the safety and comfort of road users. Many researchers have therefore conducted studies to assess the current health condition of pavement and future changes in the pavement structure. Moreover, PMS has an essential role in developing different performance prediction models to estimate the condition of the pavement surface and the severity of pavement degradation after a specified time. This review paper sets out the IRI- and PCI-based pavement degradation prediction models. Various prediction models have been developed to estimate pavement conditions and the level of pavement damage at various flexible pavement sites around the world.

Most previous studies have focused on developing performance prediction models based on datasets from the LTPP database and pavement condition index values. In contrast, other studies have been based on field observations or data collection. Many performance models have been developed using ML algorithms and ANN modeling. Most researchers agreed that both prediction methods, ML and ANN, give accurate estimates of pavement condition and are beneficial in dealing with variables such as traffic conditions, pavement age, and weather conditions. In addition, regression models showed high accuracy in detecting and classifying pavement damage. At the same time, some deterministic models showed a deficiency in predicting the actual condition of pavement surfaces.

In summary, each model has specific features, strengths, and weak points. Some prediction models are strong for multiprediction and multiclassification purposes, such as ANN, ML, and regression models. In contrast, other models are better suited to binary classification and detection, such as the SVM, the Al Omari and Darter model, and the Markov model. Hence, selecting an appropriate prediction model is the first step toward a high-quality performance prediction system [69].

Conflicts of Interest

The authors declare that they have no conflicts of interest.