The Application of Data Mining Technology to Build a Forecasting Model for Classification of Road Traffic Accidents

Shiau, Yau-Ren; Tsai, Ching-Hsing; Hung, Yung-Hsiang; Kuo, Yu-Ting

doi:https://doi.org/10.1155/2015/170635

Mathematical Problems in Engineering


Research Article | Open Access

Volume 2015 | Article ID 170635 | https://doi.org/10.1155/2015/170635


Yau-Ren Shiau,¹ Ching-Hsing Tsai,¹ Yung-Hsiang Hung,² and Yu-Ting Kuo²

Academic Editor: Aimé Lay-Ekuakille

Received 24 Mar 2015

Revised 27 Jun 2015

Accepted 28 Jun 2015

Published 12 Jul 2015

Abstract

With the ever-increasing number of vehicles on the road, traffic accidents have also increased, resulting in the loss of lives and property, as well as immeasurable social costs. The environment, time, and region influence the occurrence of traffic accidents. Losses of life and property are expected to be reduced by improving traffic engineering, education, law enforcement, and advocacy. This study observed 2,471 traffic accidents that occurred in central Taiwan from January to December 2011 and used Recursive Feature Elimination (RFE), a Feature Selection method, to screen the important factors affecting traffic accidents. It then established models to analyze traffic accidents with various methods, such as Fuzzy Robust Principal Component Analysis (FRPCA), Backpropagation Neural Network (BPNN), and Logistic Regression (LR). The proposed model aims to probe into the environments of traffic accidents, as well as the relationships between the variables of road designs, rule-violation items, and accident types. The results showed that the accuracy rates of the classifiers combined with FRPCA, namely FRPCA-BPNN (85.89%) and FRPCA-LR (85.14%), are higher than those of BPNN (84.37%) and LR (85.06%) by 1.52 and 0.08 percentage points, respectively. Thus, the classifiers combined with FRPCA perform better in classification prediction than BPNN and LR alone.

1. Introduction

As the demand for vehicles rises, the number of vehicles on the road increases greatly and traffic jams worsen, especially during rush hours; thus, traffic accidents are more likely to occur. Faced with more severe accidents, the traffic problem has become a topic of concern in Taiwan. The statistics of the Ministry of Health and Welfare (2013) indicated that accidental injury is the sixth major cause of death in Taiwan, with 6,873 deaths from accidental injuries. Most traffic accidents are caused by improper driving behaviors, and one of the major reasons is that drivers failed to pay attention to the road ahead (Ministry of Transportation, 2008).

According to the statistical data of the National Police Agency (NPA, 2012), the number of fatal road traffic accidents in Taichung City was second only to that in Kaohsiung City. The number of traffic accidents causing death fell from 208 (death toll 210) in the previous reporting year to 198 (death toll 203) in 2012, with a reported change in the number of accidents of −3.41%. Due to the increase in urban population, at a growth rate of 0.76% in 2012, the occurrence rate of road traffic accidents increased accordingly. In 2011, accidental injury ranked sixth among the ten major causes of death in Taiwan, and the death toll from motor vehicle accidents was about 30, accounting for 17.3% of the death rate per 100,000 persons (MOHW, 2012). This study uses the traffic accident data from the NPA of the region from January to December 2012 as the data source. The data content includes 17 items, such as weather, light conditions, road category, speed limit, road type, accident site, road conditions, and roadblocks. There are 2,471 original observations.

According to previous transportation research [1–5], the causes of traffic accidents are mostly human factors, such as speeding, violation of signals, and drunk driving, as well as the interaction between road environments and traffic engineering facilities. This study identifies the key factors that affect traffic accidents using Feature Selection and establishes models to analyze traffic accidents and their types with various methods, such as Fuzzy Robust Principal Component Analysis (FRPCA), Backpropagation Neural Network (BPNN), and Logistic Regression (LR). The environments of traffic accidents and the variables of road designs identified by the model could serve as a reference for the police force and regulatory authorities to design plans and improve traffic safety, thus decreasing the ratio of traffic accidents, damage to property, and loss of lives.

With the advancement of information technology, data mining has become increasingly mature, and useful information can be found in databases without preconditions. Relational models can be built to determine the correlation between characterization factors of traffic accidents and casualties. This study uses Recursive Feature Elimination (RFE) for Feature Selection, together with FRPCA, BPNN, and LR, to determine important factors influencing traffic accidents. The results can provide suggestions for reducing the occurrence of traffic accidents. Finally, the model was statistically evaluated.

The remainder of this paper is organized as follows. Section 2 reviews the literature on traffic accident analysis and presents the methods, including FRPCA; Section 3 describes the case study data; Section 4 reports the empirical results; Section 5 offers conclusions.

2. Material and Methods

Many studies have focused on forecasting and modeling traffic accidents and analyzed the results. The results suggest that the significant factors influencing the occurrence of accidents must be eliminated or controlled in order to prevent traffic accidents and reduce injuries and deaths.

In terms of research methods, most studies use BPNN or LR to forecast or model analysis results [6–9]. Gang and Zhuping [10] suggested that PSO-SVM is better than BPNN in traffic safety forecasting. Chang et al. [6] used the established modeling method and LR to discuss the contributing factors and conditions of driving after drinking. The analysis results showed that law enforcement, drivers’ drinking habits, and regulatory knowledge of drunk driving clearly influenced drivers’ choice to drive after drinking. Kong and Yang [8] applied LR to casualties and driving speed in traffic accident survey data and found that, in collisions between vehicles and pedestrians, the risk of pedestrian death was 26% when the vehicle’s speed was 50 km/h, 50% when the speed was 58 km/h, and 82% when the speed was 70 km/h. However, the analysis result showed that age was not a major risk factor in death. Fu and Zhou [7] pointed out that the traditional BPNN has some defects, such as local minima, too many iterations, and slow training. Therefore, the improved LM-BP neural network was used for forecasting. The forecast results for traffic accidents, death toll, and amount of direct economic loss were significant; thus, the BP network is applicable to traffic accident forecasting.

A number of recent studies have used data mining or statistical methods [11–15]. Karacasua and Er [14] used chi-square significance testing to analyze whether people of the same age and gender have similar traffic accidents, as well as the correlation among education, age, gender, and psychology. The findings showed that males were more prone to traffic accidents than females and that driving while intoxicated and speeding were major causes. Kanchan et al. [13] used statistical software for analysis and found that the injured were mostly male, and the major causes of death were head and abdominal injuries. Traffic accidents are a significant public health hazard; thus, first aid should be strengthened, and traffic regulations and health education should be strictly implemented. Kashani et al. [16] used classification and regression trees (CART) to analyze traffic collision data. The results showed that improper passing and not using seat belts were the most important factors influencing the severity of injuries. De Oña et al. [15] used Latent Class Clustering (LCC) to reduce the heterogeneity of traffic accident data and combined it with Bayesian Networks (BNs) to recognize major factors. The results indicated that weather factors, pavement markings, and road width were significant factors.

Based on the above discussion, and unlike previous studies, this study applies BPNN, LR, and statistical methods to accident patterns and types. FRPCA is used for data preprocessing and is combined with the BPNN and LR models in order to compare the forecasting performance of four classification models (BPNN, LR, FRPCA-BPNN, and FRPCA-LR).

2.1. Research Method

This study uses four constructs, namely, natural factors; environmental factors; road design; and accident types and patterns of road traffic accident cases in Taichung City, to discuss the factors influencing the occurrence of road traffic accidents. The research structure is as shown in Figure 1.

2.2. Data Preprocessing of Feature Selection

This study uses RFE as the Feature Selection method, following the principle proposed by Guyon et al. [17]. Guyon et al. used RFE to select a set of key features, which not only shortens classification computing time but also improves the classification accuracy rate. RFE calculates a weight for each feature and ranks the features by these weights as the basis of classification. RFE is an iterative process that eliminates features backwards, and its feature set screening procedure is as follows:
(1) Use the current data set to train the classifier.
(2) Calculate the weight of each feature.
(3) Delete the feature with the minimum weight.

The iterative process is ended when there is one feature remaining. A list of features ordered according to the weights is obtained as a result of execution, and unimportant or uncorrelated features are eliminated from the list first; thus, they are listed at the end, whereas, the most important features are eliminated last and are listed at the front [18, 19].

RFE thus selects the feature set in three major steps: it imports the data set for classification, calculates the weight of each feature, and deletes the feature with the minimum weight. In each cycle, a feature ordering is obtained, the feature with the smallest squared weight is removed, and the classifier is retrained on the remaining features to obtain a new ordering. RFE repeats this process until a complete feature order list is obtained [20]. It is noteworthy that the single top-ranked feature does not always give the classifier the best classification performance; rather, a combination of multiple features enables the optimum classification performance. Therefore, the RFE algorithm can select the most complementary feature combination [4].
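The elimination loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this study: the stand-in classifier (a logistic model fitted by stochastic gradient ascent), the function names `train_linear` and `rfe_ranking`, and the toy data are all assumptions made for the example, whereas Guyon et al. [17] ranked features by the weights of a support vector machine.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_linear(X, y, features, epochs=200, lr=0.1):
    """Stand-in classifier (assumption): logistic model fitted by
    stochastic gradient ascent. Returns {feature index: weight}."""
    w = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            p = sigmoid(bias + sum(w[f] * row[f] for f in features))
            err = target - p
            for f in features:
                w[f] += lr * err * row[f]
            bias += lr * err
    return w

def rfe_ranking(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        w = train_linear(X, y, remaining)                 # (1) train on current features
        worst = min(remaining, key=lambda f: w[f] ** 2)   # (2) smallest squared weight
        remaining.remove(worst)                           # (3) delete that feature
        eliminated.append(worst)
    eliminated.extend(remaining)   # the last surviving feature is the most important
    return eliminated[::-1]

# Hypothetical toy data: only feature 0 separates the two classes.
X_toy = [
    [1.0, 0.2, 0.1], [0.8, -0.1, -0.1], [1.2, 0.1, 0.1], [0.9, -0.2, -0.1],
    [-1.0, 0.1, 0.1], [-0.8, -0.2, -0.1], [-1.1, 0.2, 0.1], [-0.9, -0.1, -0.1],
]
y_toy = [1, 1, 1, 1, 0, 0, 0, 0]
ranking = rfe_ranking(X_toy, y_toy)
```

In this toy case the informative feature 0 is eliminated last and therefore appears first in `ranking`, which mirrors the ordering convention described above.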

2.3. Backpropagation Neural Network (BPNN)

BPNN is the most frequently used supervised learning method among the neural networks and is highly effective in classification problems [21]. The parameters are divided into structural parameters and learning parameters. The structural parameters include the number of hidden layers, while the learning parameters include the learning rate, initial weight range, and momentum term. Generally, trial and error is used to determine the optimal values of the structural and learning parameters. The most used nonlinear transfer function in the hidden layer of BPNN is the log-sigmoid transfer function, which maps any neuron input between negative infinity and positive infinity to an output between 0 and 1.

An alternative for the hidden layer is the tangent sigmoid transfer function “tansig,” whose output lies between −1 and 1. The linear transfer function “purelin” is used in the output layer. If a sigmoid transfer function is used in the output layer, the network output is restricted to a small bounded range. If the linear transfer function is used in the output layer, the network output can take an arbitrary value.
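As a rough illustration of these transfer functions, the sketch below defines log-sigmoid, tangent sigmoid, and linear functions and uses them in the forward pass of a network with one hidden layer. The names `logsig`, `tansig`, and `purelin` follow the text; the `forward` helper and its argument layout are assumptions made for this example and do not reproduce the network trained in this study.

```python
import math

def logsig(x):
    """Log-sigmoid transfer function: output lies in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)

def tansig(x):
    """Tangent sigmoid transfer function: output lies in (-1, 1)."""
    return math.tanh(x)

def purelin(x):
    """Linear transfer function: output is unbounded."""
    return x

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass (hypothetical helper): tansig hidden layer, purelin output.

    w_hidden holds one weight row per hidden neuron; w_out holds one
    weight per hidden neuron for a single output neuron.
    """
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return purelin(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
```

Because the output layer is linear, the result of `forward` is not confined to a bounded interval, whereas replacing `purelin` with `logsig` would restrict it to values between 0 and 1.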

2.4. LR

LR can be used to analyze the effect of one or several predictor variables on an outcome that is binary (e.g., existence or nonexistence of an event) [22, 23]. LR is derived from the cumulative probability function of the logistic model and resembles a linear regression model. The difference is that LR can handle a dependent variable on a nominal scale, that is, a discrete dependent variable, especially in binary classification. The purpose of LR is to establish the simplest model that fits the data well. Furthermore, it can be used in practice to forecast the relationship between the dependent variable and a set of predictor variables, where each explanatory variable can be categorical or continuous.
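For reference, the standard binary logistic model can be written as follows. The symbols $\beta_0$, $\beta_k$, $x_k$, and $p$ are notation introduced here for illustration and are not taken from the original text.

```latex
% Standard binary logistic regression model (notation assumed for illustration):
% Y is the binary outcome, x_1, ..., x_p are the predictors,
% \beta_0 is the intercept, and \beta_1, ..., \beta_p are the coefficients.
P(Y = 1 \mid x) = \frac{1}{1 + \exp\left(-\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_k\right)\right)},
\qquad
\ln\frac{P(Y = 1 \mid x)}{1 - P(Y = 1 \mid x)} = \beta_0 + \sum_{k=1}^{p} \beta_k x_k .
```

The second form shows that the log-odds are linear in the predictors, which is why LR resembles linear regression while still producing probabilities between 0 and 1.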

2.5. FRPCA

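The nonlinear FRPCA algorithm is derived from the linear fuzzy robust principal component analysis algorithm introduced by Yang and Wang [24] and the nonlinear criteria in blind source separation of Karhunen et al. [25]. The robust principal component analysis proposed by Yang and Wang builds on the principal component analysis learning rules and energy function proposed by Xu and Yuille [26], for which a modified objective function is proposed. These methods are briefly introduced as follows. Xu and Yuille [26] proposed the following constrained optimization function:

$$E(U, w) = \sum_{i=1}^{n} u_i\, e(x_i) + \eta \sum_{i=1}^{n} (1 - u_i). \tag{1}$$

The objective is to minimize $E(U, w)$ with respect to $U$ and $w$, where $X = \{x_1, \ldots, x_n\}$ is the data set, $U = \{u_i \mid i = 1, \ldots, n\}$ is the membership set, and $\eta$ is the threshold. Because $u_i$ is a binary variable and $w$ is a continuous variable, the problem is difficult to solve by gradient descent; thus, they transformed the minimization problem into the maximization of the Gibbs distribution

$$P(U, w) = \frac{\exp(-\gamma E(U, w))}{Z},$$

where $Z$ is the partition function ensuring $\sum_{U} \int_{w} P(U, w) = 1$. The error $e(x_i)$ can be one of the following functions:

$$e_1(x_i) = \left\| x_i - w^{T} x_i\, w \right\|^2, \qquad e_2(x_i) = \|x_i\|^2 - \frac{\left\| w^{T} x_i \right\|^2}{\|w\|^2} = x_i^{T} x_i - \frac{w^{T} x_i x_i^{T} w}{w^{T} w}.$$

The gradient descent rules for minimizing $E_1 = \sum_{i=1}^{n} e_1(x_i)$ and $E_2 = \sum_{i=1}^{n} e_2(x_i)$ are

$$w^{\text{new}} = w^{\text{old}} + \alpha_t \left[ y (x_i - u) + (y - v)\, x_i \right], \qquad w^{\text{new}} = w^{\text{old}} + \alpha_t \left( x_i y - \frac{w}{w^{T} w}\, y^2 \right),$$

where $\alpha_t$ is the learning rate, $y = w^{T} x_i$, $u = y w$, and $v = w^{T} u$.

Therefore, Yang and Wang [24] proposed a new objective function:

$$E = \sum_{i=1}^{n} u_i^{m_1} e(x_i) + \eta \sum_{i=1}^{n} (1 - u_i)^{m_1}. \tag{2}$$

The constraints are $u_i \in [0, 1]$ and $m_1 \in [1, \infty)$, where $u_i$ is the membership value of $x_i$ belonging to the data cluster, $(1 - u_i)$ is the membership value of $x_i$ belonging to the disturbance cluster, and $m_1$ is the fuzzy variable. In this case, $e(x_i)$ is the error between the measured $x_i$ and the cluster center, which is similar to the $C$-means algorithm [27].

As $u_i$ is a continuous variable, it avoids the difficulty of optimizing a mix of discrete and continuous variables; thus, the gradient descent method can be used. First, setting the gradient of (2) with respect to $u_i$ to 0 gives

$$u_i = \frac{1}{1 + \left( e(x_i)/\eta \right)^{1/(m_1 - 1)}},$$

so $u_i$ is substituted in (2), and the following equation is obtained:

$$E = \sum_{i=1}^{n} \left( \frac{1}{1 + \left( e(x_i)/\eta \right)^{1/(m_1 - 1)}} \right)^{m_1 - 1} e(x_i).$$

On the other hand, the gradient with respect to $w$ is

$$\frac{\partial E}{\partial w} = \beta(x_i)\, \frac{\partial e(x_i)}{\partial w}, \qquad \beta(x_i) = \left( \frac{1}{1 + \left( e(x_i)/\eta \right)^{1/(m_1 - 1)}} \right)^{m_1},$$

where $m_1$ is the fuzzy variable. If $m_1 = 1$, the fuzzy membership reduces to a hard membership and can be determined by the following rule:

$$u_i = \begin{cases} 1, & e(x_i) < \eta, \\ 0, & \text{otherwise}. \end{cases}$$

In this case, $\eta$ is the hard threshold. There is no general rule for setting $m_1$, but $m_1 = 2$ in most studies. Yang and Wang [24] deduced the following optimization algorithm.
(1) The iteration count is set as $t = 1$, the iteration bound is $T$, the learning coefficient is $\alpha_0 \in (0, 1]$, the soft threshold $\eta$ is a small positive value, and the weight $w$ is randomly initialized.
(2) While $t$ is less than $T$, execute step (3) to step (9).
(3) Calculate $\alpha_t = \alpha_0 (1 - t/T)$; set $i = 1$ and $\sigma = 0$.
(4) While the observation index $i$ is less than $n$, execute step (5) to step (8).
(5) Calculate $y = w^{T} x_i$, $u = y w$, and $v = w^{T} u$.
(6) The new weight is $w^{\text{new}} = w^{\text{old}} + \alpha_t\, \beta(x_i) \left[ y (x_i - u) + (y - v)\, x_i \right]$.
(7) The new temporary count is $\sigma = \sigma + e_1(x_i)$.
(8) Add 1 to $i$.
(9) Calculate $\eta = \sigma / n$ and add 1 to $t$.
It is almost certain that the new weight $w$ approaches the principal component vector [19, 27, 28].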
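The iterative procedure above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses the error function $e_1$ with fuzzy variable $m_1 = 2$ by default, takes the initial weight as an argument instead of drawing it at random, guards the threshold against becoming zero, and runs on a small hypothetical data set. The names `frpca1`, `dot`, and `data_toy` are introduced here for the example.

```python
import math

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def frpca1(data, w_init, T=100, alpha0=0.1, eta=0.01, m1=2.0):
    """Estimate one robust principal direction (sketch; requires m1 > 1)."""
    n = len(data)
    w = list(w_init)
    for t in range(1, T):                      # step (2): while t < T
        alpha = alpha0 * (1.0 - t / T)         # step (3): decaying learning rate
        sigma = 0.0
        for x in data:                         # steps (4)-(8): loop over observations
            y = dot(w, x)                      # step (5)
            u = [y * wk for wk in w]
            v = dot(w, u)
            e1 = sum((xk - uk) ** 2 for xk, uk in zip(x, u))
            # Fuzzy weight beta(x_i): small for points with large error.
            beta = (1.0 / (1.0 + (e1 / eta) ** (1.0 / (m1 - 1.0)))) ** m1
            w = [wk + alpha * beta * (y * (xk - uk) + (y - v) * xk)
                 for wk, xk, uk in zip(w, x, u)]   # step (6)
            sigma += e1                        # step (7)
        eta = max(sigma / n, 1e-12)            # step (9); the floor is an added guard
    return w

# Hypothetical 2-D data lying close to the direction (1, 1).
data_toy = []
for i in range(11):
    s = -1.0 + 0.2 * i
    eps = 0.05 if i % 2 == 0 else -0.05
    data_toy.append([s + eps, s - eps])

w_est = frpca1(data_toy, w_init=[1.0, 0.0])
norm = math.sqrt(dot(w_est, w_est))
# Absolute cosine between the estimate and the direction (1, 1) / sqrt(2).
cosine = abs(w_est[0] + w_est[1]) / (math.sqrt(2.0) * norm)
```

On this toy data the estimated weight should align closely with the dominant direction, so `cosine` is expected to be near 1. The principal component scores used as classifier inputs would then be the projections `dot(w_est, x)` of each observation.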

3. Case Study

This section is divided into three parts: collection of traffic accident data, preprocessing of the research data, and substituting the data after Feature Selection in FRPCA, BPNN, and LR. Four groups of information are obtained, including BPNN, LR, FRPCA-BPNN, and FRPCA-LR.

3.1. Data Collection

This study uses 17 input variables, including weather, light conditions, road pattern, accident site, sight distance, and separation facility; there are 2,471 observations. The output variable is the accident type and pattern, with three categories: vehicle-vehicle, person-automobile/motorcycle, and automobile/motorcycle. There are 2,096 observations in the vehicle-vehicle category, 146 in the person-automobile/motorcycle category, and 229 in the automobile/motorcycle category. The coding of these categories is listed in Table 1.

3.2. Feature Selection Result

This study uses RFE for data preprocessing. The 17 variables of the database are coded before Feature Selection, as shown in Table 2. The 17 variables are sequenced according to importance, as shown in Table 3.

4. Empirical Research Result

This study uses the first 7 variables in order of importance, as obtained by the Feature Selection of RFE, as input variables, while the person-automobile/motorcycle, vehicle-vehicle, and automobile/motorcycle categories are the output variables, as shown in Table 4. The 7 input variables and 3 output variables are substituted in FRPCA, BPNN, and LR, respectively, in order to obtain the BPNN, LR, FRPCA-BPNN, and FRPCA-LR classifier models. The experimental procedure is as follows. Step 1: the 2,471 observations are used as test data, the 17 variables are ranked by importance using the RFE data preprocessing method, and the first 7 variables are arranged as the test data set. Step 2: classifier modeling and performance evaluation; the BPNN, LR, FRPCA-BPNN, and FRPCA-LR classifiers are each tested in order to evaluate their classification performance.

4.1. Analysis of BPNN Model

The 7 input variables are substituted in FRPCA in order to obtain the scores and loadings of the principal components. The scores of the principal components can be used to classify various observation points and to integrate the scores of the principal components of each observation point in order to calculate an average weighted integral indicator. The coefficient of the correlation between new variables and old variables is called loadings, which represent the influence or importance of the original variable to the new variables, where larger loadings represent higher influence.

The loadings can be obtained from

$$l_{ij} = \frac{w_{ij} \sqrt{\lambda_j}}{s_i},$$

where $l_{ij}$ is the loading of variable $i$ on principal component $j$, $w_{ij}$ is the weight of variable $i$ on principal component $j$, $\lambda_j$ is the eigenvalue of principal component $j$ (i.e., its variance), and $s_i$ is the standard deviation of variable $i$.
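The loading computation can be illustrated with a short Python sketch, assuming the usual definition in which the loading of a variable on a principal component equals the component weight times the square root of the component's eigenvalue, divided by the variable's standard deviation. The function name `loadings` and the two-variable example (standardized variables with correlation 0.5) are assumptions made for illustration, not data from this study.

```python
import math

def loadings(weights, eigenvalues, stds):
    """Loading of variable i on component j: weights[i][j] * sqrt(eigenvalues[j]) / stds[i]."""
    return [
        [weights[i][j] * math.sqrt(eigenvalues[j]) / stds[i]
         for j in range(len(eigenvalues))]
        for i in range(len(stds))
    ]

# Hypothetical example: two standardized variables with correlation 0.5.
# The correlation matrix [[1, 0.5], [0.5, 1]] has eigenvalues 1.5 and 0.5
# with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
a = 1.0 / math.sqrt(2.0)
example = loadings([[a, a], [a, -a]], [1.5, 0.5], [1.0, 1.0])
```

In this example both variables load equally (about 0.87) on the first component and with opposite signs (0.5 and −0.5) on the second, and the squared loadings of each variable sum to 1.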

Experimental combination 1: the experimental parameters of the BPNN forecasting model are set as follows: epochs = 1000 and learning rate lr = 0.1, 0.3, and 0.5, respectively; each experiment is repeated 5 times, with the results of the three experiments as shown in Table 5. Experimental combination 2: the FRPCA-BPNN forecasting model differs from experimental combination 1 in that the data are first converted into principal component scores by executing FRPCA, and then the BPNN forecasting network is built. The experimental results compare the empirical performance of the BPNN and FRPCA-BPNN forecasting models.

4.2. LR Analysis

Experimental combination 3: the LR classifier is constructed, and the data are imported into LR for classification forecasting according to the aforesaid test data set. Experimental combination 4: the FRPCA-LR classifier is constructed as in experimental combination 2; the data set is converted into principal component scores by FRPCA, and then the LR classifier is constructed. Each experiment is conducted five times; the experimental results are as shown in Table 6. The experimental results show that the average accuracy rate of the FRPCA-LR model (85.14%) is better than that of the LR classification model (85.06%); the corresponding standard deviations are reported in Table 6.

The LR model investigates the impacts on the pattern of traffic accidents according to the types and patterns of the traffic accidents. The optimal model is shown in Table 7. This section discusses the correlation between the vehicles and the environment, based on 2,325 observations. After the deletion of the 146 observations involving persons and vehicles, the dependent variable is divided into the two categories of “vehicle to vehicle” and “vehicle in itself” according to the types and patterns of traffic accidents. The odds ratio is adopted to represent the influence of Event A on the occurrence of Event B. In Table 7, the odds ratio of crossroads among the road types is 3.01, meaning that the risk of traffic accidents on “crossroads” is higher than the 1.38 of “one-way roads”. In other words, the ratio of traffic accidents at intersections is 3.24 times that of one-way roads. Likewise, the ratio of traffic accidents at “intersections” under the category of “traffic locations” is also the highest, at 7.24 times that of general roads. Table 7 shows the importance of traffic accident environments and road design to the ratio of traffic accidents.
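As a brief illustration of how an odds ratio is obtained, the sketch below shows two standard routes: exponentiating an LR coefficient and computing the ratio directly from a 2×2 table of counts. The function names and the counts are hypothetical and are not taken from Table 7.

```python
import math

def odds_ratio_from_coef(beta):
    """Odds ratio implied by a logistic regression coefficient: exp(beta)."""
    return math.exp(beta)

def odds_ratio_2x2(a, b, c, d):
    """Odds ratio from a 2x2 table of counts.

    a, b: events and non-events in the exposed group.
    c, d: events and non-events in the reference group.
    """
    return (a * d) / (b * c)

# Hypothetical counts: odds of 30/10 in the exposed group versus 20/20 in the reference group.
example_or = odds_ratio_2x2(30, 10, 20, 20)

# A coefficient of ln(3.01) corresponds to an odds ratio of 3.01.
example_coef_or = odds_ratio_from_coef(math.log(3.01))
```

An odds ratio above 1 indicates that the event is more likely in the exposed category than in the reference category, which is how the values quoted above are read.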

5. Conclusions

Traffic safety depends on road design, road configuration, vehicle performance, traffic regulations, and the effectiveness of implementation. The main means of transport in middle and low income countries include walking, bicycle, motorcycle, and bus, while that of high income countries is automobiles. Therefore, the traffic safety control measures of high income countries are not completely applicable to middle and low income countries and thus should be imported and improved to fit local transportation and road usage conditions [29].

The report of the World Health Organization (WHO) indicated that about 1.2 million people die from traffic accidents in the world annually; about 3,400 people die from traffic accidents per day; approximately 1,000 people are injured or disabled; children, pedestrians, cyclists, and the elderly are the most vulnerable road users; and 85% of fatalities and 90% of the disabled live in middle and low income countries. The scientific analysis of accident data, as well as the implementation of relevant safety measures, can prevent the occurrence of traffic accidents, thus reducing the severity of injuries.

At present, with the rapid development of cities, efficient forecasting is necessary so that decision makers can take preventive measures in advance and reduce the death rate. This study uses RFE, FRPCA, BPNN, and LR to analyze the classification accuracy rate of the traffic accident data of the region. According to the experimental results, the classification accuracy rates of BPNN, LR, FRPCA-BPNN, and FRPCA-LR are all higher than 80%; thus, forecast performance is significant. Further analysis shows that the classifiers combined with FRPCA, namely FRPCA-BPNN and FRPCA-LR, perform better than BPNN and LR alone. According to Tables 5 and 6, the accuracy rates of FRPCA-BPNN (85.89%) and FRPCA-LR (85.14%) are higher than those of BPNN (84.37%) and LR (85.06%) by 1.52 and 0.08 percentage points, respectively, meaning that FRPCA-BPNN and FRPCA-LR have better classification forecast ability.

In traffic accident analysis or verification results, the human factor is mostly regarded as the first cause of traffic accidents. However, the road environment is also correlated with accidents, and improper intersection design or planning is likely to cause traffic accidents. In comparison to other traffic accident sites, a forked road intersection is the most probable accident site. This study used RFE to select 7 input variables from 17 input variables. Based on the 7 input variables, environmental factors and road design are found to be causes of road traffic accidents in the region. According to the statistical data of the Taichung Police Station, the 4 main causes among the 67 causes of accidents are as follows: not allowing other vehicles to pass as per regulations, not being aware of the situation ahead, violating specific sign (line) bans, and not maintaining a safe driving distance, accounting for 23.72%, 13.54%, 7.51%, and 8.22%, respectively, of the total number of traffic accidents, for a combined proportion as high as 52.99%. The road authorities may refer to the 7 variables of traffic accidents and road design proposed in this study in future road designs and plans. The 4 main causes of accidents on road sections involving these seven variables should be the priorities of future enforcement actions by the police force. If improvements and preventive measures are made, road safety can be substantially increased, thereby reducing traffic accidents and fatalities. The findings can serve as a reference for the police force and management authorities to improve roads, as well as the assessment and management models for the elimination of traffic offences.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] D. D. Clarke, P. Ward, C. Bartle, and W. Truman, “Killer crashes: fatal road traffic accidents in the UK,” Accident Analysis & Prevention, vol. 42, no. 2, pp. 764–770, 2010.
[2] K.-V. Hung and L.-T. Huyen, “Education influence in traffic safety: a case study in Vietnam,” IATSS Research, vol. 34, no. 2, pp. 87–93, 2011.
[3] H. Gjerde, A. S. Christophersen, P. T. Normann, and J. Mørland, “Toxicological investigations of drivers killed in road traffic accidents in Norway during 2006–2008,” Forensic Science International, vol. 212, no. 1–3, pp. 102–109, 2011.
[4] Y. Komada, S. Asaoka, T. Abe, and Y. Inoue, “Short sleep duration, sleep disorders, and traffic accidents,” IATSS Research, vol. 37, no. 1, pp. 1–7, 2013.
[5] S. A. Shappell and D. A. Wiegmann, “Human factors investigation and analysis of accidents and incidents,” in Encyclopedia of Forensic Sciences, pp. 440–449, Academic Press, 2nd edition, 2013.
[6] L.-Y. Chang, D.-J. Lin, C.-H. Huang, and K.-K. Chang, “Analysis of contributory factors for driving under the influence of alcohol: a stated choice approach,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 18, pp. 11–20, 2013.
[7] H. Fu and Y. Zhou, “The traffic accident prediction based on neural network,” in Proceedings of the 2nd International Conference on Digital Manufacturing and Automation (ICDMA '11), pp. 1349–1350, IEEE, Zhangjiajie, Hunan, August 2011.
[8] C.-Y. Kong and J.-K. Yang, “Logistic regression analysis of pedestrian casualty risk in passenger vehicle collisions in China,” Accident Analysis and Prevention, vol. 42, no. 4, pp. 987–993, 2010.
[9] Y.-K. Ou, Y.-C. Liu, and F.-Y. Shih, “Risk prediction model for drivers' in-vehicle activities: application of task analysis and back-propagation neural network,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 18, pp. 83–93, 2013.
[10] R. Gang and Z. Zhuping, “Traffic safety forecasting method by particle swarm optimization and support vector machine,” Expert Systems with Applications, vol. 38, no. 8, pp. 10420–10424, 2011.
[11] S. S. Durduran, “A decision making system to automatic recognize of traffic accidents on the basis of a GIS platform,” Expert Systems with Applications, vol. 37, no. 12, pp. 7729–7736, 2010.
[12] K. El-Basyouny and T. Sayed, “Safety performance functions using traffic conflicts,” Safety Science, vol. 51, no. 1, pp. 160–164, 2013.
[13] T. Kanchan, V. Kulkarni, S. M. Bakkannavar, N. Kumar, and B. Unnikrishnan, “Analysis of fatal road traffic accidents in a coastal township of South India,” Journal of Forensic and Legal Medicine, vol. 19, no. 8, pp. 448–451, 2012.
[14] M. Karacasua and A. Er, “An analysis on distribution of traffic faults in accidents, based on driver's age and gender: Eskisehir case,” Procedia–Social and Behavioral Sciences, vol. 20, pp. 776–785, 2011.
[15] J. De Oña, G. López, R. Mujalli, and F. J. Calvo, “Analysis of traffic accidents on rural highways using Latent Class Clustering and Bayesian Networks,” Accident Analysis & Prevention, vol. 51, pp. 1–10, 2013.
[16] A. T. Kashani, S. M. Afshin, and A. Ranjbari, “Analysis of factors associated with traffic injury severity on rural roads in Iran,” Journal of Injury and Violence Research, vol. 4, no. 1, pp. 36–41, 2012.
[17] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene selection for cancer classification using support vector machines,” Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.
[18] C. Chu, A.-L. Hsu, K.-H. Chou, P. Bandettini, C. Lin, and Alzheimer's Disease Neuroimaging Initiative, “Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images,” NeuroImage, vol. 60, no. 1, pp. 59–70, 2012.
[19] X.-H. Lin, F.-F. Yang, L. Zhou et al., “A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information,” Journal of Chromatography B, vol. 910, pp. 149–155, 2012.
[20] D. Wei, S. Li, and M.-K. Tan, “Graph embedding based feature selection,” Neurocomputing, vol. 93, pp. 115–125, 2012.
[21] M. Ahmadzadeh, A. H. Fard, B. Saranjam, and H. R. Salimi, “Prediction of residual stresses in gas arc welding by back propagation neural network,” NDT & E International, vol. 52, pp. 136–143, 2012.
[22] P. Reed and Y.-Q. Wu, “Logistic regression for risk factor modelling in stuttering research,” Journal of Fluency Disorders, vol. 38, no. 2, pp. 88–101, 2013.
[23] P. Reed, “The effect of traffic tickets on road traffic crashes,” Accident Analysis & Prevention, vol. 64, pp. 86–91, 2014.
[24] T.-N. Yang and S.-D. Wang, “Robust algorithms for principal component analysis,” Pattern Recognition Letters, vol. 20, no. 9, pp. 927–933, 1999.
[25] J. Karhunen, P. Pajunen, and E. Oja, “The nonlinear PCA criterion in blind source separation: relations with other approaches,” Neurocomputing, vol. 22, no. 1–3, pp. 5–20, 1998.
[26] L. Xu and A. L. Yuille, “Robust principal component analysis by self-organizing rules based on statistical physics approach,” IEEE Transactions on Neural Networks, vol. 6, no. 1, pp. 131–143, 1995.
[27] F. Gharibnezhad, L. E. Mujica, J. Rodellar, and C.-P. Fritzen, “Damage detection using robust fuzzy principal component analysis,” UPCommons Conference Report, 2013.
[28] P. Luukka, “Nonlinear fuzzy robust PCA algorithms and similarity classifier in bankruptcy analysis,” Expert Systems with Applications, vol. 37, no. 12, pp. 8296–8302, 2010.
[29] G.-G. Zhang, K. K. W. Yau, and G. Chen, “Risk factors associated with traffic violations and accident severity in China,” Accident Analysis & Prevention, vol. 59, pp. 18–25, 2013.

Copyright

Copyright © 2015 Yau-Ren Shiau et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
