Abstract

Estimating the cycle time of each job in a wafer fabrication factory is a critical task for every wafer manufacturer. In recent years, a number of hybrid approaches based on job classification (either preclassification or postclassification) have been proposed for cycle time estimation. However, a problem with these methods is that the input variables are not independent. To solve this problem, principal component analysis (PCA) is considered useful. In this study, a classifying fuzzy-neural approach, combining PCA, fuzzy c-means (FCM), and the back propagation network (BPN), is proposed to estimate the cycle time of a job in a wafer fabrication factory. Since job classification is an important part of the proposed methodology, a new index is proposed to assess the validity of the classification of jobs. The empirical relationship between the index value and the estimation performance is also established. Finally, an iterative process is employed to deal with outliers and to optimize the overall estimation performance. A real case is used to evaluate the effectiveness of the proposed methodology. According to the experimental results, the estimation accuracy of the proposed methodology was significantly better than those of the existing approaches.

1. Introduction

Competition in the semiconductor industry has been very intense, and obtaining and maintaining a competitive edge is an important task for every manufacturer in this industry. Quick response and on-time delivery are pressing needs for any modern enterprise. To this end, accurately estimating and shortening the cycle time (also called the flow time or manufacturing lead time) of each job in the factory is a prerequisite [19]. In a wafer fabrication factory, a job is usually composed of about 25 wafers and requires hundreds of processing steps. In addition, a job may visit the same workstation more than once because the same operation may be needed multiple times. A wafer fabrication factory is therefore classified as a complicated reentrant production system.

Estimating the cycle time of each job in a wafer fabrication factory is very important because it signals to the manager whether orders are progressing as expected. For example, if the estimated cycle time of a job is longer than expected, then the order may not be delivered to the customer before its due date, and some production control actions should immediately be taken to accelerate the progress of the job [10]. That is why this paper studies the estimation of the job cycle time in a wafer fabrication factory.

The existing approaches for the job cycle time estimation in a wafer fabrication factory can be classified into six categories: statistical analysis, production simulation (PS), back propagation network (BPN), case-based reasoning (CBR), fuzzy modeling methods, and hybrid approaches [9]. Among the six approaches, statistical analysis is the easiest, quickest, and most prevalent in practical applications. Most statistical analyses used linear regression equations to estimate the job cycle time (e.g., [11, 12]). Pearn et al. [13] fitted the distribution of the waiting time of a job with a gamma distribution and then used a linear equation to estimate the job cycle time. Recently, Chien et al. [14] used nonlinear regression equations instead and then identified the relationship between the estimation error and some factory conditions and job attributes with a BPN to further improve the estimation accuracy. The major disadvantage of statistical analysis is the lack of estimation accuracy [9]. Conversely, the huge amount of data and the lengthy simulation time required are two disadvantages of PS. Nevertheless, PS is theoretically the most accurate job cycle time estimation approach if the simulation model is completely valid and is continuously updated.

Considering effectiveness (estimation accuracy) and efficiency (execution time) simultaneously, Chang et al. [8], Chang and Hsieh [15], and Sha and Hsu [16] estimated the cycle time of a job in a wafer fabrication factory using a BPN with a single hidden layer. A BPN is an effective tool for modeling complex physical systems described by sets of differential equations for prediction, control, and design purposes. Compared with some statistical analysis approaches, the average estimation accuracy measured with the root mean squared error (RMSE) was considerably improved with the BPNs. For example, an improvement of about 40% in RMSE was achieved in the study of Chang et al. [8]. Chen [17] incorporated the job releasing plan of the wafer fabrication factory into a BPN and constructed a "look-ahead" BPN for the same purpose, which led to an average reduction of 12% in RMSE. On the other hand, much less time and far fewer data are required with a BPN than with PS. Chen et al. [18] and Beeg [19] estimated the cycle time of a job in a ramping-up wafer fabrication factory; Chen et al. used a BPN-based method, while Beeg tried to determine the impact of utilization on the cycle time.

Chiu et al. [20] established an expert system based on CBR for the job cycle time estimation. To effectively consider the uncertainty in the job cycle time, fuzzy logic was used in a number of studies. For example, Chang et al. [8] modified the first step (i.e., partitioning the range of each input variable into several fuzzy intervals) of the fuzzy modeling method proposed by Wang and Mendel [21], called the WM method, with a simple genetic algorithm (GA) and proposed the evolving fuzzy rule (EFR) approach to estimate the cycle time of a job in a wafer fabrication factory. Their EFR approach outperformed CBR and BPN in the estimation accuracy. Chen [9] constructed a fuzzy back propagation network (FBPN) that incorporated expert opinions to modify the inputs of the FBPN. Chen’s FBPN surpassed the crisp BPN especially with respect to efficiency.

In recent years, a number of hybrid approaches have been proposed, most of which classified jobs before estimating their cycle times. For example, Chen [7] combined the self-organization map (SOM) and WM, in which jobs were classified using a SOM before their cycle times were estimated with WM. Chen and Wang [22] constructed a look-ahead k-means- (kM-) FBPN for the same purpose and discussed in detail the effects of using different look-ahead functions. More recently, Chen [17] proposed the look-ahead SOM-FBPN approach for the job cycle time estimation in a semiconductor factory [23]. In addition, a set of fuzzy inference rules was also developed to evaluate the achievability of a cycle time forecast. Subsequently, Chen [24] added a selective allowance to the cycle time estimated using the look-ahead SOM-FBPN approach to determine the intermediate due date. Further, Chen et al. [23] showed that the suitability of combining the SOM and FBPN for the data could be improved by feeding back the estimation error of the FBPN to adjust the classification results of the SOM. Chen et al. [25] proposed a postclassification fuzzy-neural approach in which a job was not preclassified but rather postclassified after estimating the cycle time. Experimental results showed that the postclassification approach was better than the preclassification approaches in certain cases. In order to combine the advantages of preclassifying and postclassifying approaches, Chen [26] proposed a bidirectional classifying approach, in which jobs are both preclassified and postclassified. Except for a few studies in which the historical data of a real semiconductor factory were collected, most studies in this field used simulated data [27].

In short, the following issues have not been addressed before:

(1) Some factors used to estimate the cycle time are dependent on each other, which may cause problems both in classifying jobs and in fitting the relationship between the job cycle time and these factors. However, this issue has rarely been addressed in previous studies in this field.

(2) Job classification has been shown to be conducive to the estimation performance. However, most past studies chose classifiers subjectively and did not evaluate the performance of the classifier, let alone optimize the classifier for the subsequent estimation task.

Principal component analysis (PCA) is a multivariate statistical analysis method. It constructs a series of linear combinations of the original variables to form new variables, so that the new variables are as uncorrelated with each other as possible while preserving the information in the original variables. In this study, a fuzzy-neural approach based on the combination of PCA, FCM, and BPN is proposed to estimate the cycle time of a job in a wafer fabrication factory. The motivation of this study is explained as follows.

(1) While some past studies combined PCA and FCM, references on the combination of PCA, FCM, and BPN are still very limited. Chen [28] applied PCA to modify the inputs to a BPN for the job cycle time estimation. The estimation accuracy of the PCA-BPN was only slightly better than that of the BPN alone, which suggests that a BPN by itself can handle the dependencies among the input variables in the job cycle time estimation problem; PCA appears to matter more for the classification of jobs. This provides a motivation to improve the existing job cycle time estimation methods based on job classification.

(2) FCM, as part of a preclassifying approach, cannot be evaluated alone; its success depends on the performance of the subsequent estimation task. This provides a motivation to assess the validity of the classification of jobs from this point of view.

(3) The S test is a commonly used method to determine the best number of categories in FCM. However, whether this choice directly favors the estimation performance has not been confirmed.

The contribution of this study, compared with previous works in the literature, includes the following.

(1) With factors that are dependent on each other, jobs may be misclassified if FCM is used alone, which may harm the estimation accuracy of the BPN because incorrect examples are used to train it. The proposed fuzzy-neural approach replaces the original factors with new, independent factors and is expected to generate correct classification results. The correctness of the classification results must be judged from the estimation performance; to measure it, two new indexes are defined.

(2) It is anticipated that the new factors identified by PCA have a more explicit relationship with the job cycle time. As a result, the training of the BPN may be accelerated, which also means that a more accurate relationship between the factors and the cycle time can be generated in the same amount of time.

(3) A new index is proposed to assess the validity of the classification of jobs.

(4) The empirical relationship between the index value and the estimation performance is found.

(5) Outliers, that is, jobs that cannot be classified definitely, have not been dealt with properly in the past, even though they often affect the overall estimation performance. For this reason, an iterative process is established in this study to optimize the overall estimation performance.

The differences between the proposed methodology and the previous methods are summarized in Table 1.

The remainder of this paper is organized as follows. Section 2 introduces the proposed PCA-FCM-BPN approach, and an example is employed to illustrate the proposed methodology. A case with real data from a wafer fabrication factory is investigated in Section 3, where the performance of the proposed methodology is compared with those of the existing approaches and some observations are made based on the results. Finally, concluding remarks and directions for future research are given in Section 4.

2. Methodology

Two characteristics of the proposed methodology are input replacement and job classification. These features are not merely mathematical techniques; they also have implications for the operations of a wafer fabrication factory. First, among the information useful for estimating the job cycle time, many factors are in fact mutually dependent. For example, it is well known that the utilization of a factory increases when the work-in-process (WIP) level in the factory rises, and both utilization and the WIP level are important factors considered in some job cycle time estimation approaches. Whether the dependence of the factors will lead to problems in the classification of jobs needs to be checked. Therefore, replacing these factors with new, independent variables is worth a try.

On the other hand, a number of job cycle time estimation approaches in this field classify jobs. A well-known concept is that the cycle time of a job is proportional to the WIP level of the factory, according to Little’s law; however, that only holds when the factory utilization is 100%. Therefore, it is reasonable to divide jobs into two categories: jobs that are released into the factory when the factory utilization is 100% and jobs released when the factory utilization is less than 100%.
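A quick numeric sketch illustrates this point; the factory figures below are hypothetical, not taken from the case study:

```python
# Little's law: mean cycle time = mean WIP / mean throughput rate.
# Hypothetical factory numbers, for illustration only.
wip = 12000.0        # wafers in process
throughput = 50.0    # wafers completed per hour at 100% utilization
cycle_time = wip / throughput
print(cycle_time)    # 240.0 hours

# Below 100% utilization, throughput tracks the release rate rather than
# capacity, so the cycle time is no longer proportional to WIP alone;
# hence the division into two job categories.
```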

The architecture of the proposed methodology is shown in Figure 1.

2.1. Variable Replacement Using PCA

First, PCA is used to replace the inputs to the FCM-BPN. The combination of PCA and FCM has proven to be a more effective classifier than FCM alone [29]. PCA consists of the following steps.

(1) Raw data standardization: to eliminate the differences in dimensions and the impact of large numerical differences among the original variables, the original variables are standardized as

\[ z_{ji} = \frac{x_{ji} - \bar{x}_i}{\sigma_i}, \]

where \(x_{ji}\) is the \(i\)th attribute of job \(j\); \(\bar{x}_i\) and \(\sigma_i\) indicate the mean and standard deviation of variable \(i\), respectively.

(2) Establishment of the correlation matrix \(R\):

\[ R = \frac{Z^{T} Z}{n - 1}, \]

where \(Z\) is the standardized data matrix. The eigenvalues and eigenvectors of \(R\) are calculated and represented as \(\lambda_l\) and \(v_l\), respectively, \(l = 1, \ldots, p\).

(3) Determination of the number of principal components: the variance contribution rate is calculated as

\[ \eta_l = \frac{\lambda_l}{\sum_{q=1}^{p} \lambda_q}, \]

where \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p\), and the accumulated variance contribution rate is

\[ \eta_{\Sigma}(m) = \sum_{l=1}^{m} \eta_l, \]

where \(m \leq p\). Choose the smallest \(m\) such that \(\eta_{\Sigma}(m) \geq 80\%\). A Pareto analysis chart can be used to compare the percent variability explained by each principal component.

(4) Formation of the component score matrix

\[ Y = Z \left[ v_1\ v_2\ \cdots\ v_m \right], \]

which contains the coordinates of the original data in the new coordinate system defined by the principal components and will be used as the new inputs to the FCM-BPN.
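The PCA steps above can be sketched as follows; this is a minimal illustration with a hypothetical job-attribute matrix, not the case data:

```python
import numpy as np

# Hypothetical job-attribute matrix: rows = jobs, columns = factors
# (e.g., queue length, utilization, WIP level).
X = np.array([[1.0, 2.0, 0.5],
              [2.0, 1.5, 0.7],
              [3.0, 3.5, 0.2],
              [4.0, 3.0, 0.9]])

# (1) standardize each variable
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
# (2) correlation matrix R = Z'Z / (n - 1)
R = (Z.T @ Z) / (len(X) - 1)
lam, V = np.linalg.eigh(R)            # eigenvalues/eigenvectors of R
order = np.argsort(lam)[::-1]         # sort descending
lam, V = lam[order], V[:, order]
# (3) smallest m whose accumulated variance contribution reaches 80%
eta = lam / lam.sum()
m = int(np.searchsorted(np.cumsum(eta), 0.80) + 1)
# (4) component scores: coordinates in the new coordinate system
Y = Z @ V[:, :m]
print(m, Y.shape)
```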

To illustrate the application of the proposed methodology, an example is given in Table 2. To get a quick impression of the data, a box plot is made in Figure 2. Note that some of the variables show substantially more variability than the remaining variables.

Subsequently, we standardize the data (see Table 3) and compute the correlation matrix \(R\).

The eigenvalues \(\lambda_l\) and eigenvectors \(v_l\) of \(R\) are then calculated, from which the variance contribution rates \(\eta_l\) and the accumulated variance contribution rates \(\eta_{\Sigma}(m)\) are obtained. A Pareto analysis chart is used to compare the percent variability explained by each principal component (see Figure 3). There is a clear break in the amount of variance accounted for between the first and the second components. However, the first component by itself explains less than 50% of the variance, so more components are needed. To meet the requirement \(\eta_{\Sigma}(m) \geq 80\%\), \(m\) is chosen as 3. The first three principal components explain roughly 80% of the total variability in the standardized data, so keeping them is a reasonable way to reduce the dimensionality in order to visualize the data.

Subsequently, the component scores are computed (see Table 4), which contain the coordinates of the original data in the new coordinate system defined by the principal components and will be used as the new inputs to the FCM-BPN. In Figure 4, the first two columns of the component scores are plotted, showing the data projected onto the first two principal components.

2.2. Classifying Jobs Using FCM

After employing PCA, the examples are classified using FCM. If a crisp clustering method were applied instead, it is very likely that some clusters would have very few examples. In contrast, in FCM an example belongs to multiple clusters to different degrees, which provides a solution to this problem. Similarly, in probability theory, the naive Bayes method provides the probability that an item belongs to each class; FCM, however, can additionally consider subjective factors in classifying the jobs.

FCM classifies jobs by minimizing the following objective function:

\[ J = \sum_{k=1}^{K} \sum_{j=1}^{n} \mu_{jk}^{m} e_{jk}^{2}, \]

where \(K\) is the required number of categories; \(n\) is the number of jobs; \(\mu_{jk}\) indicates the membership that job \(j\) belongs to category \(k\); \(e_{jk}\) measures the distance from job \(j\) to the centroid of category \(k\); and \(m\) is a parameter adjusting the fuzziness, usually set to 2. The procedure of FCM is described as follows.

(1) Normalize the input data.

(2) Produce a preliminary clustering result.

(3) (Iterations) Calculate the centroid of each category as

\[ \bar{x}_k^{(t)} = \frac{\sum_{j=1}^{n} \bigl(\mu_{jk}^{(t)}\bigr)^{m} x_j}{\sum_{j=1}^{n} \bigl(\mu_{jk}^{(t)}\bigr)^{m}}, \]

where \(\bar{x}_k^{(t)}\) is the centroid of category \(k\) and \(\mu_{jk}^{(t)}\) is the membership that job \(j\) belongs to category \(k\) after the \(t\)th iteration.

(4) Remeasure the distance from each job to the centroid of each category and then recalculate the corresponding memberships as

\[ \mu_{jk}^{(t)} = \frac{1}{\sum_{q=1}^{K} \bigl( e_{jk}^{(t)} / e_{jq}^{(t)} \bigr)^{2/(m-1)}}. \]

(5) Stop if the following condition is met; otherwise, return to step (3):

\[ \max_{j,k} \bigl| \mu_{jk}^{(t)} - \mu_{jk}^{(t-1)} \bigr| < \varepsilon, \]

where \(\varepsilon\) is a real number representing the threshold for the convergence of membership.
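The FCM procedure above can be sketched compactly as follows, using hypothetical two-dimensional data rather than the illustrative example:

```python
import numpy as np

# Minimal FCM iteration (fuzziness m = 2), following the steps above.
def fcm(X, K, m=2.0, eps=1e-6, max_iter=300, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), K))
    U /= U.sum(axis=1, keepdims=True)             # preliminary memberships
    for _ in range(max_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]       # centroids
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2)
        d = np.maximum(d, 1e-12)                  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)       # new memberships
        if np.abs(U_new - U).max() < eps:         # convergence check
            return centers, U_new
        U = U_new
    return centers, U

# Hypothetical data: two obvious groups of jobs.
X = np.array([[0.10, 0.20], [0.15, 0.25], [0.90, 0.80], [0.85, 0.90]])
centers, U = fcm(X, K=2)
print(U.round(2))
```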

The performance of FCM is highly affected by the initial values; therefore, the procedure can be repeated multiple times in order to find the optimal solution. Finally, the separate distance test (S test) proposed by Xie and Beni [30] can be applied to determine the optimal number of categories:

\[ \min S = \frac{J}{n \cdot e_{\min}^{2}}, \]

subject to

\[ e_{\min}^{2} = \min_{p \neq q} \bigl\| \bar{x}_p - \bar{x}_q \bigr\|^{2}. \]

The \(K\) value minimizing \(S\) determines the optimal number of categories.
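The S test can be computed directly from the FCM outputs; a small sketch follows, with hand-picked (hypothetical) memberships and centroids:

```python
import numpy as np

# Xie-Beni separation index S = J / (n * minimum squared distance between
# centroids); the number of categories K minimizing S is preferred.
def xie_beni(X, centers, U, m=2.0):
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
    J = ((U ** m) * d2).sum()                    # FCM objective value
    c2 = ((centers[:, None, :] - centers[None]) ** 2).sum(axis=2)
    np.fill_diagonal(c2, np.inf)                 # exclude p == q
    return J / (len(X) * c2.min())

X = np.array([[0.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
U = np.array([[0.8, 0.2], [0.2, 0.8]])
print(xie_beni(X, centers, U))   # approximately 0.04
```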

The Fuzzy Logic Toolbox of MATLAB can be used to implement the FCM approach. A sample code is shown in Algorithm 1.

A = [0.3857 0.7175; 0.5381 0.1614; 0.4281 0.5803];   % example data (remaining rows omitted)
c = 2;
[center, U, obj_fun] = fcm(A, c);
Jm = min(obj_fun);
e2_min = 9999;
for  i = 1:c
   for  j = i+1:c
      e2_sum = 0;
      for  k = 1:size(center, 2)
         e2_sum = e2_sum + (center(i, k) - center(j, k))^2;
      end
      if  e2_sum < e2_min
         e2_min = e2_sum;
      end
   end
end
e2_min
S = min(Jm)/(40*e2_min)   % 40 = the number of jobs n

In the illustrative example, the data have already been standardized and therefore are not normalized again. The results of the S test are summarized in Table 5. In this case, the optimal number of job categories was 5; however, some categories would then contain very few jobs. For this reason, the second-best solution, 4 categories, is used instead. A common practice is to set a threshold of membership to determine whether a job belongs to a category; Table 6 shows the classifying results under a relatively high threshold. As the threshold decreases, each category contains more jobs; Table 7 shows the classifying results under a lower threshold. Such a property can solve the problem of an insufficient number of examples.

We also note that the classification results based on the new variables differ markedly from those based on the original variables. In other words, the results of FCM and PCA-FCM are not the same.

(1) The optimal number of categories in FCM is 6, while that in PCA-FCM is 5.

(2) If jobs are divided into four categories by the two methods, the results are compared in Figure 5. Many jobs are reclassified, which indicates that the misclassification problem has been mitigated after variable replacement.

In Figure 5, there are also some outliers that cannot be classified into any category.

2.3. Estimating the Cycle Time Using BPN

Finally, the jobs/examples of a category are learned with the same BPN. Artificial neural networks have been applied to solve a wide variety of problems usually characterized by sets of differential equations. Although there are more advanced artificial neural networks, such as the compositional pattern-producing network, the cascading neural network, and the dynamic neural network, a well-trained BPN with an optimized structure can still produce very good results. The configuration of the BPN is established as follows.

(1) Inputs: the new factors determined by PCA associated with the example/job. These factors have to be partially normalized so that their values fall within [0.1, 0.9] [18].

(2) Single hidden layer: generally one or two hidden layers are beneficial for the convergence of the BPN.

(3) Number of hidden-layer neurons: for simplicity, twice the number in the input layer. An increase in the number of hidden-layer nodes lessens the output errors for the training examples but increases the errors for novel examples; such a phenomenon is often called "overfitting." Some research has considered the relation among the complexity of a BPN, the performance on the training data, and the number of examples, for example, using Akaike's information criterion (AIC) or the minimum description length (MDL).

(4) Output: the (normalized) cycle time estimate of the example.

The procedure for determining the parameter values is now described. After preclassification, a portion of the adopted examples in each category is fed as "training examples" into the BPN to determine the parameter values for the category. Two phases are involved at the training stage. At first, in the forward phase, the inputs are multiplied by the weights, summed, and transferred to the hidden layer. The activated signals outputted from the hidden layer are

\[ h_l = \frac{1}{1 + e^{-n_l^{h}}}, \quad n_l^{h} = \sum_{i} w_{il}^{h} x_i - \theta_l^{h}, \]

where the \(h_l\)'s are transferred to the output layer with the same procedure. Finally, the output of the BPN is generated as

\[ o = \frac{1}{1 + e^{-n^{o}}}, \quad n^{o} = \sum_{l} w_l^{o} h_l - \theta^{o}. \]

The output \(o_j\) is compared with the normalized actual cycle time \(a_j\), for which the RMSE is calculated as

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{j} (o_j - a_j)^2}{n}}. \]

Subsequently, in the backward phase, several algorithms are applicable for training a BPN, such as the gradient descent algorithms, the conjugate gradient algorithms, and the Levenberg-Marquardt algorithm. In this study, the Levenberg-Marquardt algorithm is applied. It was designed to approach second-order training speed without having to compute the Hessian matrix: it uses an approximation and updates the network parameters in a Newton-like way, as described below.
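The forward phase and the RMSE measure above can be sketched as follows; random weights stand in for a trained network, and all values are hypothetical:

```python
import numpy as np

# Forward pass of a three-layer BPN with logistic activations.
def logistic(n):
    return 1.0 / (1.0 + np.exp(-n))

def forward(x, Wh, th_h, wo, th_o):
    h = logistic(Wh @ x - th_h)        # hidden-layer signals h_l
    return logistic(wo @ h - th_o)     # network output o

rng = np.random.default_rng(1)
Wh = rng.normal(size=(6, 3))   # 3 PCA inputs -> 6 hidden neurons (2x rule)
th_h = np.zeros(6)             # hidden thresholds
wo = rng.normal(size=6)        # output weights
th_o = 0.0                     # output threshold

X = rng.random((4, 3))         # four normalized examples (hypothetical)
a = rng.random(4)              # normalized actual cycle times (hypothetical)
o = np.array([forward(x, Wh, th_h, wo, th_o) for x in X])
rmse = np.sqrt(((o - a) ** 2).mean())
print(rmse)
```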

The network parameters are placed in a vector \(\beta = [w_{il}^{h}, \theta_l^{h}, w_l^{o}, \theta^{o}]\). The network output can be represented with \(f(x_j, \beta)\). The objective function of the BPN is to minimize the RMSE, or equivalently the sum of squared errors (SSE):

\[ \mathrm{SSE}(\beta) = \sum_{j} \bigl( a_j - f(x_j, \beta) \bigr)^2. \]

The Levenberg-Marquardt algorithm is an iterative procedure. In the beginning, the user specifies the initial values of the network parameters \(\beta\); setting \(\beta^{T} = [1, 1, \ldots, 1]\) is a common practice. In each step, the parameter vector \(\beta\) is replaced by a new estimate \(\beta + \delta\), where \(\delta = [\Delta w_{il}^{h}, \Delta\theta_l^{h}, \Delta w_l^{o}, \Delta\theta^{o}]\). The network output becomes \(f(x_j, \beta + \delta)\), which is approximated by its linearization as

\[ f(x_j, \beta + \delta) \approx f(x_j, \beta) + J_j \delta, \]

where \(J_j\) is the gradient vector of \(f\) with respect to \(\beta\), evaluated at \(x_j\). When the network reaches the optimal solution, the gradient of the SSE with respect to \(\delta\) will be zero. Substituting the linearization into the SSE, taking the derivative with respect to \(\delta\), and setting the result to zero gives

\[ \bigl( J^{T} J \bigr) \delta = J^{T} \bigl[ a - f(\beta) \bigr], \]

where \(J\) is the Jacobian matrix containing the first derivatives of the network errors with respect to the weights and biases. This is a set of linear equations that can be solved for \(\delta\); the Levenberg-Marquardt algorithm additionally adds a damping term to \(J^{T}J\) to interpolate between the Gauss-Newton method and gradient descent.

In the illustrative example, 3/4 of the examples in each category are used as training examples, and the remaining 1/4 is left for testing. A three-layer BPN is then used to estimate the cycle time of the jobs in each category according to the new variables, with the following settings:

(1) a single hidden layer;
(2) the number of neurons in the hidden layer: twice the number of inputs;
(3) convergence criterion: the training error converges or 10000 epochs have been run.

For an outlier, the BPNs of all categories are applied to estimate the cycle time. The Neural Network Toolbox of MATLAB is used to implement the BPN approach. The sample code is shown in Algorithm 2. The estimation accuracy can be evaluated with mean absolute error (MAE), mean absolute percentage error (MAPE), and RMSE. The estimation performances are summarized in Table 8.

tn_input = [0.843 0.831; 0.839 0.859; 0.9 0.9; 0.878 0.889; 0.875 0.858; 0.822 0.827];   % 6 inputs x 2 examples (remaining examples omitted)
tn_target = [0.849 0.849];
net = newff([0 1; 0 1; 0 1; 0 1; 0 1; 0 1], [12 1], {'logsig' 'logsig'}, 'trainlm');
net = init(net);
net.trainParam.show = 10;
net.trainParam.lr = 0.1;
net.trainParam.epochs = 1000;
net.trainParam.goal = 1e-4;
[net, tr] = train(net, tn_input, tn_target);
tn_output = sim(net, tn_input);
te_input = [0.825 0.844; 0.824 0.835; 0.9 0.9; 0.878 0.889; 0.883 0.875; 0.807 0.820];   % testing examples (remaining omitted)
te_output = sim(net, te_input);

Obviously, the overall estimation performance is affected by the outliers; if they can be dealt with properly, the overall estimation will be improved. To this end, an iterative feedback control procedure is established in the next subsection (see Figure 6), which can optimize the overall estimation performance. In the literature, there have been a few control mechanisms for various types of fuzzy systems [31-39]. On the other hand, we also compare the performances of the gradient descent algorithm and the Levenberg-Marquardt algorithm, as shown in Table 9.

2.4. Iterative Optimization
2.4.1. The Effectiveness of the S Test

Job classification in the proposed methodology is based on the combination of FCM (or PCA-FCM) and the S test, according to which the best number of categories is chosen. This classification method takes into account only the similarity of the parameters of jobs; whether it has a decisive impact on the subsequent cycle time estimation is not clear. For this reason, the cycle time estimation performances with different numbers of categories are compared to verify the results of the S test. The results are shown in Figure 7, in which the S values are plotted on a logarithmic scale to make the relationship clearer. Clearly, when the S value becomes smaller, the estimation error (in terms of MAPE) is also reduced. Therefore, choosing the clustering result with the smallest S value is helpful to the estimation accuracy.

2.4.2. The Correctness of Job Classification

There are no absolute rules for the classification of jobs in a wafer fabrication factory; it usually depends on the purpose of job classification, which in the proposed methodology is to enhance the estimation accuracy. Therefore, a job is correctly classified if its cycle time is accurately estimated after classification; otherwise, the job is misclassified.

Definition 1 (job misclassification). Assume the cycle time of job \(j\) estimated by the BPN of category \(k\) is indicated with \(\widetilde{CT}_{jk}\), the actual cycle time is \(CT_j\), and the category of job \(j\) determined by the classifier is \(k^{*}(j)\). Then job \(j\) is correctly classified if

\[ \bigl| \widetilde{CT}_{jk^{*}(j)} - CT_j \bigr| \leq \bigl| \widetilde{CT}_{jk} - CT_j \bigr|. \]

A strong requirement of this inequality is that it holds for all \(k \neq k^{*}(j)\), while a weak requirement is that it holds for at least one \(k \neq k^{*}(j)\).

Definition 2 (the correctness of classifying a job). The degree to which job \(j\) is correctly classified by the classifier is

\[ c_j = \frac{\min_k \bigl| \widetilde{CT}_{jk} - CT_j \bigr|}{\bigl| \widetilde{CT}_{jk^{*}(j)} - CT_j \bigr|}, \]

which equals 1 if job \(j\) is correctly classified and is less than 1 otherwise.

Definition 3 (the correctness/correct percentage of the classification results). The correctness/correct percentage of the classification results by the classifier is

\[ c = \frac{\sum_{j=1}^{n} c_j}{n}. \]

In the illustrative example, the correctness of job classification is evaluated, and the results are summarized in Table 10. In this example, the correctness of the classification results is 94%.
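One way to operationalize Definitions 1-3 is sketched below; the cycle times, estimates, and classifier assignments are all hypothetical:

```python
import numpy as np

# Correctness sketch: compare each job's error under its assigned
# category's BPN with the smallest error over all category BPNs.
CT = np.array([320.0, 410.0, 275.0])          # actual cycle times
CTE = np.array([[318.0, 365.0],               # row j: estimates of job j
                [430.0, 412.0],               # by the BPN of each category
                [276.0, 250.0]])
assigned = np.array([0, 0, 0])                # classifier's category choice

err = np.abs(CTE - CT[:, None])
best = err.min(axis=1)                        # smallest achievable error
own = err[np.arange(len(CT)), assigned]       # error of assigned category
c_j = best / own                              # per-job correctness degree
print(c_j, c_j.mean())                        # job 1 is misclassified here
```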

2.4.3. Feeding Back the Estimation Error and Reclassification

Subsequently, the estimation error is fed back to the FCM classifier to adjust the classification results. The difference with Chen and Wang's method [40] is that in the proposed methodology the BPNs of all categories are applied to estimate the cycle time of a job [41]; the estimation errors arising from these BPNs then all become additional inputs to the FCM, and the jobs are reclassified. The new classification results are compared with those before error feedback in Figure 8. After reclassification, some outliers are assigned to the existing categories, and the overall estimation performance is improved (see Table 11); the correctness of job classification rises to 97%. Job reclassification continues until the improvement in the overall estimation performance or in the correctness of job classification becomes negligible.
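The feedback step amounts to augmenting the FCM feature matrix before rerunning the clustering; a sketch with hypothetical placeholder arrays:

```python
import numpy as np

# Error feedback sketch: the absolute estimation errors from all category
# BPNs are appended to the PCA component scores, and FCM is rerun on the
# augmented matrix. All values here are hypothetical placeholders.
scores = np.array([[0.2, 0.1],                # PCA component scores
                   [0.8, 0.9],
                   [0.5, 0.5]])
errors = np.array([[0.02, 0.30],              # |estimate - actual| per BPN
                   [0.25, 0.03],
                   [0.15, 0.14]])

augmented = np.hstack([scores, errors])       # new FCM input matrix
print(augmented.shape)                        # (3, 4): jobs x features
```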

3. Further Comparisons

To further evaluate the advantages and/or disadvantages of the proposed methodology, eight existing approaches, namely, statistical analysis, CBR [20], BPN, SOM-WM [7], EFR [21], SOM-FBPN [17], the postclassifying FBPN [25], and the bidirectional classifying BPN approach [26], were also applied to the collected data. Three performance measures, MAE, MAPE, and the minimal RMSE, were evaluated.

The proposed methodology was implemented on a PC with an Intel Dual CPU E2200 2.2 GHz and 2.0 GB of RAM. FCM was implemented with the Fuzzy Logic Toolbox of MATLAB 2006a. In addition, the BPN was implemented with the Neural Network Toolbox under the following conditions:

(1) number of epochs per replication: 10000;
(2) number of initial conditions/replications: 10;
(3) stop training if MSE < 10^-6 is satisfied or 10000 epochs have been run.

Among the steps, PCA and FCM can be done instantaneously. The training of BPN usually takes less than 1 minute per replication.

The performances with the nine approaches are compared and summarized in Table 12.

In statistical analysis, a linear regression equation is used to estimate the job cycle time. In the CBR approach, the weights of the factors (the cycle times of the previous cases) are proportional to the similarities of the new job with the previous cases, and the optimal value of the parameter in the CBR approach was the value that minimized the RMSE [8]. In the BPN approach, there was one hidden layer with 4~8 nodes, depending on the results of a preliminary analysis for establishing the best configuration; 3/4 of the collected data were used for training the BPN, while the remaining data were used for testing. In SOM-FBPN and SOM-WM, jobs were first classified with a SOM; subsequently, the examples of different categories were learned with different FBPNs of the same topology (or with WM). In EFR, jobs are classified using fuzzy partition. In the postclassifying FBPN approach, a job was not preclassified but rather postclassified after the estimation error had been generated; for this purpose, a BPN was used as the postclassification algorithm. In the bidirectional classifying approach, jobs are both preclassified and postclassified; the results of preclassification and postclassification are aggregated into a suitability index for each job, and each job is then assigned to the category for which its suitability index is the highest.

Statistical analysis was adopted as the comparison basis. According to the experimental results, the following points are made.

(1) The combination of BPN and PCA could reduce about 50% of the space for storing the input variables in the modeling of the wafer fabrication system.

(2) From the effectiveness viewpoint, the estimation accuracy (measured with the MAPE) of the proposed methodology was significantly better than those of the other approaches. The average advantage over statistical analysis is 80%.

(3) The standard deviation of the cycle time for this case is 100 hours. Compared with this, the accuracy of the proposed methodology is good.

(4) The estimation performance of the proposed methodology was also better than those of the existing classifying methods, such as SOM-WM, EFR, SOM-FBPN, the postclassifying FBPN, and the bidirectional classifying BPN approach. The advantage of the proposed methodology is attributable to the replacement of the variables and the iterative process of dealing with the outliers.

(5) In general, the performances of the preclassifying approaches are better than that of the postclassifying approach.

(6) The proposed methodology was also applied to other cases, and the results are summarized in Table 13. The Wilcoxon signed-rank test [42] was then used to determine whether the differences between the performance of the proposed methodology and those of the eight existing approaches are significant:

H0: when estimating the job cycle time, the estimation performance of the proposed methodology is the same as that of the existing approach being compared.

H1: when estimating the job cycle time, the estimation performance of the proposed methodology is better than that of the existing approach being compared.

The results are summarized in Table 14. The null hypothesis H0 was rejected at the chosen significance level, showing that the proposed methodology was superior to seven of the existing approaches in estimating the job cycle time.

(7) To ascertain the effect of each treatment taken in the proposed methodology, the performances of BPN, FCM-BPN, PCA-BPN, and PCA-FCM-BPN (the proposed methodology) are compared in Table 15. Obviously, job classification (FCM) contributed to the effectiveness of the proposed methodology, while the effect of variable replacement (PCA) alone was not obvious. The simultaneous application of the two treatments further improved the estimation accuracy for the testing data.
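The Wilcoxon signed-rank statistic used above can be sketched as follows; this is pure Python, ignores ties in the absolute differences, and the MAPE differences are hypothetical:

```python
# Wilcoxon signed-rank sketch: W+ is the sum of the ranks of the positive
# differences (existing-approach error minus proposed-methodology error).
# Ties in |d| are not handled in this sketch.
def signed_rank_w_plus(diffs):
    d = [x for x in diffs if x != 0]                  # drop zero differences
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)                           # rank by |d|
    return sum(r for r, x in zip(ranks, d) if x > 0)

# Hypothetical per-case MAPE differences for one pairwise comparison:
print(signed_rank_w_plus([1.2, -0.4, 2.1, 0.9, 3.0]))   # 14.0
```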

4. Conclusions and Directions for Future Research

Estimating the cycle time of each job in a wafer fabrication factory is a critical task for the factory and has been widely studied in recent years. To further enhance the accuracy of the job cycle time estimation, PCA is applied ahead of the FCM-BPN approach in this study, which is an innovative treatment in this field. Through replacing the variables, job classification can be made more accurate. In addition, the relationship between the factors and the cycle time can be more clearly specified.

On the other hand, since job classification is the core of the proposed methodology, a new index is used to validate the classification of jobs. The empirical relationship between the index value and the estimation performance is also found. Finally, an iterative process is established to deal with the outliers and to optimize the overall estimation performance.

An example is used to illustrate the proposed methodology. According to the experimental results:

(1) the estimation accuracy (measured with MAE, MAPE, and RMSE) of the proposed methodology was significantly better than those of the existing approaches;

(2) the advantage of PCA lies in improving the correctness of job classification; the simple combination of PCA and BPN does not show much advantage;

(3) after being combined with PCA, the estimation accuracy of FCM-BPN was significantly improved;

(4) the overall estimation performance is often affected by the outliers; the iterative procedure gradually removes the outliers and improves the overall estimation performance.

Some other issues on this topic can be further investigated. Most of the existing methods are based on job clustering, and this study aims to provide positive impacts on certain measures for these methods. Whether other variable replacement techniques can be as effective is also worth exploring in future studies. In addition, the iterative procedure used to optimize the results of job classification is quite time consuming, especially for a large-scale problem, and therefore a more efficient way should be found.

Acknowledgment

This work was supported by the National Science Council of Taiwan.