Abstract

Owing to the complexity of wafer fabrication, assigning a due date to each job is a challenging problem for production planners and schedulers. To tackle this problem, an effective fuzzy-neural approach is proposed in this study to improve the performance of internal due date assignment in a wafer fabrication factory. The proposed methodology includes several innovative treatments. First, principal component analysis (PCA) is applied to construct a series of linear combinations of the original variables to form new variables, so that these new variables are as uncorrelated with each other as possible and the relationship among the original variables is reflected in a better way. In addition, the simultaneous application of PCA, fuzzy c-means (FCM), and back propagation network (BPN) further improves the estimation accuracy. Subsequently, the iterative upper bound reduction (IUBR) approach is proposed to determine the allowance that is added to the estimated job cycle time. An applied case that uses data collected from a wafer fabrication factory illustrates this effective fuzzy-neural approach.

1. Introduction

Internal due date assignment is to quote an attractive but attainable due date for an arriving customer order. However, the completion time of an order is highly uncertain. It is therefore difficult to accurately forecast the completion time. For this reason, an allowance has to be added to the estimated completion time to reduce the risk [1].

Wafer fabrication is the most technologically complex step in semiconductor manufacturing, which exacerbates the difficulties of internal due date assignment [2]. In theory, this problem is NP-hard. That is why wafer fabrication is investigated in this study. Internal due date assignment in a wafer fabrication factory is difficult for the following reasons.

(1) Shop floor control in a wafer fabrication factory is a nontrivial task owing to the complexity of wafer fabrication. Some wafer fabrication processes are repeated processes. Thus, wafers need to visit a machine multiple times. An average job cycle time is several months with hundreds of hours of standard deviation. Many studies have shown that accurately predicting the cycle/completion times for such large systems is very difficult [1, 3, 4].

(2) In addition, the completion time predicted using existing approaches is generally unbiased. This means that if the internal due date is set to be equal to the mean of the estimated completion time, then the probability of on-time delivery is only about 50% on average. To reduce the risk, an allowance or fudge factor has to be added to the estimated completion time [5]. The due date allowance factor is determined on the basis of the feedback information about the factory status at the time a job arrives at the factory.

(3) Due date assignment, release control, and buffer control affect each other. Make-to-order wafer fabrication factories are confronted with both due date quotation and production scheduling problems at the same time [6]. If due date assignment and factory scheduling are processed separately by two systems, the overall performance is unlikely to be satisfactory because the two tasks are actually interrelated. Therefore, the interaction between due date assignment methods and scheduling rules in a wafer fabrication factory needs to be investigated.
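The 50% figure in reason (2) follows from the symmetry of an unbiased forecast error. As a rough illustration only, the sketch below assumes that forecast errors are normally distributed (an assumption made here for the example, not a claim of the study) and shows how the on-time probability grows with the allowance. The function name and the value of `sigma` are hypothetical.

```python
import math


def on_time_probability(allowance: float, sigma: float) -> float:
    """P(actual completion <= forecast + allowance) when the forecast error
    is assumed to follow N(0, sigma^2)."""
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(allowance / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    sigma = 100.0  # hypothetical standard deviation of the error, in hours
    for k in (0, 1, 2, 3):
        print(k, round(on_time_probability(k * sigma, sigma), 4))
```

With no allowance the probability is one half; each added multiple of the standard deviation buys a smaller gain, which is why the size of the allowance matters.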

To tackle these problems, some treatments have been carried out in the literature. First, various research works have been dedicated to estimating the cycle time using hybrid approaches. For example, Gupta and Sivakumar [7] presented look-ahead batch scheduling for the real-time control of due date objectives. Chen [8] proposed the look-ahead self-organization map (SOM)-fuzzy back propagation network (FBPN) approach for this purpose. A set of fuzzy inference rules was also developed to evaluate the achievability of a cycle time forecast. Subsequently, Chen et al. [1] added a selective allowance to the cycle time estimated using the look-ahead SOM-FBPN approach to determine the internal due date. Further, Chen [9] showed that the combination of SOM and FBPN could be improved by a minor adjustment of the classification results with the estimation error. Chen et al. [10, 11] proposed a postclassification fuzzy-neural approach in which a job was not preclassified but rather postclassified after estimating the cycle time. Experimental results showed that the postclassification approach was better than the preclassification approaches in some cases. To balance the influence of the preclassification results with that of the postclassification results, Chen [12] proposed a bidirectional classifying approach, in which jobs are not only preclassified but also postclassified. Ankenman et al. [13] proposed a metamodeling approach, which integrates discrete-event simulation, adaptive statistical methods, and analytical queueing analysis to quantify the cycle time-throughput relationship. Chien et al. [4] used nonlinear regression equations and then related the forecasting error to some factory conditions and job attributes with a back propagation network (BPN) to improve the forecasting accuracy. The major disadvantage of statistical analysis is the lack of forecasting accuracy [8].

Second, in traditional due date setting rules, the fudge factor is usually equal to a multiple of the standard deviation of the predicted cycle time [14]. Recently, Chen et al. [1] proposed a selective allowance policy in which the allowance was only assigned to some preselected jobs. In this way, the sum of the allowances added to all jobs was controlled. However, the probability of on-time delivery in Chen et al.'s study was only 77% for the testing data, showing that improving the probability of on-time delivery while controlling the fudge factor is a real challenge. In addition, the allowances assigned to the chosen jobs in that study were equal, leaving room for improvement. Another way of taking this issue into account is to construct a confidence interval containing the actual completion time [3]. The upper confidence limit sets the internal due date. However, the probability of a job being delivered on time is only 99.7% for the testing data, under the assumption that residuals follow a normal distribution. From another point of view, Chen and Wang [15] incorporated the fuzzy c-means (FCM)-BPN approach with a nonlinear programming (NLP) model to construct the inclusion interval of the predicted completion time. Similarly, the upper inclusion limit sets the internal due date. An inclusion interval is narrower than a confidence interval, and the probability of a job being delivered on time is 100%, at least for the training data. Chen and Lin [16] modified this approach by gathering a group of experts in related fields to set the due date in a collaborative way. Fuzzy intersection is applied to combine the due dates into a representative value.

The existing approaches have the following problems.

(1) Some factors used to forecast the job cycle time are dependent on each other, which may cause problems in classifying jobs and in fitting the relationship between the job cycle time and these factors.

(2) In Chen and Wang [15] and Chen and Lin [16], NLP models are solved to determine the upper bound of the job cycle time. However, the NLP models involve complicated constraints and are therefore difficult to solve. The NLP models become too large if many jobs are to be considered.

To tackle these problems, an effective fuzzy-neural approach is proposed in this study to improve the performance of internal due date assignment in a wafer fabrication factory. The literature provides probabilistic (stochastic) and fuzzy methods that can consider the uncertainty or randomness in the completion time. However, the longest average cycle time exceeds three months with a variation of more than 300 hours. Fitting the cycle time within a future month with a distribution function is not easy, implying that a stochastic approach might not be applicable. That is why a fuzzy approach is proposed in this study.

The effective fuzzy-neural approach has the following innovative characteristics.

(1) Variable replacement using principal component analysis (PCA): PCA uses orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables to reflect information in a better way.

(2) Updating the upper bound of the job cycle time using the iterative upper bound reduction (IUBR) approach: the IUBR approach is proposed to determine the upper bound of the completion time forecast. A tight upper bound means that the allowance assigned to a job is minimized.

Some recent works in this field are relevant. The differences between the proposed methodology and these methods are summarized in Table 1.

The remainder of this paper is organized as follows. Section 2 introduces the proposed methodology, which is composed of four steps. In Section 3, a practical example is used to validate the effectiveness of the proposed methodology, and its performance is evaluated and compared with those of some existing approaches. Finally, the concluding remarks and some directions for future research are given in Section 4.

2. Methodology

The operating procedure of the effective fuzzy-neural approach consists of several steps that will be described in the following sections.

Step 1. Forming new variables by constructing linear combinations of the original variables using PCA.

Step 2. Classifying jobs using fuzzy c-means (FCM).

Step 3. Forecasting the cycle times of jobs in each category using a BPN.

Step 4. Determining the upper bound of the cycle time using the IUBR approach.

A flow chart of the proposed methodology is shown in Figure 1.

2.1. Step 1: Forming New Variables Using PCA

First, PCA is used to replace the inputs to the BPN. PCA was invented by Pearson [18] as an analogue of the principal axes theorem in mechanics; it was later independently developed by Hotelling [19]. In the literature, there are more advanced applications of PCA. For example, Jaiswal et al. [20] used a hybrid of PCA and partial least squares for face recognition. In Mohtasham et al. [21], linear and exponential weighted PCA techniques based on spectral similarity were employed to predict the dye concentration in coloured fabrics. The operating procedure of PCA consists of several steps that are illustrated in Figure 2.

The references on the combination of PCA, FCM, and BPN are still very limited [17, 22, 23].
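To make the variable replacement of Step 1 concrete, the following minimal sketch applies PCA to just two correlated variables, for which the eigen-decomposition of the correlation matrix is known in closed form. It is an illustration under simplifying assumptions, not the procedure of Figure 2; the function names and the sample values are hypothetical.

```python
import math


def standardize(values):
    """Return z-scores computed with the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(z1, z2):
    """Sample correlation of two already standardized series."""
    return sum(a * b for a, b in zip(z1, z2)) / (len(z1) - 1)


def pca_two_variables(x1, x2):
    """PCA of two variables. The correlation matrix [[1, r], [r, 1]] has
    eigenvalues 1 + r and 1 - r with eigenvectors (1, 1)/sqrt(2) and
    (1, -1)/sqrt(2), so the component scores are scaled sums and differences.
    (For r > 0 the sum component carries the larger variance.)"""
    z1, z2 = standardize(x1), standardize(x2)
    r = correlation(z1, z2)
    s = math.sqrt(2.0)
    pc_sum = [(a + b) / s for a, b in zip(z1, z2)]   # variance 1 + r
    pc_diff = [(a - b) / s for a, b in zip(z1, z2)]  # variance 1 - r
    return r, pc_sum, pc_diff


if __name__ == "__main__":
    queue_length = [100.0, 120.0, 140.0, 160.0, 180.0]  # hypothetical data
    wip = [510.0, 580.0, 720.0, 790.0, 900.0]           # hypothetical data
    r, pc1, pc2 = pca_two_variables(queue_length, wip)
    print("correlation of the original variables:", round(r, 3))
    print("correlation of the new variables:", round(correlation(pc1, pc2), 12))
```

The two new variables are uncorrelated by construction, which is the property the methodology relies on before classification and fitting.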

2.2. Step 2: Classifying Jobs Using FCM

After employing PCA, examples are then classified using FCM. FCM is one of the most popular fuzzy clustering techniques because it is efficient, straightforward, and easy to implement. However, FCM is sensitive to initialization and is easily trapped in local optima.

The objective function of FCM is to minimize the weighted sum of squared distances such that the jobs in a category will be similar (or related) to one another and different from (or unrelated to) the jobs in other categories. In FCM, distances are measured with the Euclidean distance, and the weight of a job is a function of its membership. However, FCM requires prior knowledge about the number of clusters in the data, which may not be known for new data. Fuzzy clustering is then carried out through an iterative optimization of the objective function (see Figure 3). The clustering process stops when the maximum number of iterations is reached or the improvement in the objective function becomes negligible with more iterations. In addition, the S-index proposed by Xie and Beni [24] is used to determine the ideal number of categories automatically. Chen and Wang [23] found an empirical relationship between the S-index and the estimation performance.
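The clustering loop described above can be sketched as follows. This is a textbook form of FCM with a Xie-Beni style validity index, written under the assumption that it matches the study's formulation in spirit; the function names, the fuzziness exponent `m = 2`, and the sample points are hypothetical. Fixed initial centers are passed in because FCM is sensitive to initialization.

```python
import math


def _dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def _memberships(points, centers, m):
    """u[i][k] = d_ik^(-2/(m-1)) / sum_l d_il^(-2/(m-1))."""
    u = []
    for p in points:
        d = [_dist(p, c) for c in centers]
        if min(d) == 0.0:
            # The point coincides with a center: assign crisp membership.
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            u.append([h / sum(hits) for h in hits])
        else:
            inv = [x ** (-2.0 / (m - 1.0)) for x in d]
            u.append([v / sum(inv) for v in inv])
    return u


def fcm(points, centers, m=2.0, max_iter=100, tol=1e-9):
    """Fuzzy c-means from the given initial centers; returns (centers, u)."""
    centers = [tuple(c) for c in centers]
    previous = float("inf")
    for _ in range(max_iter):
        u = _memberships(points, centers, m)
        # Each center becomes the mean of all points weighted by u^m.
        centers = []
        for k in range(len(u[0])):
            w = [row[k] ** m for row in u]
            centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / sum(w)
                for j in range(len(points[0]))))
        objective = sum(u[i][k] ** m * _dist(p, c) ** 2
                        for i, p in enumerate(points)
                        for k, c in enumerate(centers))
        if abs(previous - objective) < tol:  # negligible improvement
            break
        previous = objective
    return centers, _memberships(points, centers, m)


def xie_beni(points, centers, u):
    """Compactness over separation (needs at least two centers);
    smaller values suggest a better number of categories."""
    compactness = sum(u[i][k] ** 2 * _dist(p, c) ** 2
                      for i, p in enumerate(points)
                      for k, c in enumerate(centers))
    separation = min(_dist(a, b) ** 2
                     for i, a in enumerate(centers)
                     for b in centers[i + 1:])
    return compactness / (len(points) * separation)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    c, u = fcm(pts, [(0.0, 0.0), (10.0, 10.0)])
    print(c, xie_beni(pts, c, u))
```

Running the index for several candidate numbers of categories and keeping the smallest value is one way to choose the number of categories automatically.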

2.3. Step 3: Forecasting the Cycle Times of Jobs in Each Category with a BPN

Subsequently, the jobs/examples of a category are learned with the same BPN. BPN is a popular tool with applications in a variety of fields. Nevertheless, different problems may require different parameter settings for a given network architecture. In the literature, researchers have used BPNs for estimating cycle times and assigning due dates. The configuration of the BPN is established as follows.

(1) Inputs: the new factors determined by PCA associated with the $j$th example/job. These factors have to be partially normalized so that their values fall within a prescribed range (see [10, 11]).

(2) Single hidden layer: generally one or two hidden layers are more beneficial for the convergence property of the BPN [25].

(3) The number of neurons in the hidden layer: from 1 up to a prescribed maximum. An increase in the number of hidden-layer nodes lessens the output errors for the training examples but increases the errors for novel examples. Such a phenomenon is often called "overfitting". There exist many different approaches, such as the pruning algorithm, the polynomial time algorithm, the canonical decomposition technique, and the network information criterion, for finding the optimal configuration of a BPN [26]. In addition, there has been some research considering the relation among the complexity of a BPN, the performance for the training data, and the number of examples, for example, using Akaike's information criterion (AIC) [27] or the minimum description length (MDL) [28].

(4) Activation/transformation function: there are a number of common activation/transformation functions, such as the identity function, binary step function, bipolar step function, sigmoid functions (binary sigmoid function and bipolar sigmoid function), and ramp function. In the proposed methodology, the binary sigmoid function is used:
$$f(x) = \frac{1}{1 + e^{-x}}.$$
Therefore, the output ranges between 0 and 1.

(5) Output ($o_j$): the (normalized) cycle time forecast of example $j$.

$o_j$ is compared with the normalized cycle time $a_j$, for which the root mean squared error (RMSE) is calculated:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{n} (o_j - a_j)^2}{n}}.$$
$o_j$ is derived by transforming the signal transferred to the output layer:
$$o_j = \frac{1}{1 + e^{-n_j^o}}, \tag{8}$$
where
$$n_j^o = I_j^o - \theta^o, \qquad I_j^o = \sum_{l} w_l^o h_{jl}. \tag{9}$$
Here $\theta^o$ is the threshold of the output node, $w_l^o$ is the connection weight between hidden-layer node $l$ and the output node, and $h_{jl}$ is the output of hidden-layer node $l$. Similarly, $h_{jl}$ is derived by transforming the signal transferred to the hidden layer:
$$h_{jl} = \frac{1}{1 + e^{-n_{jl}^h}},$$
where
$$n_{jl}^h = I_{jl}^h - \theta_l^h, \qquad I_{jl}^h = \sum_{p} w_{pl}^h x_{jp},$$
with $\theta_l^h$ the threshold of hidden-layer node $l$, $w_{pl}^h$ the connection weight between input $p$ and hidden-layer node $l$, and $x_{jp}$ the $p$th input of example $j$.
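The forward pass of such a three-layer network can be sketched in a few lines. The sketch assumes that each threshold is subtracted from the weighted sum before the binary sigmoid is applied, which is one common convention and is labeled here as an assumption; the function names, weights, and inputs are hypothetical.

```python
import math


def sigmoid(x):
    """Binary sigmoid; its value always lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, theta_hidden, w_out, theta_out):
    """Forward pass of a single-hidden-layer network.
    w_hidden[l][p] links input p to hidden node l; thresholds are
    subtracted from the weighted sums (assumed convention)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(weights, x)) - theta)
              for weights, theta in zip(w_hidden, theta_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) - theta_out)


def rmse(forecasts, actuals):
    """Root mean squared error between forecasts and actual values."""
    n = len(forecasts)
    return math.sqrt(sum((o - a) ** 2 for o, a in zip(forecasts, actuals)) / n)


if __name__ == "__main__":
    x = [0.2, 0.5, 0.8]                                   # three PCA inputs
    w_hidden = [[0.1 * (l + 1)] * 3 for l in range(6)]    # six hidden nodes
    theta_hidden = [0.0] * 6
    w_out = [0.3] * 6
    print(forward(x, w_hidden, theta_hidden, w_out, theta_out=0.5))
```

Training then amounts to adjusting the weights and thresholds so that the RMSE over the training examples decreases.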

Some algorithms are applicable for training a BPN in the backward phase, such as the gradient descent algorithms, the conjugate gradient algorithms, and the Levenberg-Marquardt algorithm. In this study, the Levenberg-Marquardt algorithm is applied. The Levenberg-Marquardt algorithm is one of the most widely used optimization algorithms for nonlinear least-squares problems. It outperforms simple gradient descent and conjugate gradient methods in a wide variety of problems. The Levenberg-Marquardt algorithm uses an approximation of the Hessian matrix and updates the network parameters in a Newton-like way, as described below.

The network parameters are placed in a vector $\beta$. The network output for example $j$ can be represented with $f(x_j, \beta)$. The objective function of the BPN is to minimize RMSE or, equivalently, the sum of squared errors (SSE):
$$\mathrm{SSE}(\beta) = \sum_{j=1}^{n} \bigl(a_j - f(x_j, \beta)\bigr)^2,$$
where $a_j$ is the normalized cycle time of example $j$.

The Levenberg-Marquardt algorithm is an iterative procedure. In the beginning, the user should specify the initial values of the network parameters. In each step, the parameter vector $\beta$ is replaced by a new estimate $\beta + \delta$, and the network output by its linearization. When the network converges, the gradient of the objective function will be zero. It should be noted that although the Levenberg-Marquardt method is in no way optimal and is just a heuristic, it works extremely well in practice.
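For reference, the update step can be written out as below. This is the standard textbook form of the Levenberg-Marquardt step, given as a sketch; the symbols are assumptions introduced here: $\beta$ is the parameter vector, $\delta$ the step, $f(x_j, \beta)$ the network output for example $j$, $a_j$ the normalized actual cycle time, $J$ the Jacobian of the outputs with respect to $\beta$ (row $J_j$ for example $j$), and $\lambda \ge 0$ a damping factor.

```latex
% Linearization of the network output around the current estimate
\[
  f(x_j, \beta + \delta) \approx f(x_j, \beta) + J_j \delta
\]
% Damped normal equations solved for the step delta in each iteration
\[
  \bigl(J^{\mathsf{T}} J + \lambda I\bigr)\,\delta
  = J^{\mathsf{T}} \bigl(a - f(\beta)\bigr)
\]
```

A small $\lambda$ makes the step close to a Gauss-Newton step, while a large $\lambda$ makes it close to a short gradient descent step, which is why the method is often described as a blend of the two.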

2.4. Step 4: Establishing the Upper Bound for the Job Cycle Time Using the IUBR Approach

In order to apply the BPN obtained at the previous step to determine the internal due date of a job, the parameter values in the BPN must be adjusted. To this end, in Chen and Wang [15] and Chen and Lin [16], the NLP model is constructed to adjust the connection weights and thresholds in the BPN, which is not easy to solve. In the IUBR approach, only the threshold of the output node will be adjusted in an iterative way. This way is much simpler and can also achieve satisfactory results.

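With $o_j$ denoting the output of the BPN for job $j$, $I_j^o$ the signal transferred to the output layer, and $\theta^o$ the threshold of the output node, substituting (9) into (8) gives
$$o_j = \frac{1}{1 + e^{-(I_j^o - \theta^o)}}.$$
Therefore,
$$e^{-(I_j^o - \theta^o)} = \frac{1}{o_j} - 1.$$
So
$$I_j^o - \theta^o = -\ln\left(\frac{1}{o_j} - 1\right). \tag{15}$$

Assume that the adjustment made to the threshold of the output node is indicated as $\Delta\theta^o$. After adjustment, the output from the new BPN, $o_j'$, determines the upper bound of the cycle time:
$$o_j' = \frac{1}{1 + e^{-n_j^{o\prime}}}, \tag{16}$$
where
$$n_j^{o\prime} = I_j^o - (\theta^o + \Delta\theta^o). \tag{17}$$
Substituting (17) into (16),
$$o_j' = \frac{1}{1 + e^{-(I_j^o - \theta^o)}\, e^{\Delta\theta^o}}. \tag{18}$$
Substituting (15) into (18),
$$o_j' = \frac{1}{1 + \left(\frac{1}{o_j} - 1\right) e^{\Delta\theta^o}}. \tag{19}$$
Obviously, $o_j'$ decreases as $\Delta\theta^o$ increases, so the maximum of $\Delta\theta^o$ establishes the lowest upper bound.

Since $o_j'$ is the upper bound of the cycle time,
$$o_j' \ge a_j,$$
where $a_j$ is the normalized cycle time of job $j$. Combining this requirement with (19),
$$\Delta\theta^o \le \ln\left(\frac{1}{a_j} - 1\right) - \ln\left(\frac{1}{o_j} - 1\right). \tag{21}$$
Equation (21) holds for all jobs, so
$$\Delta\theta^o \le \min_j \left[\ln\left(\frac{1}{a_j} - 1\right) - \ln\left(\frac{1}{o_j} - 1\right)\right].$$
According to (19), the optimal value of $\Delta\theta^o$ should be set to the maximum possible value:
$$\Delta\theta^{o*} = \min_j \left[\ln\left(\frac{1}{a_j} - 1\right) - \ln\left(\frac{1}{o_j} - 1\right)\right].$$

The optimization results of the BPN are sensitive to the initial conditions and may be different for each iteration. Assume that the optimal value of $\Delta\theta^o$ in the $t$th iteration is indicated with $\Delta\theta^{o*}(t)$. After some iterations, the upper bound of the cycle time is decreased gradually in this way (see Figure 4). Another merit of the IUBR approach is that it does not rely on the parameters of the BPN.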
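A minimal sketch of the threshold adjustment is given below. It assumes a binary sigmoid output node whose threshold is subtracted from the incoming signal, so that raising the threshold lowers the output, and it takes normalized forecasts and actual values strictly between 0 and 1. Keeping, for each job, the lowest bound found over several retrained networks is one plausible reading of the iterative part and is an assumption, as are all function names and numbers.

```python
import math


def _log_odds_gap(v):
    """ln(1/v - 1) for a normalized value v strictly between 0 and 1."""
    return math.log(1.0 / v - 1.0)


def threshold_adjustment(forecasts, actuals):
    """Largest change to the output-node threshold for which every
    adjusted output still stays at or above the actual value."""
    return min(_log_odds_gap(a) - _log_odds_gap(o)
               for o, a in zip(forecasts, actuals))


def adjusted_outputs(forecasts, delta):
    """Outputs of the network after shifting the output threshold by delta."""
    return [1.0 / (1.0 + (1.0 / o - 1.0) * math.exp(delta)) for o in forecasts]


def iubr(forecast_runs, actuals):
    """forecast_runs holds one list of forecasts per retrained network.
    For each job the lowest upper bound seen so far is kept (assumed rule)."""
    best = None
    for forecasts in forecast_runs:
        delta = threshold_adjustment(forecasts, actuals)
        bounds = adjusted_outputs(forecasts, delta)
        best = bounds if best is None else [min(b, c) for b, c in zip(best, bounds)]
    return best


if __name__ == "__main__":
    actuals = [0.55, 0.58, 0.47]                        # hypothetical values
    runs = [[0.50, 0.60, 0.40], [0.45, 0.65, 0.50]]     # hypothetical forecasts
    print(iubr(runs, actuals))
```

In every run at least one job has a bound equal to its actual value, which is what makes the resulting allowance as small as this single-parameter adjustment permits.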

2.5. Ensemble Learning

Ensemble learning is based on the notion of perturbing and combining. An ensemble consists of a collection of artificial neural networks (ANNs) and combines their predictions to obtain a final prediction. In FCM, a job can be classified into several categories to different degrees. In theory, the BPNs of all categories can be applied to predict the cycle time of a job. The forecasts obtained by using the BPNs may not be the same and need to be aggregated. To this end, some treatments have been carried out in the literature.

(1) Linear aggregation [29]: the cycle times of a job estimated by the BPNs of the individual categories are combined linearly.

(2) BPN aggregation [29]: the memberships and cycle time forecasts of a job are fed into another BPN to be aggregated.

(3) Generalized average method [30]: in FCM, the error is proportional to the distance to the center. For this reason, a generalized average of the forecasts is a natural way to aggregate them.
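As an illustration of the linear case, the sketch below combines the per-category forecasts of one job with weights equal to its memberships. Using the memberships as weights, and optionally ignoring categories below a membership threshold, are assumptions made for the example; the function name and numbers are hypothetical.

```python
def linear_aggregation(memberships, forecasts, threshold=0.0):
    """Membership-weighted average of the cycle time forecasts that the
    BPNs of the individual categories give for a single job. Categories
    whose membership is below `threshold` are ignored."""
    pairs = [(mu, f) for mu, f in zip(memberships, forecasts) if mu >= threshold]
    total = sum(mu for mu, _ in pairs)
    return sum(mu * f for mu, f in pairs) / total


if __name__ == "__main__":
    # Hypothetical job belonging to three categories to different degrees.
    print(linear_aggregation([0.6, 0.3, 0.1], [1000.0, 1200.0, 2000.0]))
    print(linear_aggregation([0.6, 0.3, 0.1], [1000.0, 1200.0, 2000.0], threshold=0.3))
```

Dividing by the sum of the retained memberships keeps the result inside the range of the individual forecasts even when some categories are dropped.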

3. Application and Analyses

To demonstrate the application of the proposed methodology, a real case with the data of 40 jobs from a wafer fabrication factory located in Taichung City Scientific Park, Taiwan (see Table 2), was used, where the six inputs stand for the job size, factory utilization, the queue length on the route, the queue length before the bottleneck, the work in progress (WIP), and the average waiting time. The wafer fabrication factory produces more than ten products and has a monthly capacity of 20,000 wafers. The wafer fabrication processes include photolithography, thermal processes, implantation, chemical vapor deposition, etching, physical vapor deposition, chemical mechanical polishing, process diagnostics and control, and cleaning. The production characteristic of "reentry," which is highly relevant to the semiconductor industry, is clearly reflected in this problem. It also shows the difficulties facing production planners and schedulers who attempt to provide an accurate due date for a product with a very complicated routing.

The standard deviations of the six inputs are compared in Figure 5. Note that the variability in three of the inputs is substantially higher than that in the remaining variables.

Subsequently, we standardize the data (see Table 3) and obtain the correlation matrix. The eigenvalues and eigenvectors of the correlation matrix are then calculated. Based on them, the variance contribution rates are derived and then summed cumulatively. After conducting a Pareto analysis, the number of principal components is chosen as 3 to meet the requirement on the cumulative variance contribution rate. The first three principal components explain roughly 80% of the total variability in the standardized data, so this is a reasonable way to reduce the dimensions and to visualize the data.
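The selection of the number of principal components can be sketched as follows. The eigenvalues below are hypothetical placeholders, not the values of the case study, and the 80% requirement is assumed from the "roughly 80%" figure mentioned in the text; the function name is also hypothetical.

```python
def choose_components(eigenvalues, required=0.8):
    """Return (q, rates): the smallest number of leading components whose
    cumulative variance contribution rate reaches `required`, and the
    individual rates in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [ev / total for ev in ordered]
    cumulative = 0.0
    for q, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= required:
            return q, rates
    return len(ordered), rates


if __name__ == "__main__":
    # Hypothetical eigenvalues of a 6 x 6 correlation matrix (they sum to 6).
    q, rates = choose_components([2.4, 1.5, 1.0, 0.6, 0.3, 0.2])
    print(q, [round(r, 3) for r in rates])
```

For a correlation matrix the eigenvalues sum to the number of variables, so each rate is simply the eigenvalue divided by six in this setting.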

Subsequently, the component scores are calculated (see Table 4), which contain the coordinates of the original data in the new coordinate system defined by the principal components, and will be used as the new inputs to the FCM-BPN.

Subsequently, jobs are classified using FCM based on the new variables. The results of the S-test are summarized in Table 5. In this case, the optimal number of job categories was 5. However, there would then be some categories with very few jobs. For this reason, the second best solution, that is, 4 categories, is used, with the threshold of membership set to 0.3. The classification results are shown in Table 6.

After preclassification, the three-layer BPN of each category was applied to predict the cycle times of jobs belonging to the category according to the new variables. Different network architectures were evaluated to compare the forecasting performance. The selected network, that is, the architecture that presented the best forecasting accuracy, is composed of three input neurons, six hidden neurons, and one output neuron.

The convergence condition in training networks was established as follows: either the improvement in MSE becomes less than a preset threshold with one more epoch, or 1000 epochs have already been run. Three-quarters of the adopted examples in each category are fed as "training examples" into the BPN. The remaining quarter is left for testing. For example, category 3 has 8 jobs; 6 of them are randomly chosen for training the BPN while the remaining 2 jobs are left for testing. The forecasting accuracy can be evaluated with mean absolute error (MAE), mean absolute percentage error (MAPE), and RMSE. The forecasting performances are summarized in Table 7. The forecasting results are shown in Figure 6. The performance of the proposed methodology is compared with those of statistical analysis (i.e., multiple linear regression), BPN, FCM-BPN, and PCA-BPN in Table 8. The nonlinear nature of this problem is obvious since the performance of statistical analysis (a linear approach) is poor. In addition, the simple combination of PCA and BPN does not have much effect. The main effect of PCA is to improve the correctness of job classification, as mentioned in Chen and Wang [23].

Subsequently, the IUBR approach is applied to determine the upper bound of the cycle time. The upper bounds of the cycle times obtained in the first iteration are shown in Figure 7.
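The three accuracy measures can be computed as in the sketch below, which uses their usual definitions; the function names and the two sample jobs are hypothetical.

```python
import math


def mae(actuals, forecasts):
    """Mean absolute error, in the unit of the data (hours here)."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)


def rmse(actuals, forecasts):
    """Root mean squared error; penalizes large errors more than MAE does."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals))


if __name__ == "__main__":
    actual_cycle_times = [1000.0, 1200.0]   # hypothetical, in hours
    forecast_cycle_times = [1010.0, 1170.0]
    print(mae(actual_cycle_times, forecast_cycle_times),
          mape(actual_cycle_times, forecast_cycle_times),
          rmse(actual_cycle_times, forecast_cycle_times))
```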

The process stops after five iterations because the upper bounds remain unchanged after the fifth iteration. The results of the five iterations are summarized in Table 9. The allowances added to the cycle times derived from these results are 25, 33, 54, 48, 56, 57, 58, 44, 54, 53, 42, 46, 55, 48, 47, 50, 48, 41, 36, 39, 30, 41, 31, 24, 34, 29, 27, 13, 37, 24, 47, 15, 34, 50, 44, 49, 53, 50, 53, and 53 hours, with an average of 42 hours. The due date of a job is then set to the release time plus the upper bound of the cycle time.
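The reported average can be checked directly from the forty allowances listed above, and the due date rule is a one-line computation. The helper name `due_date` and the release time and upper bound in the usage lines are hypothetical.

```python
# Allowances (hours) of the 40 jobs as listed in the text.
ALLOWANCES = [25, 33, 54, 48, 56, 57, 58, 44, 54, 53,
              42, 46, 55, 48, 47, 50, 48, 41, 36, 39,
              30, 41, 31, 24, 34, 29, 27, 13, 37, 24,
              47, 15, 34, 50, 44, 49, 53, 50, 53, 53]


def mean_allowance(allowances):
    """Average allowance over all jobs."""
    return sum(allowances) / len(allowances)


def due_date(release_time, cycle_time_upper_bound):
    """Internal due date = release time + upper bound of the cycle time."""
    return release_time + cycle_time_upper_bound


if __name__ == "__main__":
    print(len(ALLOWANCES), mean_allowance(ALLOWANCES))
    print(due_date(release_time=100.0, cycle_time_upper_bound=1250.0))
```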

To make a comparison, six other allowance determination policies are also applied to the collected data.

(1) Total work content policy (TWK): in TWK, the due date allowance factor is estimated from historical data with a regression model. There is another product in the wafer fabrication factory with an average cycle time of 1278 hours. The total processing time and cycle time standard deviation of the product are 317 and 87 hours, respectively. This product was adopted as the comparison basis for determining the cycle time forecast and allowance.

(2) Gamma distribution fitting method (Gamma): the waiting time of a job is fitted with a Gamma distribution. For example, the waiting time of a job with 24 wafers is fitted with a Gamma distribution in Figure 8. The 50% and 95% percentiles are 929 and 1160 hours, respectively, and the total processing time is 251 hours. So the cycle time forecast is 1160 + 251 = 1411 hours, and the allowance is determined accordingly.

(3) Constant allowance policy (CON, PCA-FCM-BPN + CON): add three times the RMSE of the prediction approach to the completion time forecasts to determine the due date.

(4) Selective allowance policy (SAP, PCA-FCM-BPN + SAP): add three times the RMSE of the prediction approach to the completion time forecasts of a small quantity of jobs that might encounter difficulties in keeping the internal due date. Such jobs are chosen by a percentile criterion; in other words, these jobs are among the 50% percentiles.

(5) Random assignment policy (RAP, PCA-FCM-BPN + RAP): add the extra allowance to the completion time forecasts of the same quantity of jobs, chosen at random.

(6) No allowance policy (NAP, PCA-FCM-BPN + NAP): no allowance is assigned to any job.

Due date related performances are affected by the quality of the due date assignment methods. After applying the seven allowance determination policies, the following performance measures are compared:

(1) number of tardy jobs;

(2) mean tardiness;

(3) sum of allowances.

The comparison results are summarized in Table 10. The proposed IUBR approach outperforms the other allowance determination policies.

(1) It guarantees the on-time delivery of the jobs: both the number of tardy jobs and the mean tardiness are zero. Among the other allowance determination policies, only Gamma and CON achieve that, at the expense of adding extra allowance.

(2) The reduction in the sum of allowances over CON is 52%. The advantages over TWK, Gamma, SAP, and RAP are 79%, 74%, 12%, and 12%, respectively. The high percentage of on-time delivery is therefore not derived from a greater buffer on the completion time prediction.

(3) The performance of SAP is not better than that of RAP, which shows that it is not easy to anticipate jobs that may be delayed.

(4) Compared with TWK and Gamma, the other policies effectively reduce the allowances added to the job cycle times, which is due to the forecasting accuracy of the PCA-FCM-BPN approach.
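The three performance measures can be computed as in the following sketch. Averaging tardiness over all jobs (rather than over tardy jobs only) is an assumption made for the example, and the function name and data are hypothetical.

```python
def due_date_performance(completion_times, due_dates, allowances):
    """Return (number of tardy jobs, mean tardiness, sum of allowances).
    Tardiness of a job is max(0, completion time - due date); the mean is
    taken over all jobs (assumed definition)."""
    tardiness = [max(0.0, c - d) for c, d in zip(completion_times, due_dates)]
    n_tardy = sum(1 for t in tardiness if t > 0.0)
    mean_tardiness = sum(tardiness) / len(tardiness)
    return n_tardy, mean_tardiness, sum(allowances)


if __name__ == "__main__":
    # Hypothetical jobs under a constant allowance of 30 hours each.
    forecasts = [990.0, 1120.0, 1200.0]
    allowances = [30.0, 30.0, 30.0]
    due_dates = [f + a for f, a in zip(forecasts, allowances)]
    completions = [1000.0, 1100.0, 1250.0]
    print(due_date_performance(completions, due_dates, allowances))
```

Reporting the sum of allowances alongside the two tardiness measures shows whether on-time delivery was bought with a large buffer or with accurate forecasts.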

4. Conclusions and Directions for Future Research

Owing to the complexity of wafer fabrication, assigning a due date to each job is a challenging problem for production planners and schedulers. The firm has to offer a price reduction if the due date is far away from the expected one. Conversely, the looser the due date is set, the higher the probability that the job will be completed or delivered on time. That is very important for maintaining a good reputation with the customers. This study explores a new application of fuzzy-neural approaches in the due date assignment problem of the wafer fabrication factory. The proposed methodology decomposes internal due date assignment in a wafer fabrication factory into two subproblems: completion time prediction and allowance determination. To overcome the problems with the existing approaches, two innovative treatments are taken in the proposed methodology. First, PCA is applied to construct a series of linear combinations of the original variables to form new variables, so that these new variables are as uncorrelated with each other as possible and the relationship among the original variables is reflected in a better way. The combination of PCA and BPN also reduces the space for storing the input variables in the modeling of the wafer fabrication system. In addition, the simultaneous application of PCA, FCM, and BPN further improved the estimation accuracy. Subsequently, the IUBR approach is proposed to determine the allowance that will be added to the estimated job cycle time. Our result is existentially tight.

The case study has demonstrated that the effective fuzzy-neural approach for internal due date assignment is able to improve on-time delivery. Based on the above analysis, the following conclusions are drawn.

(1) The forecasting accuracy (measured with MAE, MAPE, and RMSE) of the PCA-FCM-BPN was significantly better than those of many existing approaches.

(2) It is easier to determine the allowance with the IUBR method than with the method based on NLP.

(3) The bound on the job cycle time is tighter than the bounds by TWK, Gamma, and CON and simpler than the bound by Chen and Wang [15], which requires NLP optimization.

However, there are two limitations that need to be acknowledged and addressed regarding the present study.

(1) The first limitation concerns the experimental nature of this research. The proposed methodology was studied within a short period of time. There is an apparent danger involved whenever conclusions are drawn from such a limited sample and then applied in the highly dynamic semiconductor manufacturing environment.

(2) The BPN part in the methodology is usually regarded as a black box. To exploit the knowledge embedded in the black box, and to facilitate the practical application of the proposed methodology, some association rules have to be extracted from the estimation results.

The IUBR approach only modifies the threshold of the output node. In future studies, other parameters in the BPN can be modified in similar ways. However, it is a challenge to make the modification results independent of the original parameter values. In addition, the concept of customer satisfaction can be incorporated into the proposed methodology; thereby, the due date can achieve a higher level of customer satisfaction. In contrast, the proposed methodology only guarantees a positive level of customer satisfaction.

Acknowledgment

This study was financially supported by the National Science Council of Taiwan.