Abstract

This study proposes a slack-diversifying fuzzy-neural rule to improve job dispatching in a wafer fabrication factory. The proposed methodology applies several soft computing techniques, including fuzzy classification and artificial neural network prediction: a highly effective fuzzy-neural approach estimates the remaining cycle time of each job. This research also presents empirical evidence of the relationship between the estimation accuracy and the scheduling performance. Because dynamically maximizing the standard deviation of the schedule slack has been shown to improve performance, this work applies such maximization to a slack-diversifying fuzzy-neural rule derived from the two-factor tailored nonlinear fluctuation smoothing rule for mean cycle time (2f-TNFSMCT). The proposed rule was validated with a simulated case, which provided evidence of its effectiveness. The findings also point to several directions that can be exploited in the future.

1. Introduction

Wafer fabrication is one of the most important steps in semiconductor manufacturing [1]. Wafer fabrication factories exist all over the world; many are in Taiwan. The production operations of a wafer fabrication factory are very expensive, and the factory must be fully utilized to stay in business; at the same time, its capacity must not substantially exceed the demand. Factory managers must therefore plan the use of the existing capacity to shorten the cycle time, and maximizing the product turnover rate is an important goal. In this regard, scheduling is undoubtedly a very useful tool. Kim et al. [2] simultaneously considered three issues: release control, mask scheduling, and batch scheduling. However, Chen et al. [3-5] noted that job dispatching is a very difficult task in a wafer fabrication factory. Traditionally, a scheduling problem is formulated as a mathematical programming problem, whose optimal solution gives the optimal schedule of the manufacturing system. However, the mathematical programming problem of scheduling a wafer fabrication factory is large and effectively intractable. In practice, many wafer fabrication factories suffer from lengthy cycle times and cannot speed up their deliveries to their customers.

Semiconductor manufacturing can be divided into four stages: wafer fabrication, wafer probing, packaging, and final testing. The most important and most time-consuming stage is wafer fabrication. This study investigates job dispatching for this stage.

An effective scheduling and dispatching algorithm is an urgent necessity for wafer fabrication. This field includes many different methods, including dispatching rules, heuristics, data mining-based approaches [6, 7], agent technologies [6, 8-10], and simulation. The prevalent methods for practical applications include dispatching rules (e.g., first-in first-out (FIFO), earliest due date (EDD), least slack (LS), shortest processing time (SPT), shortest remaining processing time (SRPT), critical ratio (CR), the fluctuation smoothing rule for the mean cycle time (FSMCT), the fluctuation smoothing rule for cycle time variation (FSVCT), FIFO+, SRPT+, and SRPT++), all of which have received a lot of attention over the last few years [6-8]. For details on traditional dispatching rules, please refer to Lu et al. [11].

Recently, Chen [24] proposed the one-factor tailored nonlinear fluctuation smoothing rule for mean cycle time (1f-TNFSMCT), which contains an adjustable parameter that allows it to be customized for a target wafer fabrication factory. Chen et al. [12] then proposed the two-factor tailored nonlinear fluctuation smoothing rule for mean cycle time (2f-TNFSMCT), which outperformed four existing rules in scheduling a wafer fabrication factory.

Magnifying the difference in the slack seems to improve scheduling performance, especially with respect to the average cycle time. To exploit this advantage, Wang et al. [13] derived the slack-diversifying nonlinear fluctuation smoothing rule by diversifying the slack in the 1f-TNFSVCT rule. To extend this advantage and to enhance the scheduling of wafer fabrication factories, a slack-diversifying fuzzy-neural rule is proposed in this study. The objective function includes both the average and the standard deviation of the cycle time.

The proposed methodology applies several soft computing techniques, including fuzzy classification and artificial neural network prediction. First, the remaining cycle time of a job needs to be estimated by the slack-diversifying fuzzy-neural rule. In this research, an innovative and highly effective fuzzy-neural method estimates the remaining cycle time of a job. The fuzzy-neural approach is based on the fuzzy c-means and back propagation network (FCM-BPN) approach [14]. According to Chen and Wang [4], improvements to the accuracy of remaining cycle time estimation can significantly improve the scheduling performance of a fluctuation smoothing rule. In the original study, Chen and Wang used a time-consuming and not very accurate gradient search algorithm to train the BPN. In this study, we use the Levenberg-Marquardt algorithm [15] to achieve the same purpose; it is more efficient and more accurate than the algorithm in Chen and Wang’s study. In addition, we also found some empirical evidence regarding the relationship between the estimation accuracy and the scheduling performance.

The slack-diversifying nonlinear fluctuation smoothing rule is modified from 2f-TNFSMCT [12]; the new rule maximizes the difference in the slack, as measured by the standard deviation of the slack. Slack is a fuzzy concept, and in this study it is defined in a way that is conducive to scheduling performance. The factor values for achieving this must be determined, and that calculation turns out to be a complex optimization problem. We applied a polynomial fitting technique to convert it into a more tractable form for which several optimal solutions can be found. After screening some values out of the specified range, we used the remaining values to construct an optimized 2f-TNFSMCT rule. It is possible that some jobs have very large or small slacks, which will distort the optimization results. For this reason, such jobs are excluded. Further, the values of parameters influence the range of the slack. For a fair comparison, the range of the slack should be considered to determine the optimal parameter values. The differences between the proposed methodology and Wang et al.’s method are summarized in Table 1.

This paper is arranged as follows. Section 2 reviews the existing approaches to scheduling a wafer fabrication factory. Section 3 provides the details of the proposed methodology. In Section 4, a simulated case is used to validate the effectiveness of the slack-diversifying fuzzy-neural rule. The performance levels of some of the existing rules in this field are also tested with the simulated data. Finally, we draw our conclusions in Section 5 and discuss some worthwhile topics for future work.

2. Literature Review

Some earlier work in this field is relevant. Mönch et al. [16] classified the scheduling problems in a semiconductor manufacturing factory into six categories: batch scheduling problems, parallel machine scheduling problems, job shop scheduling problems, scheduling problems with auxiliary resources, multiple orders per job scheduling problems, and scheduling problems related to cluster tools. Yao et al. [17] presented a decentralized multiobjective scheduling methodology for semiconductor manufacturing, in which global objectives were decentralized into local objectives of workstations. Lee et al. [18] adopted Petri nets to accurately model semiconductor manufacturing activities; by representing the token movements in a Petri net with the well-established scheduling model for batch chemical processes, the optimal schedule of the given semiconductor process could be determined accordingly. Yugma et al. [19] proposed an efficient heuristic algorithm based on iterative sampling and simulated annealing for solving a complex batching and scheduling problem in the diffusion area of a semiconductor plant. Altendorfer et al. [20] proposed the work in parallel queue (WIPQ) rule to maximize throughput with a low level of work in process (WIP). Zhang et al. [21] proposed the dynamic bottleneck detection (DBD) approach, which classifies workstations into several categories and then applies different dispatching rules to these categories; they used three dispatching rules, including FIFO, the shortest processing time until the next bottleneck (SPNB), and CR. In view of the uncertainty in the classification of workstations, Chen [22] proposed the fuzzy DBD approach.

Considering the current conditions in a wafer fabrication factory, Hsieh et al. [7] chose one approach from FSMCT, FSVCT, largest deviation first (LDF), one step ahead (OSA), and FIFO. Chen [23] modified FSMCT and proposed the nonlinear FSMCT (NFSMCT) rule, in which he smoothed the fluctuation in the estimated remaining cycle time and balanced it with that of the release time or the mean release rate. To diversify the slack, he applied the "division" operator. This was followed by Chen [24], who proposed the one-factor tailored NFSMCT (1f-TNFSMCT) rule and the one-factor tailored nonlinear FSVCT (1f-TNFSVCT) rule. Both rules contain adjustable parameters that allow them to be customized for a target wafer fabrication factory. Dabbas and Fowler [25] and Dabbas et al. [26] combined several dispatching rules into a single rule by forming their linear combination with relative weights; however, that research lacked a systematic procedure for determining the weights of those rules. In a multiple-objective study, Chen and Wang [27] proposed a biobjective nonlinear fluctuation smoothing rule with an adjustable factor (1f-biNFS) to optimize both the average cycle time and the cycle time variation at the same time. More degrees of freedom seem to enhance the performance of customizable rules. For this reason, Chen et al. [14] extended 1f-biNFS to a biobjective fluctuation smoothing rule with four adjustable factors (4f-biNFS). One drawback of these rules is that only static factors are used, and they must be determined in advance. To determine the factors, most studies (e.g., [14, 23, 24, 27]) performed extensive simulations, which is not only time-consuming but also fails to consider a sufficient number of factor combinations.

Chen [28] established a mechanism that was able to adjust the values of the factor in 1f-biNFS dynamically (dynamic 1f-biNFS). However, even though satisfactory results were obtained in his experiment, there was no theoretical basis supporting the proposed mechanism. Chen [29] attempted to relate the scheduling performance to the factor values using a back propagation network (BPN). If that had worked, then the factor values contributing to the optimal scheduling performance could have been found. However, the explanatory ability of the BPN was not sufficient.

At the same time, Chen [28] stated that a nonlinear fluctuation smoothing rule uses the division operator instead of the subtraction operator, which diversifies the slack and makes the nonlinear fluctuation smoothing rule more responsive to changes in its parameters. Chen and Wang [27] proved that the effects of the parameters are better balanced in a nonlinear fluctuation smoothing rule than in a traditional one if the variation in the parameters is large.

3. Methodology

The variables and parameters that will be used in the proposed methodology are defined as follows:

(1) R_n: the release time of job n, n = 1, ..., N;
(2) RCT_nj: the estimated remaining cycle time of job n from step j;
(3) SK_nj: the slack of job n at step j;
(4) λ: the mean release rate;
(5) x_np: the pth input to the three-layer BPN for job n, p = 1, ..., P;
(6) h_q: the output from hidden-layer node q, q = 1, ..., Q;
(7) w_q^o: the connection weight between hidden-layer node q and the output node;
(8) w_pq^h: the connection weight between input node p and hidden-layer node q, p = 1, ..., P; q = 1, ..., Q;
(9) θ_q^h: the threshold on hidden-layer node q;
(10) θ^o: the threshold on the output node.

The proposed methodology includes the following six steps.

Step 1. Normalize the collected data [30].

Step 2. Use FCM to classify jobs. The required inputs for this step are the job attributes. To determine the optimal number of categories, we use the S test [31]. The output of this step is the category of each job.

Step 3. Use the BPN approach to estimate the remaining cycle time of each job. Jobs of different categories will be sent to different three-layer BPNs. The inputs to the three-layer BPN include the attributes of a job, while the output is the estimated remaining cycle time of the job.

Step 4. Derive the 2f-TNFSMCT rule.

Step 5. Diversify the slack in the 2f-TNFSMCT rule.

Step 6. Incorporate the estimated remaining cycle time into the new rule.

The flowchart of the proposed methodology is shown in Figure 1.

Table 2 is used to compare the proposed methodology with the existing methods.

The remaining cycle time of a job being produced in a wafer fabrication factory is the time still needed to complete the job (see Figure 2). If the job has just been released into the wafer fabrication factory, then the remaining cycle time of the job is its cycle time [32-38]. Tai et al. [32] provided a statistical approach to calculate the cycle time for multilayer semiconductor final testing, involving the sum of multiple Weibull-distributed waiting times. The remaining cycle time is an important attribute (or performance measure) for the WIP in the wafer fabrication factory. We need to estimate the remaining cycle time for each job because the remaining cycle time is an important input to the scheduling rule. Past studies (e.g., [12]) have shown that the accuracy of remaining cycle time estimation can be improved by job classification. Soft computing methods (e.g., [4, 12, 37, 38]) have received much attention in this regard.

3.1. Step 1: Normalize the Collected Data

First, in order to facilitate the subsequent calculations and problem solving, all collected data are normalized into a fixed interval [38] as follows:

N(x) = lb + (x - min(x)) * (ub - lb) / (max(x) - min(x)),

where N(x) is the normalized value of x, and lb and ub indicate the lower and upper bounds of the range of the normalized value, respectively; min(x) and max(x) are the minimum and maximum of x, respectively. The formula can be inverted as

x = min(x) + (N(x) - lb) * (max(x) - min(x)) / (ub - lb)

if the unnormalized value is to be obtained. Then, we place the (normalized) attributes of job n in vector x_n. To illustrate the proposed methodology, a real case containing the data of 35 jobs was used. For each job, twelve attributes were collected from the reports and databases of a production management information system (PROMIS). After a backward elimination by regression analysis, the six attributes that were the most influential on the job cycle time were chosen (see Table 3).

The results of partial normalization are shown in Table 4.
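Since the normalization is a simple min-max mapping, it can be sketched in a few lines of Python. The interval bounds lb = 0.1 and ub = 0.9 below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def normalize(x, lb=0.1, ub=0.9):
    """Min-max normalize raw values x into [lb, ub]."""
    x = np.asarray(x, dtype=float)
    xmin, xmax = x.min(), x.max()
    return lb + (x - xmin) * (ub - lb) / (xmax - xmin)

def denormalize(n, xmin, xmax, lb=0.1, ub=0.9):
    """Invert the normalization, given the original min and max."""
    return xmin + (np.asarray(n, dtype=float) - lb) * (xmax - xmin) / (ub - lb)
```

For example, `normalize([10, 20, 30])` maps the minimum to lb, the midpoint to the middle of the interval, and the maximum to ub, and `denormalize` recovers the raw values.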

3.2. Step 2: Classify Jobs Using FCM

In the proposed methodology, jobs are classified into categories using FCM. If a crisp clustering method were applied, it would be possible for some clusters to have very few examples. By contrast, in a fuzzy clustering method, an example belongs to multiple clusters to different degrees, which provides a solution to this problem. Similarly, in probability theory the naïve Bayes method provides the probability that a given item belongs to each class. However, the application of FCM can consider subjective issues in job classification.

FCM classifies jobs by minimizing the following objective function:

J = Σ_{k=1..K} Σ_{n=1..N} μ_nk^m e_nk²,

where K is the required number of categories; N is the number of jobs; μ_nk indicates the degree of membership with which job n belongs to category k; e_nk measures the distance from job n to the centroid of category k; and m is a parameter to adjust the fuzziness, usually set to 2. The procedure of FCM is as follows:

(1) produce a preliminary clustering result;
(2) (at iteration t) calculate the centroid of each category as

x̄_k(t) = Σ_n μ_nk(t)^m x_n / Σ_n μ_nk(t)^m,

where x̄_k(t) is the centroid of category k and μ_nk(t) is the degree of membership that job n holds in category k after the tth iteration;
(3) remeasure the distance from each job to the centroid of each category and then recalculate the corresponding memberships;
(4) stop if the following condition is met; otherwise, return to step (2):

max_{n,k} |μ_nk(t) - μ_nk(t-1)| < d,

where d is a real number representing the threshold for the convergence of membership.

Finally, the separate distance test (S test) proposed by Xie and Beni [31] can be applied to determine the optimal number of categories:

S = J_m / (N · min_{k1 ≠ k2} ||x̄_k1 - x̄_k2||²),

where J_m is the converged value of the objective function; the number of categories that minimizes S is chosen.

The Fuzzy Logic Toolbox of MATLAB is used to implement the FCM approach. The FCM program code for the illustrative example is shown in Algorithm 1. The S value can also be obtained using this program.

A = [0.300 0.485 0.900 0.367 0.367 0.162; …; 0.900 0.396 0.900 0.900 0.633 0.900];
K = 4;
[center, U, obj_fun] = fcm(A, K);
Jm = min(obj_fun);
e2_min = 9999;
for i = 1 : K
  for j = i + 1 : K
    e2_sum = 0;
    for p = 1 : size(center, 2)   % loop over all attribute dimensions
      e2_sum = e2_sum + (center(i, p) - center(j, p))^2;
    end
    if e2_sum < e2_min
      e2_min = e2_sum;
    end
  end
end
e2_min
S = Jm/(35*e2_min)   % 35 = number of jobs
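For readers without MATLAB's Fuzzy Logic Toolbox, the FCM iteration and the S test can be sketched in Python/NumPy. This is a minimal illustration of the procedure above, not the paper's implementation, and the data below are invented:

```python
import numpy as np

def fcm(X, K, m=2.0, eps=0.01, max_iter=100, seed=0):
    """Fuzzy c-means: returns centroids, membership matrix U, and objective Jm."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], K))
    U /= U.sum(axis=1, keepdims=True)                # each job's memberships sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]       # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                         # avoid division by zero
        inv = d ** (-2.0 / (m - 1))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # standard FCM membership update
        if np.abs(U_new - U).max() < eps:             # convergence on memberships
            U = U_new
            break
        U = U_new
    Jm = float(((U ** m) * d ** 2).sum())
    return centers, U, Jm

def xie_beni(X, centers, Jm):
    """Xie-Beni separation index S: smaller is better."""
    K = centers.shape[0]
    sep = min(np.sum((centers[i] - centers[j]) ** 2)
              for i in range(K) for j in range(i + 1, K))
    return Jm / (X.shape[0] * sep)

# Illustrative data: two well-separated groups of "jobs" in attribute space.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]])
centers, U, Jm = fcm(X, K=2)
```

With well-separated groups the memberships become nearly crisp and the S index is small, mirroring how the toolbox output is used in Table 5.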

The results of the S test are summarized in Table 5. In this case, the optimal number of job categories was 3. The threshold for the convergence of membership (d) was set to 0.01. A common practice is to set a threshold of membership to determine whether a job belongs to each category; the classification results for one such threshold are shown in Table 6. With each decrease in the threshold, each category contains more jobs; the classification results for a lower threshold are shown in Table 7. This can solve the problem of an insufficient number of examples. In addition, a job may be classified under more than one category, which makes FCM different from crisp classification methods. The disadvantage of this approach is that some jobs, the outliers, do not belong to any category. Another way is to assign each job to the category in which it has the highest degree of membership; the disadvantage of this approach is that some categories will have very few jobs.

The application of FCM raises the following issues:

(1) How should the membership threshold be determined? In theory, setting the threshold to 1/K ensures that each job can be classified under some category, because each job's memberships sum to one. Conversely, if the threshold is set to a large value, it is possible that some jobs cannot be classified under any category with certainty.
(2) Should the degree of membership be considered in the remaining cycle time estimation? If two jobs belong to the same category but have different degrees of membership, how can one guarantee that the BPN of this category has the same predictive power for both jobs? To address this issue, the degree of membership can be considered when estimating the remaining cycle time with the BPN.
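The two assignment strategies discussed above can be illustrated with a toy membership matrix (the values below are hypothetical, not from the case study):

```python
import numpy as np

def assign_by_threshold(U, mu_star):
    """Return, for each job, the categories whose membership is >= mu_star."""
    return [[int(k) for k in np.flatnonzero(row >= mu_star)] for row in U]

# Hypothetical membership matrix for three jobs and K = 3 categories;
# each row sums to 1.
U = np.array([[0.7, 0.2, 0.1],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])
K = U.shape[1]

# With a threshold of 1/K, every job lands in at least one category,
# because a row that sums to 1 must have a maximum of at least 1/K.
assert all(len(cats) >= 1 for cats in assign_by_threshold(U, 1.0 / K))

# A stricter threshold leaves some jobs (the outliers) unassigned:
print(assign_by_threshold(U, 0.5))  # → [[0], [], []]
```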

3.3. Step 3: Estimate the Remaining Cycle Time with a BPN

After clustering, a three-layer BPN is built for each category to estimate the remaining cycle times of the jobs in that category. Some of the jobs in the category are input as "training examples" to the BPN to determine the parameter values. The configuration of the three-layer BPN is as follows. First, the inputs are the attributes associated with the nth job; these have to be normalized before they are fed into the three-layer BPN. Subsequently, there is a single hidden layer with twice as many neurons as the input layer. Finally, the output from the three-layer BPN is the (normalized) remaining cycle time estimate of the example. The activation function used in each layer is the sigmoid function

f(z) = 1 / (1 + e^(-z)).

The procedure for determining the parameter values is now described. Two phases are involved at the training stage. First, in the forward phase, the inputs are multiplied by the weights, summed, and transferred to the hidden layer. The activated signals output from the hidden layer are

h_q = 1 / (1 + e^(-n_q^h)), where n_q^h = Σ_p w_pq^h x_np - θ_q^h.

The h_q values are then transferred to the output layer with the same procedure. Finally, the output of the BPN is generated as

o = 1 / (1 + e^(-n^o)), where n^o = Σ_q w_q^o h_q - θ^o.

Several algorithms are applicable for training a BPN in the backward phase, such as gradient descent algorithms, conjugate gradient algorithms, and the Levenberg-Marquardt algorithm. In this study, the Levenberg-Marquardt algorithm is applied. The Levenberg-Marquardt algorithm was designed for training with second-order speed without having to compute the Hessian matrix. It uses an approximation and updates the network parameters in a Newton-like way, as described in the following.
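The forward phase just described amounts to a few matrix operations. A Python sketch follows; the shapes are illustrative and the weights are not those of the trained network in the case study:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation used in every layer of the BPN."""
    return 1.0 / (1.0 + np.exp(-z))

def bpn_forward(x, Wh, theta_h, wo, theta_o):
    """Forward pass of a three-layer BPN for one job.

    x: (P,) normalized attributes; Wh: (P, Q) input-to-hidden weights;
    theta_h: (Q,) hidden-node thresholds; wo: (Q,) hidden-to-output weights;
    theta_o: output-node threshold. Returns the normalized estimate in (0, 1).
    """
    h = sigmoid(x @ Wh - theta_h)            # hidden-layer outputs h_q
    return float(sigmoid(h @ wo - theta_o))  # network output o
```

With all weights and thresholds zero, every hidden node outputs 0.5 and the network output is exactly 0.5, which is a convenient sanity check.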

The network parameters are placed in vector w. The network output for example n can be represented by o(x_n; w). The objective function of the BPN is to minimize the RMSE or, equivalently, the sum of squared errors (SSE):

SSE(w) = Σ_n (a_n - o(x_n; w))²,

where a_n is the (normalized) actual remaining cycle time of example n.

The Levenberg-Marquardt algorithm is an iterative procedure. At the beginning, the user specifies the initial values of the network parameters w. At each step, the parameter vector w is replaced by a new estimate w + δ. The network output becomes o(x_n; w + δ); it is approximated by its linearization as

o(x_n; w + δ) ≈ o(x_n; w) + g_n^T δ,

where g_n is the gradient vector of o(x_n; w) with respect to w. When the network reaches the optimal solution, the gradient of the SSE with respect to δ will be zero. Substituting the linearization into the SSE, taking the derivative with respect to δ, and setting the result to zero gives

(J^T J + μI) δ = J^T e,

where J is the Jacobian matrix containing the first derivatives of the network errors with respect to the weights and biases, e is the vector of network errors, and μ is a damping factor. This is a set of linear equations that can be solved for δ.
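The damped update can be illustrated on a toy one-parameter model. This is a sketch of the Levenberg-Marquardt idea, not the Neural Network Toolbox implementation; the model y = exp(w·x) is assumed purely for illustration:

```python
import numpy as np

def lm_fit(x, y, w0=0.0, mu=1e-3, iters=50):
    """Levenberg-Marquardt on a one-parameter model y ≈ exp(w * x)."""
    w = w0
    for _ in range(iters):
        f = np.exp(w * x)                  # model output
        e = y - f                          # residual vector
        J = (x * f)[:, None]               # Jacobian df/dw, shape (N, 1)
        A = J.T @ J + mu * np.eye(1)       # damped normal-equation matrix
        delta = np.linalg.solve(A, J.T @ e).item()
        if np.sum((y - np.exp((w + delta) * x)) ** 2) < np.sum(e ** 2):
            w, mu = w + delta, mu * 0.5    # accept the step, relax damping
        else:
            mu *= 10.0                     # reject the step, increase damping
    return w

x = np.linspace(0.0, 2.0, 20)
y = np.exp(0.5 * x)                        # synthetic data with true w = 0.5
w_hat = lm_fit(x, y)
```

Large μ makes the step behave like damped gradient descent; small μ makes it behave like Gauss-Newton, which is what gives the method its near-second-order convergence without computing the Hessian.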

Finally, the three-layer BPN can be applied to estimate the remaining cycle time of a job. When a new job is released into the factory, the parameters associated with the new job are recorded. Then the new job is classified into a category, and the three-layer BPN of the category can be applied to estimate the remaining cycle time of the new job.

Consider category 1 of the previous example. With the chosen membership threshold, there are 16 jobs in this category. These jobs are split into two parts: the training data (the first 12 jobs) and the testing data (the remaining 4 jobs). A three-layer BPN estimates the remaining cycle time of jobs in this category according to their six attributes with the following settings:

(1) a single hidden layer;
(2) the number of neurons in the hidden layer: 12;
(3) convergence criterion: mean squared error (MSE) < 10^-4.

The Neural Network Toolbox of MATLAB is used to implement the BPN approach. The BPN program code for the illustrative example is shown in Algorithm 2. The estimation results are shown in Figure 3.

tn_input = [0.500 0.300 0.500 …; 0.396 0.574 0.811 …; 0.900 0.900 0.900 …;
  0.722 0.811 0.900 …; 0.700 0.567 0.633 …; 0.279 0.314 0.314 …];
tn_target = [0.341 0.383 0.598 … 0.100];
net = newff([0 1; 0 1; 0 1; 0 1; 0 1; 0 1], [12, 1], {'logsig', 'logsig'}, 'trainlm');
net = init(net);
net.trainParam.show = 10;
net.trainParam.lr = 0.1;
net.trainParam.epochs = 1000;
net.trainParam.goal = 1e-4;
[net, tr] = train(net, tn_input, tn_target);
tn_output = sim(net, tn_input)
te_input = [0.300 0.500 0.300 …; 0.307 0.396 0.870 …; 0.900 0.900 0.900 …;
  0.722 0.811 0.900 …; 0.767 0.700 0.900 …; 0.158 0.264 0.302 …];
te_output = sim(net, te_input)

The following indexes evaluate the estimation accuracy for both the training and testing data:

(1) mean absolute error (MAE) = 43 (hrs);
(2) mean absolute percentage error (MAPE) = 3.7%;
(3) root mean squared error (RMSE) = 62 (hrs).

In contrast, if the remaining cycle times of these jobs are predicted together with those of the other, unclassified jobs, then the estimation performance is

(1) MAE = 47 (hrs);
(2) MAPE = 4.1%;
(3) RMSE = 70 (hrs),

which is much poorer. Table 8 compares the performance levels of the gradient descent algorithm and the Levenberg-Marquardt algorithm.
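These accuracy indexes are standard and can be computed as follows (illustrative Python; the hour values above come from the paper's experiment, not from this snippet):

```python
import numpy as np

def mae(actual, pred):
    """Mean absolute error."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs(a - p)))

def mape(actual, pred):
    """Mean absolute percentage error (as a fraction)."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs(a - p) / a))

def rmse(actual, pred):
    """Root mean squared error."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))
```

For example, with actual cycle times [100, 200] and estimates [90, 210], MAE and RMSE are both 10 and MAPE is 7.5%.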

How does the accuracy of remaining cycle time estimation affect the performance of the dispatching rule? To answer this question, we add a noise term to the remaining cycle time before feeding it to the dispatching rule:

RCT̃_nj = RCT_nj + ε,

where ε follows a normal distribution. Forty replications of a simulation experiment were run with FSMCT; in each, the estimated remaining cycle time was compared with the actual value to evaluate the estimation accuracy. The scheduling performance and the estimation accuracy were evaluated with the average cycle time and the RMSE, respectively, and their relationship is shown in Figure 4. Obviously, the more accurately the remaining cycle time can be estimated, the better the schedule will perform. Therefore, the efforts made in this paper to improve the accuracy of remaining cycle time estimation make sense.

3.4. The Slack-Diversifying Fuzzy-Neural Rule

In traditional fluctuation smoothing (FS), there are two different formulations with two separate strengths [11]. One is aimed at minimizing the mean cycle time with FSMCT, in which the slack of job n at step j is

SK_nj = n/λ - RCT_nj.

The other is aimed at minimizing the variance of the cycle time with FSVCT:

SK_nj = R_n - RCT_nj.

Jobs with the smallest slack values are given the highest priority. These two rules and their variants have been proven to be very effective in shortening the cycle times of wafer fabrication factories [11, 14, 24, 27]. In the traditional FSMCT rule, n/λ might be much greater than RCT_nj; as a result, the slack of a job is determined solely by n/λ. To tackle this problem, both terms in the FSMCT rule are normalized so that each ranges from 0 to 1:

SK_nj = N(n/λ) - N(RCT_nj),

where N(·) denotes min-max normalization over all jobs. Subsequently, to improve the responsiveness of the FSMCT rule, the division operator is applied instead of the traditional subtraction operator:

SK_nj = N(n/λ) / N(RCT_nj).

However, this rule cannot be tailored to the wafer fabrication factory that is to be scheduled. To address this problem, the transition from the traditional FSMCT rule to its nonlinear form is analyzed: both the nonlinear form and the linear form can be rewritten in a common parametric form, and the two formulas can then be generalized by introducing adjustable factors.
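The two base rules can be sketched as follows, using the commonly published forms of the FSMCT and FSVCT slacks [11]; the jobs and the release rate below are hypothetical:

```python
def fsmct_slack(n, lam, rct):
    """FSMCT slack: job sequence number over the mean release rate, minus
    the estimated remaining cycle time."""
    return n / lam - rct

def fsvct_slack(release_time, rct):
    """FSVCT slack: release time minus the estimated remaining cycle time."""
    return release_time - rct

# Dispatch: the job with the smallest slack gets the highest priority.
jobs = [
    {"id": "A", "n": 1, "release": 0.0, "rct": 300.0},
    {"id": "B", "n": 2, "release": 50.0, "rct": 500.0},
    {"id": "C", "n": 3, "release": 80.0, "rct": 200.0},
]
lam = 0.01  # mean release rate in jobs per hour (illustrative)
order = sorted(jobs, key=lambda j: fsmct_slack(j["n"], lam, j["rct"]))
print([j["id"] for j in order])  # → ['B', 'A', 'C']
```

Note how job B, with the largest remaining cycle time relative to its position, gets the smallest slack and therefore the highest priority.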

Wang et al. [13] derived the slack-diversifying nonlinear fluctuation smoothing rule by diversifying the slack in the 1f-TNFSVCT rule. In this study, we instead diversify the slack in the 2f-TNFSMCT rule. There are many possible models for combining the two adjustable factors. However, the resulting expression is difficult to deal with, so the following polynomial fitting technique is used to convert it into a more tractable form. The approximation error of the fitted polynomial is less than 9% in the region of interest; this will not be a serious problem, since it is the factor values associated with the maximum that are to be found, not the maximum value itself. The polynomial fitting technique is especially effective when the factor exceeds 1.

Applying the polynomial approximation to the diversified slack gives a tractable expression. To diversify the slack, the standard deviation of the slack is maximized, which is equivalent to maximizing the corresponding variance term. By taking the derivatives of this term with respect to the two adjustable factors and setting them equal to zero, candidate optima can be obtained. In addition, the two factors must follow some model, for example, a linear or a nonlinear relationship; if the nonlinear model is satisfied, the candidate optima follow accordingly. The general solutions are too long to be presented here. Further, to guarantee a maximum, the second-order derivative must be negative. It is also possible that some jobs have very large or very small slacks, either of which will distort the calculation of the standard deviation; for this reason, such jobs are excluded. Finally, the values of the two factors influence the range of the slack, so for a fair comparison the standard deviation should be divided by the range of the slack to determine the optimal factor values.
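The idea of choosing factor values that maximize the range-normalized standard deviation of the slack can be illustrated with a brute-force grid search. This is a sketch only: the slack model s = x^a / y^b below is a hypothetical stand-in for the 2f-TNFSMCT slack, whose exact form is given in the paper, and the paper itself uses polynomial fitting and closed-form derivatives rather than a grid search:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.1, 0.9, 30)      # e.g., normalized release-related terms
y = rng.uniform(0.1, 0.9, 30)      # e.g., normalized remaining cycle times

def spread(a, b):
    """Standard deviation of the trimmed slack, normalized by its range."""
    s = (x ** a) / (y ** b)        # hypothetical parametric slack
    lo, hi = np.percentile(s, [5, 95])
    s = s[(s >= lo) & (s <= hi)]   # exclude extreme slacks (outliers)
    r = s.max() - s.min()
    return float(s.std() / r) if r > 0 else 0.0

grid = np.linspace(0.1, 2.0, 20)
best = max(((a, b) for a in grid for b in grid), key=lambda ab: spread(*ab))
```

The trimming step mirrors the exclusion of jobs with extreme slacks, and dividing by the range mirrors the fair-comparison adjustment described above.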

An example is given in Table 9 to illustrate the procedure described above. Assume that the nonlinear model is used for the relationship between the two adjustable factors. The optimal solution includes a factor value of 0.53, with the maximum equal to 9.21. In Figure 5 the results of the proposed methodology are compared with those of Wang et al.'s method. Obviously, the proposed methodology performed better at diversifying the slacks of jobs, which reduces the risk of mis-scheduling and promotes scheduling performance.

4. Numerical Simulation and Results

To evaluate the effectiveness of the slack-diversifying fuzzy-neural rule for job dispatching in a wafer fabrication factory, simulated data were used to avoid disturbing the regular operations of the factory. Simulation is a widely used technology for assessing the effectiveness of a scheduling policy, especially when the proposed policy and the current practice are very different, and the actual production environment is dedicated to making real products and is not available for algorithm testing. Therefore, a real wafer fabrication factory located in Taichung Scientific Park, Taiwan, with a monthly capacity of about 25,000 wafers was simulated. That factory's real-time scheduling systems input information very rapidly into its production management information system (PROMIS). The simulation program has been validated and verified by comparing the actual cycle times with the simulated values and by analyzing the trace report, respectively. The wafer fabrication factory produces more than 10 types of memory products and has more than 500 workstations for performing single-wafer or batch operations using 58 nm ~ 110 nm technologies. Each job released into the factory is assigned one of three priorities, that is, "normal," "hot," and "super hot." Jobs with the highest priorities are processed first. The large scale of operations with reentrant process flows makes job dispatching in the wafer fabrication factory a very tough task. Currently, the longest average cycle time exceeds three months with a variation of more than 300 hours. The factory managers seek better dispatching rules to replace first-in first-out (FIFO) and EDD, in order to shorten the average cycle times and ensure on-time deliveries to customers. One hundred replications of the simulation were successively run. Each replication took about 30 minutes on a PC with an Intel Dual CPU E2200 at 2.2 GHz and 1.99 GB of RAM.
A horizon of twenty-four months was simulated.

To assess the effectiveness of the proposed methodology and to make comparisons with some existing approaches—FIFO, EDD, shortest remaining processing time (SRPT), CR, FSVCT, FSMCT, and a nonlinear fluctuation smoothing rule (NFS) [4]—all of these methods were applied to schedule the simulated wafer fabrication factory with the data of 1000 jobs; the collected data were then separated by product types and priorities. That is about the amount of work that can be achieved with 100% of the monthly capacity. Some cases produced insufficient data, so the present paper does not discuss those cases.

The due date of each job was determined as follows. The FCM-BPN approach was applied to estimate the cycle time, and the Levenberg-Marquardt algorithm, rather than the gradient descent algorithm, was applied to speed up the network convergence. Then, a constant allowance of three days (72 hours) was added to the estimated cycle time to determine the internal due date:

(internal due date of job n) = R_n + (estimated cycle time of job n) + 72 (hrs).
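This due date assignment is a one-line computation; a small sketch (times in hours, illustrative values):

```python
def internal_due_date(release_time, est_cycle_time, allowance=72.0):
    """Internal due date = release time + estimated cycle time + allowance (hrs).

    The three-day (72-hour) allowance follows the text; the estimated cycle
    time would come from the FCM-BPN estimator in practice.
    """
    return release_time + est_cycle_time + allowance

# A job released at hour 0 with an estimated cycle time of 1200 hours:
print(internal_due_date(0.0, 1200.0))  # → 1272.0
```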

Jobs with the highest priorities are usually processed first. In FIFO, jobs were sequenced on each machine first by their priorities and then by their arrival times at the machine. In EDD, jobs were sequenced first by their priorities and then by their due dates. With SRPT, the remaining processing time of each job was calculated, and jobs were sequenced first by their priorities and then by their remaining processing times. With CR, jobs were sequenced first by their priorities and then by their critical ratios:

CR_n = (due date of job n - current time) / (remaining processing time of job n).

FSVCT and FSMCT consisted of two stages. The first stage scheduled jobs by FIFO, and the remaining cycle times of all jobs were recorded and averaged at each step. The second stage applied the FSVCT or FSMCT policy to schedule the jobs based on the average remaining cycle times obtained earlier. In other words, jobs were sequenced on each machine first by their priorities and then by their slack values, which were determined by (18) and (19). In NFS, the slack of a job was defined as a nonlinear function of the same terms with an adjustable factor; after a preliminary experiment, this factor was set to 0.8. In the proposed methodology, the nonlinear model was used.
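The priority-then-CR sequencing can be sketched as follows. The jobs are hypothetical, and encoding a hotter job as a smaller priority number is an assumption made for this illustration:

```python
def critical_ratio(due, now, remaining_processing_time):
    """Commonly used CR: time remaining until the due date divided by the
    remaining processing time; smaller values are more urgent."""
    return (due - now) / remaining_processing_time

# Sequence first by priority (0 = "super hot", assumed encoding), then by CR.
jobs = [
    {"id": "A", "priority": 0, "due": 500.0, "rpt": 100.0},
    {"id": "B", "priority": 2, "due": 400.0, "rpt": 200.0},
    {"id": "C", "priority": 2, "due": 900.0, "rpt": 150.0},
]
now = 100.0
order = sorted(jobs, key=lambda j: (j["priority"],
                                    critical_ratio(j["due"], now, j["rpt"])))
print([j["id"] for j in order])  # → ['A', 'B', 'C']
```

Job A goes first on priority alone; among the two normal-priority jobs, B's smaller critical ratio (1.5 versus about 5.3) puts it ahead of C.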

Subsequently, the average cycle time and the cycle time standard deviation of all cases were calculated to assess the scheduling performance. Comparisons for the average cycle time used the FSMCT policy as the baseline; comparisons for the cycle time standard deviation used FSVCT. The results are summarized in Tables 10 and 11. Both products A and B are dynamic random access memory products; the main difference between the two products is their storage capacities.

(1) For the average cycle time, the proposed methodology outperformed the baseline approach, the FSMCT policy. The average advantage was about 17%.

(2) The proposed methodology also surpassed the FSVCT policy in reducing the cycle time standard deviation. The largest advantage was 74%.

(3) As expected, SRPT performed well in reducing the average cycle time, especially for product types with short cycle times (e.g., product A), but it sometimes gave an exceedingly poor cycle time standard deviation. If the cycle time is long, the remaining cycle time is much longer than the remaining processing time, which makes SRPT ineffective. SRPT is similar to FSMCT in that both try to make all jobs equally early or late.

(4) The performance of EDD was satisfactory for product types with short cycle times. If the cycle time is long, a job is more likely to deviate from its prescribed internal due date, which makes EDD ineffective. This becomes serious if the product type accounts for a high percentage of the product mix (e.g., product type A). CR has similar problems.

(5) The Wilcoxon signed-rank test [39], a commonly used nonparametric statistical hypothesis test for comparing two related samples or repeated measurements on a single sample, was used in this study to assess whether the population means differed. The results are summarized in Table 12. The null hypothesis was rejected at the chosen significance level, which showed that the slack-diversifying fluctuation-smoothing rule was statistically superior to four existing approaches in reducing the average cycle time. With regard to the cycle time standard deviation, the advantage of the slack-diversifying fluctuation-smoothing rule over SRPT and FSVCT was significant.

(6) The slack-diversifying 2f-TNFSMCT rule was also compared with the traditional rule without slack diversification. Taking product type A with normal priority as an example, the comparison results are shown in Figure 6. The slack-diversifying rules clearly dominated most of the traditional rules without slack diversification. These results indicate that the treatments carried out in this study did indeed improve on the performance levels of the traditional policies.

(7) The proposed methodology can be decomposed into two parts. In the first part, the remaining cycle time of a new job is estimated using the FCM-BPN, based on the historical data retrieved from PROMIS. In the experiment, it took less than 2 minutes to run all MATLAB programs and generate the remaining cycle time estimate. In contrast, Chen and Wang's method used the gradient search algorithm and took about 15 minutes to train the BPN. In the second part, the slacks of all jobs are calculated according to the slack-diversifying 2f-TNFSMCT rule, which can be done almost instantly.
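The paired comparison in item (5) can be illustrated with the Wilcoxon signed-rank statistic. The sketch below computes only the test statistic $W = \min(W^+, W^-)$ in pure Python (a library such as `scipy.stats.wilcoxon` would normally also supply the p-value); the cycle-time samples are invented for illustration and are not the paper's data.

```python
def wilcoxon_statistic(x, y):
    """Return W = min(W+, W-) for paired samples x, y (zero differences dropped)."""
    diffs = [b - a for a, b in zip(x, y) if b != a]
    # Rank the absolute differences, assigning average ranks to ties.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied 1-based rank positions
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus)

# Hypothetical cycle times (hours) under a baseline rule and the proposed rule.
baseline = [1310, 1295, 1402, 1350, 1288]
proposed = [1100, 1120, 1190, 1155, 1102]
print(wilcoxon_statistic(baseline, proposed))  # 0.0 (proposed lower in every pair)
```

A small statistic (here 0, since the proposed rule wins every pairing) leads to rejecting the null hypothesis of equal performance.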

The slack-diversifying 2f-TNFSMCT rule has also been applied to schedule a real wafer fabrication factory, which supports its effectiveness in practice. However, it has not been possible to experiment with a large assortment of scheduling policies in that factory. In the past, FIFO and EDD have been applied there, and their performance statistics have been analyzed. For these reasons, the performance of the slack-diversifying 2f-TNFSMCT rule was compared with those of FIFO and EDD, considering the case in which the majority of the factory's capacity is occupied. Figures 7 and 8 compare the performance levels of the three approaches. With regard to the average cycle time, the slack-diversifying 2f-TNFSMCT rule surpassed both of the older rules. Its performance in reducing the cycle time standard deviation was also comparable to that of FIFO. Both results support the practicability of the proposed slack-diversifying 2f-TNFSMCT rule.

5. Conclusions and Directions for Future Research

Optimizing job dispatching in a wafer fabrication factory is a challenging but important task. This study innovates by proposing a slack-diversifying fuzzy-neural rule that optimizes both the average cycle time and the cycle time standard deviation for job dispatching in a wafer fabrication factory.

The proposed methodology applies soft computing techniques such as fuzzy classification and artificial neural network prediction. An effective fuzzy-neural approach is applied to estimate the remaining cycle time of a job, which is empirically shown to improve scheduling performance. The slack-diversifying fuzzy-neural rule is a modification of the 2f-TNFSMCT rule.

After a simulation study, we observed the following phenomena:

(1) by improving the accuracy of remaining cycle time estimation, the performance of a scheduling rule can indeed be strengthened;

(2) optimizing the adjustable factors in the 2f-TNFSMCT rule appears to be an appropriate way to enhance the scheduling performance of the rule.

However, a complete assessment of the effectiveness and efficiency of the proposed methodology requires actual application in a real-world wafer fabrication factory. Future studies can optimize other rules in the same way, taking into account the great uncertainty inherent in wafer fabrication systems [40] and possibly using a more effective data mining approach [41].

Acknowledgment

This work was supported by the National Science Council of Taiwan.