#### Abstract

There is a growing concern that business enterprises focus primarily on their economic activities and ignore the impact of these activities on the environment and society. This paper investigates a novel sustainable inventory-allocation planning model with carbon emissions and defective item disposal over multiple periods under a fuzzy random environment. In this paper, a carbon credit price and a carbon cap are proposed to demonstrate the effect of carbon emission costs on the inventory-allocation network costs. The percentage of poor quality products from manufacturers that need to be rejected is assumed to be fuzzy random. Because of the complexity of the model, dynamic programming-based particle swarm optimization with multiple social learning structures, a DP-based GLNPSO, and a fuzzy random simulation are proposed to solve the model. A case is then given to demonstrate the efficiency and effectiveness of the proposed model and the DP-based GLNPSO algorithm. The results show that total costs across the inventory-allocation network varied with changes in the carbon cap and that carbon emission reductions could be utilized to gain greater profits.

#### 1. Introduction

The need for environmental awareness has affected several aspects of the global economy such as supply chain management. Traditionally, supply chain network design problems have tended to be analyzed from a fixed and variable cost perspective without any consideration of the carbon footprint factor [1, 2]. However, the focus has now shifted to more environmentally conscious supply chain planning optimization models in which economic aspects (profit maximization and cost minimization) are integrated with clear environmental goals such as carbon footprint reductions [3–5]. There has been an increasing research interest in sustainable supply chain network design, with most studies suggesting that environmental sustainability be viewed as an opportunity rather than a risk [6–8]. Recently, many companies have realized that sustainability is a bottom-line requirement and therefore can no longer be ignored. Despite all these studies, there is still an urgent requirement to develop quantitative models that address these sustainability issues.

Current global efforts to minimize environmental impacts have encouraged companies to change their practices to increase efficiency and reduce negative externalities [9], which has led to a greater focus on sustainable practices such as recycling and waste management [10, 11]. Shaw et al. [12] designed a sustainable location-allocation model that considered consumers’ environmental behavior, which affected consumer demand for low-carbon products. Torabi et al. [13] proposed a generic model for a sustainable wine manufacturer-distribution network that encompassed economic, environmental, and social objectives. Diabat and Al-Salem [14] developed a nonlinear mixed integer program that minimized the cost of a stochastic inventory-allocation network and included a carbon emissions cost term to account for environmental concerns. Their model introduced the concept of an emission cap, under which a company pays for any carbon emissions that exceed the cap. Sustainable supply chains can be achieved by developing a supply chain that either incorporates environmental concerns or incorporates reverse logistics such as recycling. The most notable international framework for minimizing greenhouse gas (GHG) emissions was the Kyoto Protocol, an international agreement adopted under the United Nations Framework Convention on Climate Change, in which emissions trading schemes and a carbon credit market were outlined so that countries that had not exceeded their nominated carbon emissions targets could sell the excess to other countries, thereby giving GHGs the status of an international commodity [15].

The uncertain competitive environment means that inventory-allocation management needs to be more flexible and efficient as enterprises must not only reduce their cost of storage and distribution, but also ensure the downstream supply chain retailers are not unduly affected because of out of stock items at a critical time. Without proper inventory control, a retailer’s loss can directly affect the interests across the whole supply chain; therefore, supply chain inventory and distribution management has become an important element of supply chain efficiency in the past few years [16]. Time has also become a very important factor when managing products, especially when a number of time-periods are involved. Pal et al. [17] proposed a model to determine order quantities between suppliers at the initial stage and the optimal inventory levels over multiple periods for all stages in the inventory-allocation network. Radhi and Zhang [18] extended multiobjective nonlinear mixed integer models for multiperiod allocation planning problems that involved multiple suppliers and multiple products. In addition, because production systems are not perfect, defective products randomly appear, the production of which follows a probability distribution. Kennedy and Eberhart [19] considered a single-vendor, single-buyer inventory model that considered the impact of varying percentages for defective goods, storage costs, and disposal schemes. There has also been significant research interest in different aspects of imperfect production inventory models [20]. However, as most of these studies have tended to focus on economic order quantities [21] or economic production quantities [22], defective item disposal has not been applied across the whole inventory-allocation network. Further, in most studies, the defective item rate has been assumed to be a constant [23], which does not accurately reflect reality. 
Product defect rates are characterized by both fuzzy uncertainty and randomness, or the so-called twofold uncertainty. Therefore, an inventory-allocation management dynamic programming model with a fuzzy random defect rate and fuzzy annual demand is proposed in this paper.

In recent years, the higher levels of uncertainty within inventory-allocation management have been shown to be extremely costly for manufacturers and the total supply chain [24, 25]; therefore, inventory-allocation models that can reduce or eliminate uncertainty to avoid incorrect and costly decisions are needed. As a general theoretical framework to model practical problems with unknown parameters, uncertain random programming was introduced by Lin [26], which was then extended to uncertain random multiobjective programming [27] and uncertain random multilevel programming [28]. More recent studies on the application of fuzzy set theory to inventory-allocation problems can be found in [29, 30].

In this paper, an inventory-allocation planning model with carbon emissions and defective item disposal under a fuzzy random environment is considered, with annual demand, transportation costs, inventory conversion factors, and product defect percentages being fuzzy random variables. Of the many heuristic and metaheuristic algorithms, global best, local and near neighbor best particle swarm optimization (GLNPSO) [31] has been proven to be a powerful competitor in the field of nondeterministic polynomial-time- (NP-) hard problem optimization. Because of the relationships between the state equation, the constraint conditions, and the objective functions, a dynamic programming-based GLNPSO (DP-based GLNPSO) algorithm was developed [32, 33] which reduced the particle dimensions using the state equation. In this paper, a DP-based GLNPSO algorithm is developed to solve the research problem model, in which initialization and adjustment methods are developed to avoid infeasible solutions.

The main contributions of this paper are as follows. A sustainable inventory-allocation model with carbon emissions and defective item disposal is developed, for which several constraints are considered to make the model more applicable to reality. Then, a modified version of the particle swarm optimization algorithm called the DP-based GLNPSO is constructed to solve the developed model. Finally, a representative example is applied to tune the parameters of the DP-based GLNPSO. The remainder of this paper is organized as follows. The problem statement for the inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) under an uncertain environment is introduced in Section 2. In Section 3, the suggested model and its formulations are described. Section 4 describes the development of the DP-based GLNPSO to solve the IAPCEDID, and the efficiency of the proposed model is illustrated by a representative example in Section 5. Finally, in Section 6, the conclusions and limitations are discussed and future research directions elaborated.

#### 2. Key Problem Statement

In the supply chain, inventory-allocation management with effective quality and carbon emissions controls is essential for an efficient manufacturer-retailer network. As retailers order products from different manufacturers at specified time-periods, the planning horizon spans multiple stages, with replenishment taking place at the beginning of each stage [34, 35]. Under government regulations on carbon emissions (a carbon cap), transport-related carbon emissions need to be kept below a certain level. This sustainable manufacturer-retailer network is based on the allocation of carbon units in line with established carbon emissions reduction targets. At the end of each period, the emissions values of the company are verified, and each emitter must then offset its carbon emissions against the target established by the government. The discrepancy between the imposed target and the actual emissions may be offset by the company purchasing carbon units in the domestic market [36]. Alternatively, for each ton of CO_{2} emissions avoided, the company receives a carbon emissions certificate that can be sold on the futures market.

In many inventory-allocation problems, all products are deemed to be of suitable quality; however, in the real world, there is a probability that some items will be defective, the percentage of which is uncertain. Items are classified as being of suitable quality or as being defective, with all defective items found during the screening process being returned to the manufacturer. For the sake of convenience, the manufacturers take back the defective items as a batch in the next shipment [28, 37, 38]. However, if there are defective items, shortages may be difficult to avoid. Therefore, a penalty cost is considered to reduce the losses because of possible shortages. Due to the uncertain constraints, manufacturers are not able to produce items at more than a specified value and must also provide the products to the retailers under all-unit and incremental quantity discount policies [39]. From the above, a sustainable inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) is considered based on dynamic programming under a fuzzy random environment [40, 41]. The flow of items in the proposed supply chain network is shown in Figure 1. The proposed IAPCEDID problem can be described as follows. There are manufacturers, warehouses, and retailers. The manager purchases the required items from specific manufacturers at the beginning of each stage. On receipt from the manufacturer, items are classified as suitable or defective. Defective items are returned to the manufacturer while suitable items are transported to the corresponding warehouses and allocations made according to retailer demands.

There are the following assumptions in this study: (1) Item demand, transportation costs, and the percentage of defective items in each stage are regarded as fuzzy random variables. (2) The span of each stage is identical. (3) Shortages are allowed and a penalty cost is applied to reduce losses because of shortages. (4) The manufacturer is liable for the costs incurred for returned defective items. (5) Every item type has a corresponding warehouse with a maximum storage capacity. The items are first transported from the manufacturers to the warehouses and then from the warehouses to the retailer stores. (6) The order lead time is negligible. At the beginning of each stage, all purchased items arrive at the corresponding warehouses. (7) The retailer product demands are independent of one another and are fixed in a stage.

#### 3. Modeling

In this section, a dynamic programming model for the IAPCEDID that considers fuzziness and randomness is constructed.

##### 3.1. Notations

The following notations are adopted.

*Indices* : stage index; . : item index; . : retailers index; . : an index for the price break points; .

*Fuzzy Random Variables* : the unit transport price of Item per kilometer in the th stage. : the demand for Item in the th stage at retailer . : the demand for Item in the th stage. : the conversion factor of Item in the th stage. : the fraction defective of Item in the th stage.

*Decision Variables* : the inventory level of Item in the warehouse at the beginning of the th stage. : the purchase quantity of Item in the th stage.

*Parameters* : the maximum purchase quantity of Item in the th stage. : the minimum purchase quantity of Item in the th stage. : the maximum inventory level of Item in each stage. : the initial inventory level of Item in the warehouse at the beginning of the first stage. : the terminal inventory level of Item in the warehouse at the end of the whole duration. : the unit storage cost of Item in each stage. : the function of current inventory for Item in the whole process. : the distance between manufacturer and the corresponding warehouse. : the distance between warehouses and store . : the inspection price of Item . : the return price of defective Item . : the stock out penalty price of defective Item . : the unit cost of the item from manufacturer at th price break point. : the th price break point for the item in the th stage. : total purchase budget of the retailer for the planning horizon. : fuel consumption per kilometer for transportation vehicle. : CO_{2} emission for unit gasoline fuel for transportation vehicle. : carbon cap over the network. : carbon credit price per ton.

##### 3.2. Objective Functions

The objective function defines the total cost of the complete manufacturer-retailer network. The aim of the project manager is to determine the order quantity and inventory level for each item in each stage so that total manufacturer-retailer network costs are minimized. The total costs are made up of purchasing costs, transport costs, inventory costs, penalty costs, and carbon emissions costs.

In the proposed inventory-allocation model, the retailer orders products under several discount policies. In this paper, an incremental quantity discount is considered, for which the products are delivered in known packets containing a certain number of items. In the incremental quantity discount policy, the purchase cost of Item in the th stage depends on the ordered quantity. Each price discount-point is obtained by

Therefore, the purchase cost under this policy is

Let be the unit inventory cost of Item . However, as not all items are stored in the warehouse over the whole stage, the actual inventory cost is less than . To deal with this, an inventory conversion factor is introduced to balance the difference between the actual inventory quantity and in the th stage. is the function for the current inventory for Item across the whole manufacturer-retailer network, in which a unit of Item is one stage and ; therefore, the inventory conversion factor can be defined as follows:

Let be the inspection fee for Item . Before being transported to the warehouse, each item is inspected, after which all defective items are returned to the manufacturer and a return price is requested. As the purchase quantity is , the total inspection fee is naturally . Let be the percentage of defective Item in the th stage. Let be the total inventory price, so

As the transportation distances between the manufacturers, warehouses, and retail stores are all different, the transportation vehicles are also different, making total transportation costs difficult to determine. Let be the transportation price of Item per kilometer, be the transportation distance between the manufacturers and the corresponding warehouses, and be the transportation distance between the warehouses and the retail stores. Hence, is the transportation quantity for Item from the manufacturer to the corresponding warehouse in the th stage, and is the demand at each retail store in the th stage; therefore, is the total transportation costs for Item over the manufacturer-retailer network as follows:
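The incremental quantity discount described above can be illustrated with a minimal sketch. It assumes that each unit price applies only to the units that fall inside its own price interval; the function name, break points, and unit prices are hypothetical and are not taken from the case data.

```python
def incremental_discount_cost(quantity, break_points, unit_prices):
    """Purchase cost under an incremental quantity discount.

    break_points: ascending lower bounds of the price intervals, starting at 0.
    unit_prices:  unit price charged for the units inside each interval.
    All names and values here are illustrative assumptions.
    """
    if len(break_points) != len(unit_prices) or break_points[0] != 0:
        raise ValueError("need one price per interval and a first break point of 0")
    cost = 0.0
    for k, price in enumerate(unit_prices):
        lower = break_points[k]
        # The last interval has no upper bound.
        upper = break_points[k + 1] if k + 1 < len(break_points) else float("inf")
        # Only the units that fall inside this interval pay this price.
        units_here = max(0.0, min(quantity, upper) - lower)
        cost += units_here * price
    return cost


# Hypothetical schedule: 10 per unit up to 100 units, 9 up to 200, 8 beyond.
example_cost = incremental_discount_cost(250, [0, 100, 200], [10.0, 9.0, 8.0])
```

Under an all-unit discount, by contrast, the lowest applicable price would be charged on every unit ordered.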

A penalty cost is applied when the demand for Item cannot be met. Let be the penalty if the demand for Item cannot be met in the th stage. Let be the penalty cost for Item , which can be determined as follows:

The carbon emissions costs are the penalties/rewards in a carbon constrained scenario. These two terms represent the transport emissions from the manufacturers to the warehouses and from the warehouses to the retail stores. Let be the fuel consumption for a transportation vehicle and be the CO_{2} emissions from the gasoline; therefore, the vehicle’s carbon emissions per kilometer are .

Let be the carbon cap during transport and a similar carbon price () be considered for the purchase as well as the sale of carbon credits [7]; therefore, is the carbon emissions cost for Item over the complete manufacturer-retailer network, as follows:
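As a rough illustration of the cap-and-trade logic, the sketch below treats the carbon cost as the credit price multiplied by the gap between actual emissions and the cap, so that a negative value represents revenue from selling unused credits. The fuel consumption, emission factor, and credit price are the values reported later in the case study; the function names, the travelled distance, and the cap values are assumptions made purely for illustration.

```python
def transport_emissions_tons(distance_km, fuel_l_per_km, co2_kg_per_l):
    """Tons of CO2 emitted over a given transport distance."""
    return distance_km * fuel_l_per_km * co2_kg_per_l / 1000.0


def carbon_cost(emissions_tons, cap_tons, price_per_ton):
    """Positive: credits must be bought; negative: surplus credits are sold."""
    return price_per_ton * (emissions_tons - cap_tons)


# 0.245 l/km, 2.63 kg/l, and 189.29 CNY/ton come from the case study;
# the 100,000 km distance and the caps (in tons) are made-up values.
emissions = transport_emissions_tons(100_000, 0.245, 2.63)
costs_by_cap = {cap: carbon_cost(emissions, cap, 189.29) for cap in (50, 60, 70, 80)}
```

With these inputs the cost falls by the same amount for every additional ton of cap and changes sign once the cap exceeds the actual emissions, which is the linear behavior expected from a single buy/sell credit price.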

As it is very difficult to deal with objective functions that have fuzzy random factors, Khan et al. [42] developed a method to convert fuzzy random variables in both the objective function and the constraints into fuzzy variables similar to trapezoidal fuzzy numbers. Based on the theory proposed by Heilpern [43], without loss of generality, the expected value operator is used to convert the uncertain model into a deterministic model, which can then be used to transform the fuzzy random objective functions and constraints into crisp equivalents.
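To make the defuzzification step concrete, the following sketch computes the expected value of a triangular fuzzy number as the midpoint of its expected interval, and checks it against a small simulation in which the peak of the triangle is normally distributed. This is a simplified stand-in for the conversion used in the paper, not a reproduction of it; the function names, the spreads, and the distribution parameters are hypothetical.

```python
import random


def triangular_expected_value(left, peak, right):
    """Expected value of a triangular fuzzy number (left, peak, right),
    taken as the midpoint of its expected interval."""
    return (left + 2.0 * peak + right) / 4.0


def simulate_fuzzy_random_mean(left_spread, right_spread, peak_mean, peak_std,
                               samples=20_000, seed=0):
    """Monte Carlo estimate: draw the random peak, defuzzify, then average."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        peak = rng.gauss(peak_mean, peak_std)
        total += triangular_expected_value(peak - left_spread, peak,
                                           peak + right_spread)
    return total / samples


# Hypothetical demand: triangular (rho - 10, rho, rho + 20), rho ~ N(100, 5^2).
closed_form = triangular_expected_value(100 - 10, 100, 100 + 20)
estimate = simulate_fuzzy_random_mean(10, 20, 100, 5)
```

Because the defuzzified value is linear in the peak, the simulated mean should agree with the closed-form value evaluated at the mean of the peak, up to sampling error.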

##### 3.3. State Equation

The state equation describes the relationship between stage th and stage th. Let be the inventory level and be the demand. If the item is deemed suitable after inspection, it is then transported to the warehouse; therefore, the inventory level of Item in the corresponding warehouse at the beginning of the th stage, , is , or is zero. The relationship between the inventory level, purchase quantity, and demand can be modeled as follows:
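The recursion described above can be sketched as follows, assuming that only the non-defective share of each purchase reaches the warehouse and that inventory is truncated at zero when demand exceeds what is available. The exact form of the state equation, the function name, and the numbers are assumptions for illustration.

```python
def roll_inventory(initial_inventory, purchases, demands, defect_rates):
    """Roll one item's inventory forward stage by stage.

    Assumed state equation (an illustrative reading of the text):
        next = max(0, current + (1 - defect_rate) * purchase - demand)
    Returns the inventory at the start of every stage plus the final level,
    and the unmet demand (shortage) in every stage.
    """
    inventory = [float(initial_inventory)]
    shortages = []
    for purchase, demand, defect_rate in zip(purchases, demands, defect_rates):
        available = inventory[-1] + (1.0 - defect_rate) * purchase
        shortages.append(max(0.0, demand - available))
        inventory.append(max(0.0, available - demand))
    return inventory, shortages


# Hypothetical three-stage plan for a single item.
levels, shortages = roll_inventory(20, [100, 60, 100], [90, 120, 80],
                                   [0.05, 0.05, 0.10])
```

Once the purchase quantities are fixed, every inventory level follows from this recursion, which is why the state variables do not need to be searched independently.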

##### 3.4. Initial and Terminal Conditions

The initial conditions describe the storage level for Item before the beginning stage. The terminal conditions describe the storage level for Item at the end of the manufacturer-retailer network. Let and be the initial and terminal inventory levels for Item . Generally, in practice, the two conditions above can be set as and . The initial condition and terminal condition can be presented mathematically as follows:

##### 3.5. Constraint Conditions

If a manager decides to purchase Item in stage th, let and be the minimum purchase quantity for Item and the maximum purchase quantity for Item in stage th; the purchase quantity for Item in each stage must be within this specified range:

The retailer has financial constraints. Let be the total purchase budget; therefore, should be within the budget.

As maximum storage levels must be taken into consideration, the inventory level of each item in each stage cannot exceed the maximum storage level. Let be the maximum storage for Item . The storage level should satisfy the following condition:
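The three constraint families above (purchase range, purchase budget, and storage limit) can be checked together for a candidate plan. The sketch below handles a single item and, as a simplifying assumption, uses a flat unit cost per stage in place of the discount schedule; all names and numbers are hypothetical.

```python
def is_feasible(purchases, inventories, unit_costs,
                min_purchase, max_purchase, max_inventory, budget):
    """Check purchase bounds, the purchase budget, and the storage limit.

    purchases[j] and unit_costs[j] refer to stage j of a single item;
    inventories holds the inventory levels implied by the plan.
    A flat unit cost stands in for the discount schedule (an assumption).
    """
    within_bounds = all(min_purchase <= q <= max_purchase for q in purchases)
    within_budget = sum(q * c for q, c in zip(purchases, unit_costs)) <= budget
    within_storage = all(0 <= x <= max_inventory for x in inventories)
    return within_bounds and within_budget and within_storage


# Hypothetical two-stage plan with inventory levels 20 -> 25 -> 0.
plan_ok = is_feasible([100, 60], [20, 25, 0], [10, 10], 50, 150, 100, 2000)
```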

##### 3.6. Global Model

The IAPCEDID determines the quantity of item that needs to be purchased from the manufacturer and distributed to the retailers in stage to minimize the total expected cost function under the considered constraints and a carbon emissions cost that is added to account for the environmental considerations. The model proposed here is based on dynamic programming over a planning horizon that has multiple periods with initial and terminal conditions and state equation constraints. The objective function is made up of the purchase costs (), transportation costs (), inventory costs (), penalty costs (), and carbon costs (). As the items are classified as suitable or defective, the processes for both item inspection and defective item disposal are included. In summary, the global model is as follows:

#### 4. Dynamic Programming-Based GLNPSO

##### 4.1. General Mechanism of DP-Based GLNPSO

Based on the particle swarm optimization (PSO) proposed by Kennedy [31], the main PSO algorithm is developed based on a GLNPSO with multiple social structures [44]. In this study, with appropriate model transformations and based on an iterative dynamic programming model, a dynamic programming-based particle swarm optimization with multiple social learning structures (DP-based GLNPSO) algorithm is developed to solve the IAPCEDID. The proposed DP-based GLNPSO is a variant of the GLNPSO, with the main difference being the dimensionality reduction of the variables. The goal is to search for satisfactory solutions to (14) by constantly moving the particles towards the optimum. The notations needed are as follows: : iteration index, . : dimension index, . : particle index, . : inertia weight. : velocity of the th particle at the th dimension in the th iteration. : position of the th particle at the th dimension in the th iteration. : personal best position of the th particle at the th dimension. : local best position of the th particle at the th dimension. : near neighbor best position of the th particle at the th dimension. : global best position at the th dimension. : personal best position acceleration constant. : global best position acceleration constant. : local best position acceleration constant. : near neighbor best position acceleration constant. : vector position of the th particle, . : vector velocity of the th particle, . : vector personal best position of the th particle, . : vector local best position of the th particle, . : vector near neighbor best position of the th particle, . : vector global best position, . : the th part of the th particle in the th generation.

In the GLNPSO, the algorithm is initialized with a swarm of th random particles. Each particle consists of the personal best position , the global best position , the local best position , and the near neighbor best position . The local best is the best position for several adjacent particles and the near neighbor best is a social learning behavior that is determined based on the fitness-distance-ratio (FDR). Each particle is represented by its position in a space, where is the problem dimension. Unlike the GLNPSO, using the state equation in the dynamic programming model [32], the DP-based GLNPSO can reduce the particle dimensions, the details for which are shown in Figure 2. In this problem, the problem dimension contains decision variables and state variables , which are, respectively, related to the objectives and constraints. It should be noted that if the decision variables are known, then the state variables can be determined using the state equation.

The essential difference between the DP-based GLNPSO and the GLNPSO is that the DP-based GLNPSO takes advantage of the iterative mechanism in the dynamic programming model to reduce the dimensions of the particles, thereby significantly reducing the solution search space. It should be noted that if a GLNPSO were used in this study, the particle dimensions would be compared to for a DP-based GLNPSO particle.

where can be the th part of the th particle in the th generation. Note that every part of a particle is a vector, which can be denoted as

where is the th dimension of for the th particle in the th generation; . In order to be in line with the expression , .

##### 4.2. Initializing Strategy

Based on the state equation from dynamic programming theory, an initialization strategy is used to initialize the particles and avoid an infeasible position.

*Step 1.* Set , .

*Step 2.* Initialize by generating a random real number within .

*Step 3.* Then, based on (note that , where denotes the initial inventory level of Item in the warehouse at the beginning of the first stage). If , then go to Step . Otherwise, return to Step .

*Step 4.* If the stopping criterion is met, that is, and , then the initialization for the th particle is completed. Otherwise, and return to Step .
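The initialization steps can be sketched for a single item as a sample-and-check loop: draw a purchase quantity within its bounds, compute the resulting inventory through the state equation, and resample if the storage limit is violated. The state equation form, the retry limit, and all names and values below are assumptions made for illustration.

```python
import random


def initialize_purchases(stages, min_purchase, max_purchase, max_inventory,
                         initial_inventory, demand, defect_rate,
                         rng, max_tries=1000):
    """Sample one item's purchase plan stage by stage, resampling until the
    resulting inventory respects the storage limit (assumed state equation)."""
    purchases, inventory = [], float(initial_inventory)
    for _ in range(stages):
        for _ in range(max_tries):
            candidate = rng.uniform(min_purchase, max_purchase)
            next_inventory = max(
                0.0, inventory + (1.0 - defect_rate) * candidate - demand)
            if next_inventory <= max_inventory:
                break  # feasible: accept and move on to the next stage
        else:
            raise RuntimeError("no feasible purchase quantity found")
        purchases.append(candidate)
        inventory = next_inventory
    return purchases


# Hypothetical item: 4 stages, purchases in [50, 150], storage limit 100.
plan = initialize_purchases(4, 50, 150, 100, 20, 90, 0.05, random.Random(1))
```

The retry limit is a safeguard added here so that the loop cannot run forever when the bounds leave no feasible quantity.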

##### 4.3. Adjusting Strategy

An adjustment strategy is used to move updated particles back into the feasible region. After each update, to avoid an infeasible position, the particle is adjusted as follows.

*Step 1.* Set , .

*Step 2.* If , then . If , then .

*Step 3.* Based on = .

*Step 4.* If , then go to Step . Otherwise, let ; ; return to Step .

*Step 5.* If , then and go to Step . Otherwise, and return to Step .

*Step 6.* If the stopping criterion is met, that is, and , then the adjustment for the th particle is completed. Otherwise, and return to Step .
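One simple way to realize such an adjustment is a clamp-and-cap repair, sketched below for a single item: each quantity is first clamped to its purchase bounds and then reduced, if necessary, so that the next inventory level stays within the storage limit. This particular repair rule, the state equation form, and all names and values are assumptions for illustration and may differ from the rule used in the steps above.

```python
def adjust_purchases(purchases, min_purchase, max_purchase, max_inventory,
                     initial_inventory, demand, defect_rate):
    """Repair one item's purchase plan after a PSO update (illustrative rule).

    Assumes defect_rate < 1. If even min_purchase overflows the storage
    limit, the minimum is kept and the plan stays infeasible.
    """
    adjusted, inventory = [], float(initial_inventory)
    for q in purchases:
        # Clamp the quantity to the allowed purchase range.
        q = min(max(q, min_purchase), max_purchase)
        # Largest quantity that keeps the next inventory within storage.
        storage_limit_q = (max_inventory + demand - inventory) / (1.0 - defect_rate)
        q = max(min_purchase, min(q, storage_limit_q))
        adjusted.append(q)
        inventory = max(0.0, inventory + (1.0 - defect_rate) * q - demand)
    return adjusted


# Hypothetical updated particle with two out-of-range quantities.
repaired = adjust_purchases([200, 10, 120], 50, 150, 100, 20, 90, 0.05)
```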

##### 4.4. Updating Strategy and Decoding Strategy

Throughout the DP-based GLNPSO optimization process, the social learning behavior component includes the global best, the local best, and the near neighbor best. The search benefits from the sharing of information with the whole population about the particles’ discoveries and past experiences. In each generation, the is calculated as the best position the swarm reaches; the is calculated as the best position from several adjacent particles; the is a social learning behavior which is determined based on the fitness-distance-ratio (FDR) [45]; and is the inertia weight used to control the impact of the previous velocities on the current velocity, which influences the trade-off between the global and the local exploration abilities during the search. Each particle updates its velocity to approach the new , , , and , and then updates its position using the new velocity:

The DP-based GLNPSO decoding strategy transforms the particle into a corresponding purchase quantity for each item at the beginning of each stage. Based on the state equation; , decoding in the th dimension into the purchase quantity of item at the beginning of stage . The decoded result can be represented as .
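The update can be sketched in the usual GLNPSO form, in which the velocity combines an inertia term with four attraction terms (personal, global, local, and near neighbor best), and the near neighbor best is chosen per dimension by maximizing the fitness-distance-ratio. The sketch assumes a minimization problem; the function names and argument layout are hypothetical, and the formulas are the commonly used ones rather than a transcription of the paper's equations.

```python
import random


def fdr_neighbor_value(i, d, positions, fitness, pbest, pbest_fitness):
    """Near neighbor best for particle i in dimension d (minimization).

    Picks the personal best of another particle o that maximizes
    (fitness[i] - pbest_fitness[o]) / |pbest[o][d] - positions[i][d]|.
    """
    best_value, best_ratio = pbest[i][d], float("-inf")
    for o in range(len(positions)):
        distance = abs(pbest[o][d] - positions[i][d])
        if o == i or distance == 0.0:
            continue
        ratio = (fitness[i] - pbest_fitness[o]) / distance
        if ratio > best_ratio:
            best_value, best_ratio = pbest[o][d], ratio
    return best_value


def update_particle(x, v, pbest, gbest, lbest, nbest, w, cp, cg, cl, cn, rng):
    """One GLNPSO-style step: new velocity first, then new position."""
    new_v = []
    for d in range(len(x)):
        new_v.append(
            w * v[d]
            + cp * rng.random() * (pbest[d] - x[d])
            + cg * rng.random() * (gbest[d] - x[d])
            + cl * rng.random() * (lbest[d] - x[d])
            + cn * rng.random() * (nbest[d] - x[d])
        )
    new_x = [x[d] + new_v[d] for d in range(len(x))]
    return new_x, new_v
```

In the DP-based variant, the updated position would then be passed through the adjustment strategy before it is decoded into purchase quantities.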

##### 4.5. Overall Procedure

Based on the above sections, the overall procedure for the DP-based GLNPSO algorithm can be given. The algorithm is shown in Figure 3, the details for which are as follows.

*Step 1.* Initialize the particle and using the initialization strategy.

*Step 2.* Check the constraints based on the DP-based GLNPSO, and avoid an infeasible position.

*Step 3.* Calculate the initial particles to generate the fitness value, , , and the .

*Step 4.* Update particle positions and velocities, for .

*Step 4.1*. Update the personal best, if , .

*Step 4.2*. Update the global best, if , .

*Step 4.3*. Update the local best, and set , which obtains the least fitness value to be .

*Step 4.4*. Generate the near neighbor best, and set to maximize the FDR according to (20), where is .

*Step 4.5*. Update velocity and particle positions according to (19) and (21).

*Step 5.* Adjust the particles to the feasible region using the adjustment strategy.

*Step 6.* If the stopping criterion is met, go to Step ; otherwise, and return to Step .

*Step 7.* Determine the fitness value and global best position.

*Step 8.* Decode the particle and integrate and (for , ).

#### 5. Case Study

To illustrate the performance of the proposed DP-based method and to show the effect of a carbon cap on the optimization results, the method was applied to a particular case. A sustainable logistics item structure made up of five main parts is considered, as shown in Figure 1, in which each stage is one month, and four periods, five retail stores, and four items with corresponding warehouses are considered. After the items are inspected, suitable items are transported to the warehouses and defective items returned to the corresponding manufacturers. In this case, a strategy is generated to minimize the inventory, allocation, and carbon emissions costs. The carbon emissions can be converted into the carbon credits cost price, which has the same dimensions as the economic costs [9].

This case has four items and five retail stores. Each retail store’s demand for each item in each month is shown in Table 1; the purchase information and item inventory information are shown in Tables 2 and 3; and the distribution information is shown in Tables 4 and 5. All fuzzy random variables are represented by triangular fuzzy numbers, with the parameters obeying a normal distribution. The fuel consumption () is 0.245 (l/km), CO_{2} emissions for a unit of gasoline are 2.63 (kg/l), and the carbon credit price is 189.29 (CNY/ton). These emission parameters were taken from the Environmental Data for International Cargo Transport & Road Transport [46].

##### 5.1. DP-Based GLNPSO Parameter Selection

The DP-based GLNPSO parameters were determined based on practical situations and past studies, and the behavior of the algorithm was observed at different parameter settings. From a comparison of several parameter sets including the acceleration constants , , , and and the inertia weight , the most reasonable parameters were identified. Through further experiments, and were found to be the most suitable to control the impact of the previous velocities on the current velocity and to influence the trade-off between the global and local experiences. The other parameters were selected by comparing the results with the observations from the dynamic search swarm behavior. The selection of the acceleration coefficients , , , and affects both the convergence speed and the ability to escape from local minima. In this paper were chosen as the most suitable. For maximum generation and population size , the influence of the maximum iteration on the algorithm’s performance was tested to determine suitable parameters. In the test, the population size was varied from 10 to 30 with a step-length of 5 and the stopping criterion (the maximum generation) from 400 to 600 with a step-length of 50; therefore, there were 25 parameter groups. Figures 4(a) and 4(b) show the average results and computing times. The horizontal axis TN indexes the and groups; for example, “1~5” represents five different groups. When , increases from 400 to 600 with a step-length of 50, with the remaining groups following the same pattern. The algorithm was run 30 times for each group and the optimal results are presented in Figure 4(a). From Figure 4(a), when was from 10 to 20, the maximum iteration had a marked impact on the results, although the results stayed within a relatively tight range. The lowest result was obtained when and . From Figure 4(b), it can be seen that the maximum iteration significantly influenced computing time. Further, a significant positive correlation between the average computing time and the maximum generation was observed when the population size was fixed. As mentioned, the best values for maximum generation and population size were, respectively, identified as and .
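The 25 test groups follow directly from crossing the five population sizes with the five maximum-generation settings. The short enumeration below assumes that the groups are numbered with the population size as the outer loop, consistent with groups “1~5” sharing the smallest population size; the variable names are hypothetical.

```python
from itertools import product

population_sizes = range(10, 31, 5)       # 10, 15, 20, 25, 30
max_generations = range(400, 601, 50)     # 400, 450, 500, 550, 600

# Group numbers start at 1; groups 1-5 share population size 10.
groups = {
    number: (pop, gen)
    for number, (pop, gen) in enumerate(
        product(population_sizes, max_generations), start=1)
}
```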

**Figure 4:** (a) The average optimal result; (b) the computing time.

##### 5.2. Result Analysis

The experiments described in this section were implemented in MATLAB. Using the case data described earlier in Section 5, the performance of the DP-based GLNPSO approach was tested in MATLAB 7.10.0 (R2010a) on a Core i5-5200U processor (2.19 GHz) with 3.88 GB of memory.

Inventory and allocation decisions with rejected items over multiple stages are very important to the IAPCEDID. The specific purchase and inventory strategies for the four different items over the five stages without carbon cap consideration are shown in Table 6. The optimal results for the purchase, inventory, transport, and penalty costs are shown in Table 7. The total optimization cost was (CNY) and the losses caused by defective items were (CNY).

The specific purchase and inventory strategies when carbon emissions are considered in the inventory-allocation network are almost the same as when they are not. However, the total costs and carbon costs summarized in Table 8 indicate that a carbon cap can have a serious effect on total costs. As the carbon cap increased from to , the total costs gradually decreased from (CNY) to (CNY). At the same carbon credit price, the economic costs showed a downward trend as the carbon cap increased. When the carbon cap increased from to , the carbon costs fell from (CNY) to (CNY). This indicates that accounting for carbon emissions adds substantially to the manufacturer-retailer network costs. This calculation should convince decision makers to acknowledge the influence of carbon emissions costs and persuade them to take measures to reduce their emissions as much as possible. When the carbon cap increased from to , the carbon costs continued to decrease and became negative. According to the Kyoto Protocol, companies whose emissions stay below their limits can sell the unused allowance as carbon credits to other companies. In these cases, the company’s carbon costs decline towards a negative value, at which point the company is earning credits; in other words, a negative cost means that the firm is actually making money by reducing carbon emissions. The sensitivity of the total inventory-allocation network costs to a change in the carbon cap in the mathematical model is shown in Figure 5(a). Total costs decreased as the carbon cap increased, and the relationship between the carbon cap and overall costs was linear. It can be seen from Figure 5(b) that, as the carbon cap decreased, the carbon costs gradually took a larger share of the total costs.
Further, the results showed that a higher carbon cap was conducive to economic growth; however, as the environment becomes increasingly damaged, the carbon cap would tighten, eventually causing a negative economic effect. From the enterprise point of view, as the carbon cap is a strong external factor, it is better to select the best decision under each carbon cap according to the decision makers’ preferences.

**(a) Total cost according to different carbon caps**

**(b) Carbon cost relative to the total cost**
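The linear dependence of cost on the carbon cap, and the sign change of the carbon cost, can be illustrated with a small sketch. It assumes the common cap-and-trade form in which the carbon cost equals the credit price multiplied by the gap between emissions and the cap; the exact expression used in the model is defined earlier in the paper and may differ. The function names are hypothetical, and any numbers passed to them are illustrative only and are not taken from Table 8.

```python
def carbon_cost(emissions, cap, credit_price):
    """Carbon cost under an assumed cap-and-trade rule.

    Positive when emissions exceed the cap (credits must be bought),
    negative when emissions are below the cap (credits can be sold).
    """
    return credit_price * (emissions - cap)


def total_cost(economic_cost, emissions, cap, credit_price):
    """Economic cost (purchase, inventory, transport, penalty) plus carbon cost."""
    return economic_cost + carbon_cost(emissions, cap, credit_price)


def carbon_share(economic_cost, emissions, cap, credit_price):
    """Fraction of the total cost that is attributable to carbon."""
    return (carbon_cost(emissions, cap, credit_price)
            / total_cost(economic_cost, emissions, cap, credit_price))
```

If the purchase and inventory strategies, and therefore the emissions, barely change with the cap, then under this assumption the total cost falls by the credit price for every unit increase in the cap. That would be consistent with the linear trend in Figure 5(a) and with the carbon cost turning negative once the cap exceeds actual emissions.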

Environmental considerations can also have a substantial impact on the inventory-allocation system, especially when the network dimension is large or when more stringent restrictions are placed on carbon emissions. In this network, when the variety of items increased, the purchasing quantities became extremely large and the distribution network became more complex. This indicates that it is necessary to construct a manufacturer-retailer model that considers the carbon cap, so that an appropriate strategy can be chosen quickly when the external environment changes and different strategies can be offered to decision makers with different preferences. Therefore, given the global movement towards reducing carbon emissions, it is recommended that companies begin working towards sustainable inventory-allocation networks.

##### 5.3. Algorithm Comparison

As computing technology has developed, dynamic programming has become a useful complement to PSO for a range of optimization problems. The average optimal result obtained by the DP-based GLNPSO is shown in Figure 6(a). To demonstrate the feasibility and effectiveness of the proposed DP-based GLNPSO, it was compared with a standard PSO. To conduct the comparison under similar conditions, the parameters selected for the DP-based GLNPSO were also adopted for the standard PSO: , , , the acceleration constant , and the inertia weight and . The iterative process of each algorithm is shown in Figure 6(b) and the comparison details are shown in Table 9. It can be concluded that (1) both the DP-based GLNPSO and the PSO were able to obtain optimal solutions, but the computation time of the DP-based GLNPSO was shorter than that of the PSO; (2) the DP-based GLNPSO converged faster, indicating that it needs fewer iterations to find the optimal solutions; and (3) the DP-based GLNPSO was more stable than the standard PSO when searching for the optima, while the standard PSO occasionally fell into a local optimum. From the above comparisons, it can be concluded that the DP-based GLNPSO is able to produce sufficiently good feasible solutions for the IAPCEDID.

**(a) The average optimal result**

**(b) Iterative process of DP-based GLNPSO and PSO**
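The structural difference between the two algorithms compared above can be sketched through their velocity updates. The sketch assumes the usual GLNPSO formulation, in which each particle is pulled towards its personal best, the global best, a local (neighborhood) best, and a near-neighbor best, each with its own acceleration constant. The function names and the symbols `w`, `cp`, `cg`, `cl`, and `cn` are hypothetical stand-ins, the best positions are taken as given inputs, and the dynamic programming layer and fuzzy random simulation are not shown.

```python
import random


def glnpso_velocity(v, x, pbest, gbest, lbest, nbest,
                    w, cp, cg, cl, cn, rng=random):
    """One velocity update with four social learning terms (assumed form).

    v, x                 : current velocity and position (lists of floats)
    pbest, gbest         : personal best and global best positions
    lbest, nbest         : local best and near-neighbor best positions
    w                    : inertia weight on the previous velocity
    cp, cg, cl, cn       : acceleration constants of the four terms
    """
    new_v = []
    for d in range(len(x)):
        # Independent uniform random numbers for each learning term.
        u = [rng.random() for _ in range(4)]
        new_v.append(
            w * v[d]
            + cp * u[0] * (pbest[d] - x[d])
            + cg * u[1] * (gbest[d] - x[d])
            + cl * u[2] * (lbest[d] - x[d])
            + cn * u[3] * (nbest[d] - x[d])
        )
    return new_v


def standard_pso_velocity(v, x, pbest, gbest, w, cp, cg, rng=random):
    """Standard PSO update: the special case with no local or near-neighbor terms."""
    return glnpso_velocity(v, x, pbest, gbest, x, x,
                           w, cp, cg, 0.0, 0.0, rng)
```

Under these assumptions, the two extra terms give each particle additional reference points besides the global best, which is one plausible reading of why the standard PSO was observed to fall into a local optimum more often.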

#### 6. Conclusion

In this paper, a sustainable inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) under a fuzzy random environment was presented. The aim of the model was to find the optimal purchase quantities that minimize the total network costs, which were made up of purchase, inventory, transport, penalty, and carbon costs. In the model, inventory and distribution planning under a fuzzy random environment was considered, with annual demand, transportation costs, inventory conversion factors, and the percentage of defective items treated as fuzzy random variables. The findings of this research extend those of previous studies, most of which assumed no defective items in the purchase process. Given a carbon credit price, carbon emissions can be converted into carbon costs, which have the same dimensions as the economic costs. Carbon costs were added to the model to analyze the impact of a carbon cap on total costs. Such an extension is necessary for decision makers to balance operational costs against environmental impact. Considering the complexity of the model, a heuristic solution algorithm was proposed: a dynamic programming-based particle swarm optimization with multiple social learning structures (DP-based GLNPSO), combined with fuzzy random simulation. A brief comparison was made between the DP-based GLNPSO and the classic PSO to further illustrate the merits of the algorithm.

This study expands existing research on sustainability in supply chains and paves the way for the development and implementation of sustainable inventory-allocation networks, which can guide managers in evaluating the sustainable practices in their manufacturer-retailer networks. The framework can help managers achieve economic growth and environmental protection simultaneously. The suggested model can be extended by considering different carbon credit pricing schemes and by dealing with very strict carbon footprint control scenarios.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant no. 71640013), the National Planning Office of Philosophy and Social Science (Grant no. 14BGL055), Human Social Science for Universities of Hebei (Grant no. BJ2016057), and the System Science and Enterprise Development Research Center (Grant no. Xq16C10).