Abstract
The development of big data and artificial intelligence is driving industrial innovation in traditional commodities, and economic transformation and upgrading has become the general trend of social development. Based on the public data of Rossmann, a well-known German chain store, a consumption prediction model for daily necessities based on random forest and GBRT is proposed in this paper. By using the random forest to initialize the residuals of GBRT, the combined model not only trains faster but also predicts more accurately. Based on the analysis of the big data, this paper discusses the influence of daily necessities consumption forecasting on the upgrading of industrial structure and constructs a value creation and transmission system for daily necessities, so as to promote the rationalization level of the consumer industrial structure and provide a reference for upgrading the industrial structure of the same type of commodity.
1. Introduction
As a “dark horse,” the digital economy is an important force in upgrading the industrial structure [1]. In the tide of China’s economic development over the past decade, the development of daily commodities cannot be ignored: their trading has moved from offline to online, which has changed people’s consumption patterns and payment methods. With the deepening integration of China’s consumer economy and industrial economy [2], the development of the commodity consumption industry is bound to promote the transformation of China’s industrial structure. In view of this, studying the influence of the commodity consumption structure on the transformation and upgrading of China’s industrial structure under big data is important for exploring whether it can become a feasible path to promote that transformation and upgrading.
The consumption of daily-use commodities is influenced not only by factors such as the commodity’s own quality and the supply and demand relationship of the market [3] but also by objective environmental factors such as the holiday effect [4], product promotion [5], and the layout of consumption competition [6], and the changes in these factors are often nonlinear and random. Therefore, against the background of prevalent big data, it is difficult to predict daily necessities consumption with a simple linear model. In view of this, it is necessary to dig out the influence of various nonlinear factors on sales results, avoid the disadvantage that some nonlinear models easily fall into local minima and converge slowly, and analyze the existing consumption structure efficiently and accurately.
The upgrading of China’s industrial structure not only requires creating new consumption demand but also needs to promote economic development [7], so as to improve the per capita income level, affect the consumption structure of residents, and greatly promote its optimization. Therefore, starting from this problem, this paper analyzes the forecast data extensively. In addition, a consumption forecast model based on random forest and GBRT is designed, and good results are obtained through experimental analysis, which helps to correctly analyze the consumption structure of residents’ daily necessities, helps enterprises reduce sales costs and allocate resources rationally, and finally promotes the upgrading of the consumption industrial structure.
2. The Role of Optimization of Daily Necessities Consumption Structure in Industrial Upgrading under Big Data
2.1. Overview of Consumption Structure and Industrial Upgrading
From the economic point of view, the consumption structure refers to the proportional relationship between people’s consumption of various types of materials [8]. It can be divided into two forms, the value form and the entity form: the entity form mainly refers to the specific amount of materials consumed by consumers, while the value form refers to the proportional relationship of consumption. Industrial upgrading, in turn, mainly refers to the optimization and upgrading of industrial institutions, whose purpose is to improve efficiency and promote the development of key industrial technologies [9]. Industrial upgrading reflects the improvement in production efficiency brought by the coordinated development of industrial institutions [10], that is, the optimization and coordination of the various elements in industrial production.
2.2. Relationship between Consumption Structure and Industrial Upgrading
First of all, the optimization of the consumption structure and industrial upgrading are interactive processes. The optimization of residents’ consumption structure can increase demand and thus affect market supply and demand; industrial output is then adjusted according to market demand, which in turn affects the adjustment of the industrial structure [11]. Second, in the process of China’s industrial development, the quality of industrial technology is constantly increasing, and consumers are gradually leaning towards highly technical products. Third, the consumption structure and the industrial structure promote each other: a change in the consumption structure can advance the reform of the industrial structure and finally promote the upgrading of the industry, which in turn points out the direction for further changes in the consumption structure. Although there is a mutually promoting relationship between the consumption structure and the industrial structure, the change of the consumption structure is the basis of industrial upgrading, and it also reflects the relationship between social supply and demand [12]. Because the change of the consumption structure precedes the change of the industrial structure, it can provide guidance for the latter. Therefore, the upgrading of the industrial structure must be based on the consumption structure [13], which can affect the structure of market supply and demand. In the process of product upgrading, the consumption structure changes accordingly, and the adjustment of production strategy is driven by economic benefits. Under the effect of price guidance and industry association, more capital investment is attracted to the production field, leading the adjustment of the industrial structure and the overall upgrading of industry [13].
2.3. Impact of Optimizing Consumption Structure on Industrial Upgrading under Big Data
Big data on the urban consumption structure comes from the exchange and integration of data generated during the operation of many physical facilities and human activities in cities [14]. With proper processing and analysis technology, these data can be used to infer the operation status of commodity sales, production plans, economic development trends, and various complex relationships. Therefore, city data not only provides the basic information for understanding the industrial upgrading and development of the whole city [15] but also plays a central role in promoting urban intelligence. The perception layer detects and obtains urban data through sensors. The network layer focuses on unified network construction and information fusion, while the data layer sorts out the large amount of data generated by the information systems, thus forming the urban public database platform layer, which mainly includes various public information platforms, cloud computing facilities, and big data analysis and processing platforms [16].
The application of big data is accelerating the upgrading and integration of information technology and industrial structure, which gives birth to new formats and further opens up the development space of the information technology industry [17]. Therefore, in the big data environment, predicting residents’ daily necessities consumption can motivate the coordination of the consumption structure, which supports industrial upgrading and development and forms a better industrial development model. Through industrial upgrading, the cost of products can be reduced, which lowers the relative price of commodities, taps the consumption potential of residents, and effectively promotes the upgrading of the industrial structure.
3. Overview of Related Theories in Combination Model
3.1. Integration of Boosting and Bagging
Boosting is a method of combining several weak classifiers into a strong classifier [18]. Its general idea is to train several relatively simple models, called weak classifiers, on the original training set. Each classifier has its own weight, which determines its influence in the combined model. During classification, each round slightly increases the weight of the samples misclassified by the previous weak classifier, so that the next weak classifier focuses on the samples with greater weight, that is, the samples the previous classifier got wrong. After several iterations, each round improves on the previously misclassified samples. Finally, all classifiers are accumulated to obtain an excellent classification effect [19]; the flow chart of the algorithm is shown in Figure 1.
It can be seen from the figure that as the number of iteration rounds increases, the weight of the misclassified results in each round increases continuously, which means that the model pays more and more attention to correcting errors. When the weak classifiers iterate, it is necessary to set the loss function and the method of minimizing it. The loss function describes the degree of deviation of the model [20], that is, the deviation between the predicted value and the actual value. Together, the two determine the direction in which the algorithm is optimized and revised according to the previous result during the iterative process. Different loss functions and minimization methods give boosting different effects in different scenarios.
The biggest difference between bagging and boosting is that bagging selects its training sets randomly and the training sets of different rounds are independent of each other, whereas in boosting the weights of the training set are modified according to the training results of each round, so that successive training sets depend on one another [21]. As a result, the rounds of bagging training can be carried out in parallel, while boosting needs serial iteration. The bagging-based model therefore has an obvious advantage in training speed, but its space cost is relatively high.
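The contrast between the two sampling schemes can be sketched in a few lines. This is a minimal illustration rather than the method used in this paper: the weight update follows the AdaBoost form, which is one common instance of boosting, and the function names and toy inputs are assumptions made for the example.

```python
import math
import random


def bootstrap_indices(n_samples, rng):
    """Bagging: draw one training set by sampling indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def reweight(weights, is_wrong):
    """Boosting: raise the weight of misclassified samples (AdaBoost-style)."""
    err = sum(w for w, wrong in zip(weights, is_wrong) if wrong)
    err = min(max(err, 1e-12), 1 - 1e-12)      # keep the logarithm finite
    alpha = 0.5 * math.log((1 - err) / err)    # weight of this weak classifier
    scaled = [w * math.exp(alpha if wrong else -alpha)
              for w, wrong in zip(weights, is_wrong)]
    total = sum(scaled)
    return [w / total for w in scaled], alpha


rng = random.Random(0)
# Bagging rounds are independent of each other, so they could run in parallel.
bags = [bootstrap_indices(5, rng) for _ in range(3)]

# Boosting rounds are serial: each round needs the previous round's mistakes.
weights = [0.2] * 5
weights, alpha = reweight(weights, [False, True, False, False, False])
```

After the update, the single misclassified sample carries half of the total weight, which is what forces the next weak classifier to concentrate on it.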
3.2. Theory of Random Forest
Because a single regression tree has problems such as low prediction accuracy and weak generalization ability [22], improving prediction accuracy by combining multiple models has become the mainstream method. The combination method uses certain sampling methods to generate training sets for multiple models; the models do not influence each other, and each is trained on its own training set. The final prediction is obtained by voting on the results of the individual models (for classification problems) or by averaging them (for numerical prediction problems) [23]. At present, there are many studies and applications of random forests on classification problems but few on regression problems. Predicting the consumption structure of residents’ daily necessities is essentially the prediction of a continuous numerical variable and therefore belongs to the application of random forest to regression problems.
Random forest overcomes some limitations of a single tree through randomness when combining multiple trees [24]. The randomness has a great influence on the operation of the forest, which is mainly manifested in the following aspects:
(1) Depth of the trees in the forest
(2) Evaluation function of node splitting
(3) Feature selection during node splitting
(4) Generation of the training set
3.3. GBRT (Gradient Boosted Regression Tree)
The core idea of gradient boosting is that its final result is the accumulation of all the trees participating in the iteration, so the predicted value of each tree is not a direct result (sales in this paper) but a contribution to the accumulated sales prediction [25]. First, a loss function needs to be initialized. For the application scenario of this paper, the loss function is defined as the root mean square error (RMPSE); the loss function points out the direction for the iteration of GBRT. Therefore, in this paper, every iteration of GBRT optimizes the loss function, that is, every iteration is chosen to minimize the RMPSE of the model’s prediction results. Compared with the random forest, GBRT compensates for and improves on the following points:
(1) The prediction accuracy of GBRT is not limited by the performance of a single learner in the combined model. Even if the performance of an individual learner is poor, the model can still approach the ideal result through several rounds of iteration, although its convergence is relatively slow
(2) In the iteration process, the poor result of one tree can still be repaired by the following iterations. By controlling the step weight of each iteration, overfitting can be avoided
(3) The training process of GBRT is a serial iteration, and the space requirement is not high
(4) GBRT is generally trained until the model converges, and theoretically it can achieve a prediction accuracy beyond that of the random forest
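The loss used throughout this paper can be made concrete with a small sketch. The text expands RMPSE as a root mean square error; the code below assumes the percentage form (root mean square percentage error) that serves as the evaluation metric of the public Rossmann Store Sales competition, with days of zero sales skipped. The function name and the sample values are illustrative only.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean square percentage error; rows with zero true sales are skipped."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    if not pairs:
        raise ValueError("need at least one nonzero true value")
    return math.sqrt(sum(((t - p) / t) ** 2 for t, p in pairs) / len(pairs))


# Hypothetical values: two days are each off by 10%; the zero-sales day is ignored.
score = rmspe([5000, 4000, 0], [4500, 4400, 10])
```

Because the error is relative, a miss of 500 on a day with sales of 5000 counts the same as a miss of 400 on a day with sales of 4000.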
3.4. Application of Data Mining in Consumption Structure of Daily Necessities
The application of data mining technology in the optimization of industrial organization is mainly divided into three aspects: classification, regression, and association rule analysis [26]. The consumption forecast of daily necessities usually has two goals [27]. One is the forecast of the sales trend of daily necessities, that is, whether future sales will rise or fall. This problem is similar to the classification problem in data mining and can be converted into a two-category question of whether the scale of sales is rising or falling. The other is the prediction of daily necessities sales, that is, the specific sales value of a certain quarter or a certain day in the future. This problem is similar to the regression problem in data mining and can be converted into a regression on historical sales data. Another kind of problem is to analyze the shopping behavior of users. Its goal is to find the implicit rules between different commodities purchased by customers, that is, that when purchasing commodity A they will also buy commodity B, and it belongs to the category of association rule analysis.
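As a small illustration of the first two conversions, the same sales series can be turned into a two-category target or into regression pairs. This is only a sketch: the lag-based regression setup, the function names, and the sales values are assumptions made for the example and are not taken from the dataset.

```python
def trend_labels(sales):
    """Two-category target: 1 if sales rose versus the previous period, else 0."""
    return [1 if cur > prev else 0 for prev, cur in zip(sales, sales[1:])]


def regression_pairs(sales, n_lags=2):
    """Regression target: the next value, paired with the previous n_lags values."""
    return [(sales[i - n_lags:i], sales[i]) for i in range(n_lags, len(sales))]


daily_sales = [5000, 5200, 5100, 5100, 5600]   # hypothetical values
labels = trend_labels(daily_sales)             # classification view
pairs = regression_pairs(daily_sales)          # regression view
```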
4. Prediction and Analysis of Daily Necessities Consumption Based on Rossmann’s Store
4.1. Background Analysis
The objective results of consumption prediction of daily necessities are generally divided into two types [28]. The first is trend forecasting, such as the increase or decrease of business profits, which can be abstracted as a classification problem whose label data are divided into two categories: increase and decrease. The second is numerical prediction. Given the characteristics of the consumption prediction problem and the prediction target, different consumption forecasting problems call for concrete analysis and for choosing the appropriate forecasting method according to the forecast demand and the form of the consumption data. Consumption forecasting therefore poses the following difficulties: (1) abstraction of the prediction problem, which is analyzed according to the background and target data of daily necessities consumption; (2) data preprocessing, since historical data are inevitably affected by missing values, noisy data, and so on; and (3) optimization of prediction: to obtain highly accurate results, the model needs to be continuously optimized and improved, and it is often necessary to improve accuracy by combining models and ensemble learning.
To sum up, to create a scheme for consumption prediction of daily necessities, it is first necessary to analyze the content and direction of the problem to be predicted, grasp the pain point of solving it, and then abstract it into a basic problem in the field of data mining. Afterwards, from the perspective of the prediction data, appropriate preprocessing steps are taken to improve the quality of the data sets. Then, in combination with the actual business logic, feature extraction is carried out to expand the dimensions of the data from multiple angles. Moreover, the appropriate prediction method is selected and the training model is established according to the background and data characteristics. After the model is established, it should be continuously optimized according to the results, and the results should be tested continually. The process is shown in Figure 2.
4.2. Analysis of Data Characteristics
The data characteristics are analyzed according to the introduction of the daily necessities consumption data of Rossmann, a well-known German chain store [29].
The store information can be associated with the training data and the test data through the store id, which serves as a foreign key. The customer number feature is obtained by counting the actual number of people who arrived at the store on each historical day; like sales, it cannot be known in advance, so it cannot be used in the training process and can only serve as a dimension for analyzing the problem. In the test set, each id represents a pairing of store id and date, that is, different ids correspond to different combinations of store and date. From the time span of the test data, it can be seen that the problem is mainly aimed at predicting future sales.
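The association through the store id and the exclusion of the customer count can be sketched as follows. The column names and records are illustrative assumptions, not a reproduction of the actual files.

```python
def join_store_info(records, stores):
    """Attach store attributes to each record via the store id (foreign key)."""
    return [{**rec, **stores[rec["Store"]]} for rec in records]


def split_features_target(rows, target="Sales", unavailable=("Customers",)):
    """Drop columns unknown at prediction time and separate the target."""
    features = [{k: v for k, v in row.items() if k != target and k not in unavailable}
                for row in rows]
    return features, [row[target] for row in rows]


# Hypothetical records; the column names are illustrative assumptions.
stores = {1: {"StoreType": "a"}, 2: {"StoreType": "c"}}
train = [
    {"Store": 1, "Date": "2015-07-01", "Sales": 5000, "Customers": 550},
    {"Store": 2, "Date": "2015-07-01", "Sales": 6100, "Customers": 620},
]
features, sales = split_features_target(join_store_info(train, stores))
```

The customer count stays available for exploratory analysis in the joined rows, but it never reaches the feature set handed to the model.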
4.3. Prediction of Consumption in Daily Necessities Based on Combination Model
4.3.1. Prediction Model Based on Random Forest
Compared with other data mining algorithms, random forest has fewer parameters to be determined. The main ones are [30]: the number of candidate features selected by each decision tree when splitting a node, the number of decision trees in the forest, and the depth of the decision trees in the forest, Max_depth.
The number of decision trees in the random forest is directly proportional to the complexity of the forest. When the number of trees is small, the error of the random forest is large and its performance is poor. When it is large, the diversity of the trees is ensured and the quality of the results improves; however, the construction time of the algorithm becomes longer and the interpretability of the forest is weakened. Therefore, the number of trees is varied from 10 to 500 to compare the quality of random forest prediction under different forest sizes.
In the process of building a random forest, when splitting a node, each decision tree randomly extracts a candidate subset of features from the original feature set and then selects the best feature from this candidate subset as the basis for splitting the node. The smaller the candidate subset, the stronger the randomization of the tree and the lower its accuracy. The larger the candidate subset, the lower the diversity among the trees in the forest, so its size determines the balance between the performance and the diversity of the random forest. In this paper, various factors are weighed in setting the size of the candidate subset relative to the total number of features in the original dataset.
The depth of the tree determines the number of features that influence the prediction. The larger the depth, the more finely the samples are divided and the more features participate in node splitting, so the fitting effect of the model is generally better. The smaller the depth, the coarser the granularity of sample division and the fewer the node splits; only the features with better attributes participate in splitting, so the generalization ability of the tree is better but its fitting effect is relatively poor. In order to prevent the trees in the forest from falling into overfitting, Max_depth is set to 25.
The specific steps of building the random forest prediction model in this paper are as follows:
(1) According to the set forest size, the corresponding number of training sets are extracted from the original data set with replacement as the input data of the regression trees, and the remaining samples are used as the data set for testing the generalization of the model
(2) Establish a regression tree for each training set. When a tree selects features at a split node, it randomly selects a fixed number of features from all features as the candidate feature subset and picks the optimal feature in this subset according to the rule of minimizing the residual sum of squares
(3) The test data are applied to the trained regression trees. The leaf node of a single tree is essentially a set obtained by dividing the samples of the training set, and the prediction of the tree is obtained by averaging the samples in that set
The code of the algorithm is as follows
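A minimal sketch of the three steps, written for illustration and not to be read as the authors’ listing. It assumes depth-1 regression trees (stumps) in place of full trees to stay short, so the random candidate features are drawn once per tree, which coincides with once per split node in this special case. All names and the toy data are hypothetical.

```python
import random


def fit_stump(rows, targets, feature_ids):
    """Depth-1 regression tree: pick the split with the smallest residual sum of squares."""
    mean = sum(targets) / len(targets)
    best = (float("inf"), None, None, mean, mean)
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            rss = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if rss < best[0]:
                best = (rss, f, threshold, lm, rm)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, lm, rm = stump
    if f is None:              # no valid split was found; fall back to the mean
        return lm
    return lm if row[f] <= threshold else rm


def fit_forest(rows, targets, n_trees, n_candidates, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Step 1: bootstrap sample (with replacement); unused rows are out-of-bag.
        idx = [rng.randrange(n) for _ in range(n)]
        oob = sorted(set(range(n)) - set(idx))
        # Step 2: random candidate features, best split by residual sum of squares.
        feats = rng.sample(range(n_features), n_candidates)
        stump = fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats)
        forest.append((stump, oob))
    return forest


def predict_forest(forest, row):
    # Step 3: average the predictions of all trees.
    return sum(predict_stump(s, row) for s, _ in forest) / len(forest)


# Hypothetical toy data: (promotion flag, day of week) -> sales
rows = [(0, 1), (0, 2), (1, 3), (1, 4), (0, 5), (1, 6)]
targets = [4000, 4100, 5200, 5300, 4050, 5250]
forest = fit_forest(rows, targets, n_trees=20, n_candidates=1)
estimate = predict_forest(forest, (1, 2))
```

The out-of-bag indices stored next to each tree are the rows that tree never saw, which is what allows them to serve as a generalization check as described in step (1).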

4.3.2. Prediction Model Based on GBRT
In this paper, RMPSE is taken as the loss function of GBRT; to introduce the construction of GBRT, the plain prediction error is used as the loss function here. Assume that there is a store A in the data set whose sales are 5000 on one day; the iterative process of GBRT aiming at minimizing the residual error is then as follows:
(1) Assume that in the first iteration the prediction of regression tree 1 for store A is $\hat{y}_1$; the residual is then $r_1 = 5000 - \hat{y}_1$
(2) In the second iteration, regression tree 2 no longer uses 5000 as the label of store A but the residual $r_1$ of the first round. If the prediction $\hat{y}_2$ of regression tree 2 for the residual of store A equals $r_1$, the accumulated predictions of the first and second rounds equal the true value of store A, which means that the model has completed the correct prediction. If not, the residual after this round is $r_2 = r_1 - \hat{y}_2$
(3) As above, every iteration aims at minimizing the residual error, and the learning target of each round is the residual of the previous round. The model iterates repeatedly until the target residual value is met or the set maximum number of iterations is reached
The realization of GBRT is as follows.
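A minimal sketch of this residual-fitting loop, given for illustration and not as the authors’ listing. It assumes squared error, a single numeric feature with at least two distinct values, and depth-1 regression trees; the names, the step size, and the data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Regression stump on one feature, chosen by residual sum of squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:       # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        rss = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or rss < best[0]:
            best = (rss, threshold, lm, rm)
    return best[1:]


def boost(xs, ys, n_rounds, step):
    preds = [0.0] * len(ys)
    trees, history = [], []
    for _ in range(n_rounds):
        # The learning target of this round is the residual of the previous round.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, lm, rm = fit_stump(xs, residuals)
        # Accumulate a shrunken copy of the new tree's prediction.
        preds = [p + step * (lm if x <= threshold else rm) for x, p in zip(xs, preds)]
        trees.append((threshold, lm, rm))
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return trees, preds, history


xs = [1, 2, 3, 4, 5, 6]                          # hypothetical feature, e.g. a day index
ys = [5000, 5100, 4200, 4300, 6000, 6100]        # hypothetical daily sales
trees, preds, history = boost(xs, ys, n_rounds=50, step=0.5)
```

The final prediction is the sum of all shrunken tree outputs, and `history` records how the training mean squared error falls from round to round.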

4.3.3. Initialization of GBRT under Random Forest
GBRT also uses the regression tree as the basic learner in the model. In each iteration, the prediction error of the previous tree is taken as the target. The gradient boosting method has the concept of a shrinkage (contraction) step size; by setting this step size, GBRT can avoid the hidden danger of overfitting caused by too fast convergence.
In this paper, the random forest is used to initialize the residual error of GBRT, which provides the GBRT iteration with an initial residual that already lies close to the end point and in the right direction, so that GBRT can converge after a few rounds of iteration. Because the random forest has good prediction performance in a moderate precision range, the overall effect even surpasses that of plain GBRT. However, because too fast convergence may cause the model to fall into overfitting, an underfitting random forest is selected in the setting of the random forest, and GBRT iterates on it with a smaller shrinkage (contraction) step.
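Under the assumption of a squared-error loss, the scheme can be written compactly as follows. The notation is introduced here for illustration only: $f_t$ are the $T$ trees of the underfitting random forest, $h_m$ is the regression tree fitted in round $m$, and $\nu$ is the shrinkage step.

```latex
\begin{aligned}
F_0(x) &= \frac{1}{T}\sum_{t=1}^{T} f_t(x)
  && \text{(random forest average used as the starting point)}\\
r_i^{(m)} &= y_i - F_{m-1}(x_i)
  && \text{(residual that tree } h_m \text{ is trained on)}\\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x), \quad 0 < \nu \le 1
  && \text{(shrunken update)}
\end{aligned}
```

Compared with plain GBRT, only $F_0$ changes: the first residuals $r_i^{(1)}$ are already small, so fewer rounds are needed, while a small $\nu$ keeps each round from overcorrecting.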
5. Analysis of Results
5.1. Experimental Analysis of Prediction Model Based on Random Forest
The random forest model is built with the RandomForestRegressor class of the machine learning library scikit-learn. For different values of parameters such as the number of trees and the number of candidate features, with the initial sizes of the experimental training set and test set selected as 1 and 0.15, the results are shown in Table 1:
With the number of candidate features fixed, the change of RMPSE with the growth of the number of trees is shown in Figure 3:
From the results in Figure 3, it can be seen that the prediction accuracy increases with the number of trees, but once the number of trees exceeds 160, the improvement of prediction accuracy tends to flatten. In addition, for the same number of trees, a smaller number of candidate features also helps to increase the diversity of the trees in the random forest. Because the training set used for each tree is random and the feature selection at each node split is also random, the diversity of the trees in the forest increases with the number of trees, and the effect is better when the final result is formed. This also explains why the prediction accuracy improves as the number of trees increases and why the randomness of selecting subsets is stronger when the number of candidate features is small.
With the number of trees and the number of candidate features fixed, the experimental results for different proportions of the original training set are shown in Table 2:
According to the above results, the bootstrap sampling used in random forest to generate the training sets already produces a diverse sample space for each tree, so generating the training sets from only part of the original data set brings no significant improvement, and too few samples reduce the fitting effect of the model. Therefore, this paper weighs these factors in selecting the number of trees, the number of candidate features, and the ratio of training set to original data as the parameters of the final model.
5.2. Experimental Analysis of Prediction Model Based on GBRT
XGBoost, a gradient boosted tree library for Python, is used for modeling and coding the iterative regression tree. Some of the logic of GBRT is reconstructed on the basis of the XGBoost library, an underfitting random forest is selectively added to the initialization of GBRT as its input, and the matplotlib library is then used to analyze the importance of variables.
The XGBoost library is a concrete implementation of the boosted tree model. With its excellent performance in large-scale concurrent design, XGBoost can achieve several times the training speed of common open source toolkits. matplotlib is a plotting framework for Python that is mainly used for data visualization in scientific computing, and all kinds of graphics can be drawn with it easily. Training stops when the loss function has not been reduced over the most recent 10 iterations; the resulting iterative process of GBRT is shown in Table 3.
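The stopping rule mentioned here, with no reduction of the loss within 10 iterations, can be sketched independently of any library. The sketch assumes that the monitored loss is the error on the check set, which the text does not state explicitly; the function name and the error values are hypothetical. XGBoost offers a comparable option through the `early_stopping_rounds` argument of `xgboost.train`.

```python
def early_stopping_round(val_errors, patience=10):
    """Return the 1-based round at which training would stop.

    Training stops once the error has not improved on its best value for
    `patience` consecutive rounds; otherwise all rounds are used.
    """
    best, best_round = float("inf"), 0
    for round_no, err in enumerate(val_errors, start=1):
        if err < best:
            best, best_round = err, round_no
        elif round_no - best_round >= patience:
            return round_no
    return len(val_errors)


# Hypothetical per-round errors: improvement ends after round 3.
errors = [0.50, 0.40, 0.35] + [0.36] * 15
stop = early_stopping_round(errors, patience=10)
```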
From the above table, we can see the state of each round of GBRT iteration; the model finally stops at the 400th round. From the change of RMPSE on the training set and the check set, it can be found that the model as a whole overfits and that the training set converges quite fast. At the 400th round of iteration, the RMPSE of the training set has dropped to 0.39239, but the RMPSE of the check set is still at 0.91924. It can be concluded that the model has fallen into overfitting in pursuit of convergence, and if it were put into prediction at this point, it would give unsatisfactory results.
After 1000 rounds of GBRT iteration, the fitting accuracy reaches an RMPSE of 0.0913. The model approaches the final result at a visible speed: each iteration advances along the gradient descent direction of the loss function on the basis of the previous round, which makes RMPSE decrease continuously and gives GBRT better accuracy than random forest. Its iterative calculation mode is also clearer and more controllable in the optimization direction than the combination voting of random forest.
5.3. Results of Initialization of GBRT under Random Forest
In this paper, the error of a random forest of scale 80 is used to initialize GBRT, and the resulting iterative process of GBRT is shown in Table 4:
GBRT initialized by random forest starts iterating from the error level of the random forest, whereas without initialization about 300 to 400 iterations are needed to reach the same error level, at which point the error on the verification set is larger, that is, the prediction in actual situations is worse. After 800 rounds of iteration, the accuracy of the model has clearly exceeded that of the pure GBRT model, which means that GBRT initialized by random forest trains faster and reaches a much better optimization error.
6. The Influence of Consumption Prediction of Daily Necessities on the Upgrading and Transformation of Industrial Structure
By combining the random forest with GBRT as in Section 4, the residual of the underfitting random forest is used as the starting point of the GBRT iteration, so that the training of the model is accelerated and its performance is better than that of the single GBRT model. This benefits from the excellent performance of the random forest in the moderate precision range and its remarkable generalization ability. GBRT can count the importance of global features according to the variable selection of each iteration. Figure 4 shows the importance of features in this iteration.
According to the ranking of feature importance, features related to date and competitive sales occupy the top three, while holidays and other factors rank lower, which basically accords with reality. As a chain store comparable to 7-Eleven in China, Rossmann is not sensitive to major holidays but is more influenced by the geographical distribution of similar competing stores on weekdays. This reflects that in recent years, under the continuous impact of e-commerce and online shopping, traditional promotion activities no longer attract people as strongly as before. The conclusions obtained by the model basically accord with actual operation and have good explanatory power.
To sum up, it can be concluded from the analysis that the network constructed from the time series of daily necessities sales in stores is negatively correlated: nodes with medium degree tend to be connected with nodes with small degree, while the degree correlation of the network depends on the specific commodity, with a consistent correlation. Besides, the model accurately predicts daily necessities consumption behavior, which shows that daily necessity consumption can significantly promote the rationalization and upgrading of the industrial structure but has no significant impact on the quality of China’s industrial structure, and there are differences among different factors. The prediction of daily necessities consumption behavior can effectively improve the rationalization level of the consumer industrial structure.
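Counting importance from the variable selection of each iteration can be sketched as a simple tally of how often each feature is chosen for a split. This split-count view is an assumption about how the importance is computed; the feature names and the per-tree lists below are hypothetical.

```python
from collections import Counter


def split_count_importance(trees):
    """Share of all splits that use each feature, most frequent first."""
    counts = Counter(feature for tree in trees for feature in tree)
    total = sum(counts.values())
    return {feature: count / total for feature, count in counts.most_common()}


# Each inner list names the features one tree split on (hypothetical).
trees = [
    ["Date", "CompetitionDistance"],
    ["Date", "Promo"],
    ["Date", "CompetitionDistance", "StateHoliday"],
]
importance = split_count_importance(trees)
```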
The main driving force of industrial structure transformation and upgrading comes from the adjustment of supply and demand structure and the improvement of technical level and innovation level. From the point of view of characteristic importance, the forecast of daily necessity consumption industry is conducive to eliminating the market and enhancing the market radiation range. The use of big data algorithm not only solves the problem of asymmetric market information but also improves the allocation efficiency of resources. The importance of consumption is in the forefront, which is conducive to enhancing market competitiveness and deepening social division of labor. In addition, the demand for supporting services is increasing, which promotes the transfer of labor force to a certain extent. Finally, for enterprises, the use of big data algorithm can improve the sales of enterprises, reduce the transaction costs of enterprises, and promote the increase of R&D investment.
7. Conclusion
Based on the public data of Rossmann, a well-known German chain store, and taking the forecast of daily necessity consumption as the starting point, this paper puts forward a daily necessity consumption prediction model based on random forest and GBRT. The results show that the models built with the random forest and GBRT algorithms separately reach only ordinary accuracy after a certain number of iterations. When the residual of GBRT is initialized by an underfitting random forest, the combination effectively speeds up the training of the model, and after 800 rounds of iteration its accuracy clearly surpasses that of the pure GBRT model. GBRT initialized by random forest thus has a faster training speed and a better optimization error. Moreover, according to the importance of the features in the model, the model is conducive to the rationalization and upgrading of the industrial structure. The application of big data algorithms not only solves the problem of market information asymmetry but also improves the allocation efficiency of resources and market competitiveness, which further promotes the rationalization level of the consumption industrial structure.
Data Availability
The dataset can be accessed upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest.