Scientific Programming and Artificial Intelligence for Sensor Data Stream Analysis
Analysis of Economic Development Trend in Postepidemic Era Based on Improved Clustering Algorithm
In order to explore the economic development trend in the postepidemic era, this paper improves the traditional clustering algorithm and constructs a postepidemic economic development trend analysis model based on intelligent algorithms. To solve the clustering problem of large-scale nonuniform-density data sets, this paper proposes an adaptive nonuniform-density clustering algorithm based on balanced iterative reducing and uses it to further cluster the compressed data sets. For large-scale data sets, the clustering results accurately reflect the class characteristics of the data set as a whole, and the algorithm greatly improves the time efficiency of clustering. The research results show that the improved clustering algorithm is effective for analyzing economic development trends in the postepidemic era and can continue to play a role in subsequent economic analysis.
After more than 40 years of rapid development under reform and opening up, China has entered the new economic normal, and the transformation of the economic growth mode and supply-side reforms also require slowing down and shifting gears. Moreover, due to the global economic downturn and trade frictions, the economic growth rate in recent years has shown a downward trend. In addition, the COVID-19 epidemic, as a global “black swan” event, has had a great impact on China’s economy and the development of the world economy.
The sudden outbreak of the COVID-19 epidemic in 2020 inevitably brought countries around the world to a standstill. Domestic epidemic prevention and control has achieved significant strategic results, but the overall economy has not yet returned to the normal level of previous years. The spread of the epidemic abroad is still continuing, but countries have begun to relax lockdown restrictions and gradually restart their economies. This has played a positive role in stabilizing global economic confidence, but at the same time it has also increased the difficulty of epidemic prevention, aggravated the uncertainty of the world’s economic development prospects, and brought severe challenges to the development of China’s open economy.
The impact of the COVID-19 epidemic on China’s economy and the economic recovery after the epidemic are the core topics of current scholarly attention. Academia generally believes that, during the SARS period, China was in a stage of rapid urbanization and globalization and there was relatively ample room for fiscal and monetary tools, so the effect of SARS on the economy was relatively small.
The COVID-19 epidemic has had a huge impact on the Chinese economy and the world economy. This shock overturned the cognition of traditional economics, exposed some shortcomings in the economic structure, provided us with a window to reunderstand the Chinese economy and the world economy, and provided new momentum and ideas for China’s economic transformation. In this regard, we need to theoretically summarize and explore the causes and mechanisms of economic recession under the “abnormal” conditions of the epidemic and the relationship between the “normal economy” and the “abnormal economy.” At the same time, we need to deeply explore the deep-seated structural problems and contingency mechanisms of the Chinese economy and the world economy. In addition, it should be recognized that this shock is sudden and temporary in nature, and the economic loss it causes is a “sunk” loss, which does not affect the foundation of China’s economic development and cannot change the basic pattern of China’s economic improvement. The length of the impact depends on the transmission cycle of the epidemic. With the end of the epidemic and the return of economic momentum, China’s economy will naturally reenter a normal track of development. Moreover, while the epidemic has a negative impact on the economy, it also creates some opportunities or positive effects. In this regard, we need to fully understand the negative effects of the epidemic on China’s economy while fully exploring opportunities to turn crisis into opportunity and passivity into initiative, so as to win new development momentum, promote the transformation of China’s economy into a “disaster-adapted economy,” and achieve sustainable and high-quality development of China’s economy.
The impact of SARS was far smaller than that of COVID-19; the latter poses greater challenges and raises more numerous and more profound issues worth researching. At present, existing research focuses on empirical studies and rarely interprets the impact of the epidemic and the mechanism of economic recession from economic theory. As for countermeasure studies, most address symptoms case by case and do not grasp the problem of economic transformation under an epidemic disaster at the deeper level of economic operation.
At present, the impact of the epidemic on the global economy is still developing and evolving; in particular, the wave of unemployment caused by the epidemic is becoming a major problem facing all countries. Moreover, with the emergence of the global unemployment problem, residents’ consumption willingness and preferences will change. These changes will further trigger a decline in consumer demand, cause corporate profits to fall, create difficulties in production and operation, suspend recruitment plans or even trigger layoffs, and further aggravate the complexity and severity of the employment situation.
This paper combines the improved clustering algorithm to analyze the economic development in the postepidemic era and proposes the direction of subsequent economic development.
2. Related Work
Literature  puts forward the argument that regional income levels can eventually converge with economic growth under the assumption that factors of production are perfectly mobile. Literature  believes that, under the assumption of free flow of production factors and an open economy, as the regional economy grows, the gap between countries or between different regions within a country will shrink, and regional economic growth will converge in regional space.
The circular causality theory of literature  holds that the role of the market tends to widen regional differences rather than diminish them. Once a difference appears, the developed regions obtain competitive advantages, thereby constraining the undeveloped regions, which is not conducive to their economic development. Literature  believes that the interregional imbalance of growth is inevitable: the development of the core area will drive the development of the peripheral area to some extent through the trickle-down effect. However, at the same time, the inflow of labor and capital from the periphery to the core area will strengthen the development of the core area, which in turn widens the regional gap, and this polarization effect plays the dominant role.
Literature  believes that the regional economic difference is primarily determined by aspects such as each region’s capital investment rate, employment growth, human capital investment, foreign investment, and location. Literature  believes that location factors, macroeconomic policies, economic structure, the urban-rural economic development gap, population quality, and the level of market economy development are all factors that affect China’s regional economic gap. Literature  believes that, at the beginning of the reform, the per capita income, urbanization, and industrialization levels in the eastern, central, and western regions were different, which led to distinct growth rates during the development process and ultimately made the income growth rate of residents in the eastern region faster than that in the central and western regions. In addition, numerous scholars have explained the phenomenon of regional gaps from other aspects. Literature  believes that regional economic differences are largely caused by location differences. Literature  believes that institutional factors are also a cause of the gap in regional economic development in China: after the all-round reform and opening up, the country implemented an east-leaning policy, which led to rapid economic development in the eastern region. Therefore, it holds that such institutional factors are the main reason for the economic development difference between the eastern and western regions.
Based on the least square method, literature  made a mid- and long-term forecast of Beijing’s economic development prospects. Based on the semiparametric regression theory, literature  carried out a predictive analysis on the relevant economic indicators of Henan Province and compared it with the prediction error value of the linear autoregressive model. The comparison result shows that the semiparametric autoregressive short-term prediction effect is better.
Literature  derived and constructed a homologous gray prediction model with one variable and a first-order equation (written as HGEM (1, 1)) based on the gray system theory to predict the total energy consumption of China’s manufacturing industry. The experimental results prove that the gray system model does not require a large number of data samples, and the prediction error is small, which can effectively reflect the true status of the gray system. However, although it can reveal the regularity of sample information, it cannot fully reflect the interference of various unconventional social factors on the predicted objects.
Literature  developed and applied an artificial neural network (ANN) model with the backpropagation (BP) learning algorithm and a traditional extreme learning machine (ELM) to predict the GDP growth rate. Literature  used artificial neural networks to make ecological predictions of the atmospheric state of industrial cities and assessed the adequacy of the prediction model based on the correlation coefficient between the predicted data and the reference data. Literature  studied the future economic growth trend based on the BP neural network model. Literature  improved an economic forecasting algorithm based on the BP neural network to improve forecasting accuracy.
Literature  established a combined forecasting model based on the ARIMA model and neural network algorithm to study GDP series. The results show that the accuracy of the ARIMA-NN combined model is greater than that of any single model. Literature  studied the weight selection method of the combination model and established a combination prediction model based on the reciprocal residual method, a combination prediction model based on the reciprocal variance method, and a combination prediction model based on the least square method. Moreover, it tested the prediction accuracy of these three combined models based on historical data of national GDP per capita.
3. Textual Representation of Economic Data
In text mining, the text is usually represented by the vector space method; that is, a certain number of representative feature words are regarded as one dimension in the vector space. To study text clustering, we must first establish a mathematical model of the text, so that appropriate methods can be used to quantitatively calculate the similarity between texts. The text representation model is the feature representation of the document. Feature representation refers to the use of some important feature items (words) to represent documents. In the mining algorithm processing, only these feature items need to be processed. This is a process step of converting unstructured data to structured data.
At present, the representative text representation models include the Boolean model and vector space model. These models proceed from different perspectives and use different methods to deal with issues such as feature weighting, category learning, and similarity calculations. The Boolean model can only be used to calculate the relevance of user queries and documents in information retrieval but cannot calculate the deeper similarity between two documents.
The traditional document representation model is the vector space model (VSM). The VSM uses the terms in a document to form the document’s feature vector, with term frequency-inverse document frequency (TF-IDF) as the weight. For example, the vector of a document is represented as \(V = (w_1, w_2, \ldots, w_n)\), where \(w_i\) in the vector V represents the weight of the i-th feature. The weight can be calculated by the TF-IDF algorithm. The main idea of TF-IDF is as follows: if the frequency TF of a term or phrase in a document is high and the term or phrase rarely appears in other documents (so its IDF is high), then the term or phrase can distinguish the document from other documents.
Term frequency (TF) refers to the frequency of occurrence of term i in a document. The calculation formula of \(\mathrm{tf}_i\) is as follows:

\[ \mathrm{tf}_i = \frac{n_i}{\sum_k n_k} \]

In the formula, \(n_i\) represents the term frequency (raw count) of the i-th feature in the feature vector, and the denominator \(\sum_k n_k\) represents the total term frequency of all features in the feature vector.
Inverse document frequency (IDF) refers to a measure of the universal importance of a term, based on the number of documents containing term i. The calculation formula of \(\mathrm{idf}_i\) is as follows:

\[ \mathrm{idf}_i = \log\frac{N}{\mathrm{df}_i} \]

In the formula, \(\mathrm{df}_i\) represents the number of documents in which the i-th feature item appears in the whole document set, and N represents the number of all documents in the document set. The calculation formula of the TF-IDF weight \(w_i\) is as follows:

\[ w_i = \mathrm{tf}_i \times \mathrm{idf}_i \]
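The TF-IDF weighting above can be sketched in a few lines of plain Python (a minimal illustration under the formulas just given, not the paper’s implementation; the tokenized toy documents are hypothetical):

```python
import math

def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents."""
    n_docs = len(docs)
    # df[t]: number of documents containing term t
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        total = len(doc)  # total term count in this document
        w = {}
        for term in set(doc):
            tf = doc.count(term) / total       # term frequency
            idf = math.log(n_docs / df[term])  # inverse document frequency
            w[term] = tf * idf
        weights.append(w)
    return weights

docs = [["economy", "growth", "epidemic"],
        ["epidemic", "policy", "policy"],
        ["growth", "policy", "economy"]]
w = tfidf(docs)
```

Each term here appears in two of the three documents, so every IDF equals log(3/2); a term’s weight then differs only through its within-document frequency.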
The degree of correlation between two documents is often measured by the similarity between them. When texts are expressed in the vector space model as \(V_1\) and \(V_2\), the similarity between the texts can be expressed by calculating the distance between the vectors. At present, the main measures of similarity between \(V_1\) and \(V_2\) include cosine similarity, Euclidean distance, and the Pearson correlation coefficient, as shown in the following:

\[ \cos(V_1, V_2) = \frac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert}, \qquad d(V_1, V_2) = \sqrt{\sum_i (w_{1i} - w_{2i})^2}, \]

\[ r(V_1, V_2) = \frac{\sum_i (w_{1i} - \bar{w}_1)(w_{2i} - \bar{w}_2)}{\sqrt{\sum_i (w_{1i} - \bar{w}_1)^2} \, \sqrt{\sum_i (w_{2i} - \bar{w}_2)^2}} \]
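The first two similarity measures can be written directly from their definitions (a short sketch for illustration, with hypothetical toy vectors):

```python
import math

def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)

def euclidean_distance(v1, v2):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

v1, v2 = [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]
```

Note the two measures point in opposite directions: cosine similarity is larger for more similar documents, while Euclidean distance is smaller.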
The BIRCH algorithm is a multistage clustering algorithm based on the clustering feature (CF) and the clustering feature tree (CF tree). It can complete high-quality clustering of large-scale data sets using limited memory and I/O resources.
In the vector space, given the N data points \(\{x_i\}, i = 1, 2, \ldots, N\) of a class, the centroid \(x_0\), radius R, and diameter D of this class are defined as follows:

\[ x_0 = \frac{\sum_{i=1}^{N} x_i}{N}, \qquad R = \sqrt{\frac{\sum_{i=1}^{N} (x_i - x_0)^2}{N}}, \qquad D = \sqrt{\frac{\sum_{i=1}^{N}\sum_{j=1}^{N} (x_i - x_j)^2}{N(N-1)}} \]
In the formula, R is the average distance between all data points in a class and the center of mass, and D is the average distance between any two points in the class. These two items can reflect the tightness within a class and are usually used in the BIRCH algorithm to determine whether the size of the class meets the limit of the threshold radius T.
If the centroid points \(x_{01}\) and \(x_{02}\) of two classes are known, the formulas for the Euclidean distance \(D_0\) and Manhattan distance \(D_1\) of the two centroid points are as follows:

\[ D_0 = \sqrt{(x_{01} - x_{02})^2}, \qquad D_1 = |x_{01} - x_{02}| = \sum_{k=1}^{d} \left| x_{01}^{(k)} - x_{02}^{(k)} \right| \]
We are given two classes \(C_1\) and \(C_2\), where class \(C_1\) contains the data points \(\{x_i\}, i = 1, \ldots, N_1\), and class \(C_2\) contains the data points \(\{x_j\}, j = N_1 + 1, \ldots, N_1 + N_2\). Then, the average intercluster distance \(D_2\) and the average intracluster distance \(D_3\) of the two classes are calculated as follows:

\[ D_2 = \sqrt{\frac{\sum_{i=1}^{N_1}\sum_{j=N_1+1}^{N_1+N_2} (x_i - x_j)^2}{N_1 N_2}}, \qquad D_3 = \sqrt{\frac{\sum_{i=1}^{N_1+N_2}\sum_{j=1}^{N_1+N_2} (x_i - x_j)^2}{(N_1 + N_2)(N_1 + N_2 - 1)}} \]

In the above formulas, distances such as \(D_0\), \(D_1\), \(D_2\), and \(D_3\) can represent the relationship between the two classes. In the CF tree reconstruction process, these distances can be used to calculate the distance between entries when a CF tree node is split, so that the two entries farthest from each other can be used as the two root entries of the split node and the other entries can be reassigned accordingly.
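The class statistics defined above translate directly into code. The following is a small sketch of the centroid, radius R, and diameter D computations (illustrative only; the two-point toy class is hypothetical):

```python
import math

def centroid(points):
    """Centroid x0: coordinate-wise mean of the points of a class."""
    n, dim = len(points), len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]

def radius(points):
    """BIRCH radius R: root-mean-square distance to the centroid."""
    c = centroid(points)
    n = len(points)
    sq = sum(sum((p[k] - c[k]) ** 2 for k in range(len(c))) for p in points)
    return math.sqrt(sq / n)

def diameter(points):
    """BIRCH diameter D: root-mean-square pairwise distance in the class."""
    n = len(points)
    sq = sum(sum((p[k] - q[k]) ** 2 for k in range(len(p)))
             for p in points for q in points)
    return math.sqrt(sq / (n * (n - 1)))

pts = [[0.0, 0.0], [2.0, 0.0]]
```

For the two points above, the centroid is (1, 0), each point is at distance 1 from it (so R = 1), and the only pair is 2 apart (so D = 2).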
There are two core concepts of CF and CF tree in the BIRCH algorithm. The clustering feature CF represents a class in the form of a triple, as shown in Definition 1.
Definition 1 (clustering feature CF). When the N d-dimensional data points \(\{x_i\}, i = 1, \ldots, N\) of a class are given, the clustering feature CF vector is defined as a triple:

\[ \mathrm{CF} = (N, LS, SS) \]

Among them, N represents the number of data points in this class, and LS represents the linear sum of the N data points in the class; namely,

\[ LS = \sum_{i=1}^{N} x_i \]

SS represents the square sum of the N data points in the class; namely,

\[ SS = \sum_{i=1}^{N} x_i^2 \]
Theorem 2 (clustering feature additivity theorem). If the CF values of two classes are \(\mathrm{CF}_1 = (N_1, LS_1, SS_1)\) and \(\mathrm{CF}_2 = (N_2, LS_2, SS_2)\), then the CF of the class obtained by fusing these two classes satisfies the following formula:

\[ \mathrm{CF}_1 + \mathrm{CF}_2 = (N_1 + N_2,\; LS_1 + LS_2,\; SS_1 + SS_2) \]
The additivity theorem of clustering features shows that the CF vector of a fused class is easily obtained from the CF vectors of its constituent classes. Moreover, from a class’s CF vector it is easy to compute the corresponding quantities such as the centroid \(x_0\), radius R, and the distance formulas above. According to these distances, the quality of the clustering result of the BIRCH algorithm can be judged.
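The CF triple and its additivity can be verified in a few lines (a minimal sketch with hypothetical toy points, not the paper’s implementation):

```python
def make_cf(points):
    """Clustering feature CF = (N, LS, SS) of a set of d-dimensional points."""
    n = len(points)
    dim = len(points[0])
    ls = [sum(p[k] for p in points) for k in range(dim)]      # linear sum
    ss = sum(p[k] ** 2 for p in points for k in range(dim))   # square sum
    return (n, ls, ss)

def merge_cf(cf1, cf2):
    """Additivity theorem: the CF of a fused class is the component-wise sum."""
    n1, ls1, ss1 = cf1
    n2, ls2, ss2 = cf2
    return (n1 + n2, [a + b for a, b in zip(ls1, ls2)], ss1 + ss2)

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0]]
```

Merging the CF triples of `a` and `b` yields exactly the CF triple of their union, which is why BIRCH never has to revisit the raw points when subclusters are merged.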
As shown in Figure 1, the CF tree involved in the BIRCH algorithm is a height-balanced tree with two parameters: the branching factor B and the threshold T. Each nonleaf node contains at most B branch entries of the form \([\mathrm{CF}_i, \mathrm{child}_i]\), where \(\mathrm{child}_i\) is a pointer to its i-th child node and \(\mathrm{CF}_i\) is the clustering feature of that child node. A leaf node has at most L entries, and each entry has the form \([\mathrm{CF}_i]\), where \(\mathrm{CF}_i\) is the clustering feature of its i-th subcluster. In addition, in order to improve the scan speed, all leaf nodes are chained together through two pointers, prev and next. A CF value in a leaf node represents the clustering feature of one subcluster, and the radius of each such class must be less than the threshold T.
In the BIRCH algorithm, a CF tree is a compressed representation of the density statistics of a data set, and an entry in a leaf node represents the clustering feature CF value of a class C. The radius R of each class C must meet the limit of the threshold radius T. The denser a region of the original data set, the more data points are contained in the corresponding class C; conversely, the sparser the region, the fewer data points class C contains.
In the clustering process of the BIRCH algorithm, the CF tree is constructed dynamically as data points are added one by one. This process is similar to the insertion operation of a B-tree. The construction process of the CF tree is as follows:
(1) Starting from the root node, select the nearest child node from top to bottom.
(2) After reaching a leaf node, check whether the class C represented by the nearest entry can absorb the data point. If it can, update the CF value; if not, check whether a new entry can be added. If a new entry cannot be added either, the node is split: the two entries farthest from each other are used as seeds, and the other entries are reallocated according to their distances to the seeds.
(3) Update the CF value of each nonleaf node on the path. If a node was split, insert a new entry in its parent node, and then check whether the parent node needs to be split, repeating up to the root node.
When a data point cannot be inserted, the threshold T needs to be raised and the CF tree rebuilt so that it can absorb more data points, until all the data points are inserted. The size of the threshold T determines the size of each class, so the size of the CF tree can be controlled by adjusting T to fit the currently available memory. If T is too small, the number of classes will be very large, resulting in an increase in the number of tree nodes, which may exhaust memory before the data set scan is completed. Therefore, T needs to be adjusted appropriately during the insertion of data points so that the CF tree fits the current memory size.
The clustering process of the BIRCH algorithm is shown in Figure 2. The specific process is as follows:
Stage 1: use limited memory and hard disk space to scan all the data in the data set and initialize a CF tree in memory. Under the memory limitation, this CF tree reflects the clustering information of the data set as faithfully as possible: dense data are divided into classes, and sparse data points are eliminated as outliers.
Stage 2: this stage is optional. The global clustering algorithm in stage 3 has certain limits on its input data scale, and the result of stage 1 may not match the input scale required in stage 3. Therefore, in stage 2, the CF tree from stage 1 can be condensed into a smaller CF tree so that the global algorithm in stage 3 can run more efficiently.
Stage 3: a global or semiglobal clustering algorithm is applied to the entries of the leaf nodes in the CF tree to generate a better global clustering result.
Stage 4: this stage is optional. Data points are redistributed to the nearest seeds to ensure that duplicate data are grouped into the same class, and class labels are added at the same time.
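As a point of reference, scikit-learn ships a BIRCH implementation whose parameters map onto the quantities above: `threshold` plays the role of T, `branching_factor` the role of B, and `n_clusters` triggers the stage-3 global clustering of the CF-tree leaf entries. The following is an illustrative usage sketch on synthetic data, not the paper’s improved algorithm:

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs as a stand-in for a large data set.
X = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
               rng.normal(5.0, 0.3, size=(100, 2))])

# threshold ~ T, branching_factor ~ B;
# n_clusters=2 runs the global clustering on the CF-tree leaf entries.
model = Birch(threshold=0.5, branching_factor=50, n_clusters=2)
labels = model.fit_predict(X)
```

With blobs this far apart, the CF tree compresses each blob into a handful of subclusters and the global step merges them into two classes.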
In order to evaluate the clustering accuracy and effectiveness of the improved algorithm, this paper runs the proposed algorithm on two simulated data sets and one news data set. Moreover, this paper uses the running time T, precision P, recall R, and F-measure to evaluate the proposed algorithm: the higher the F-measure, the higher the clustering accuracy.
Here, a class identified by the clustering result is referred to as a result class, and a class in the original data set is referred to as an original class. F-measure combines the ideas of precision P and recall R from information retrieval to evaluate clustering. The precision and recall of a result class j with respect to an original class i are as follows:

\[ P(i, j) = \frac{N_{ij}}{N_j}, \qquad R(i, j) = \frac{N_{ij}}{N_i} \]

In the formula, \(N_{ij}\) is the number of objects of original class i that fall in result class j, \(N_j\) is the number of all objects in result class j, and \(N_i\) is the number of all objects in original class i. The F-measure of original class i is defined as follows:

\[ F(i) = \max_{j} \frac{2 \, P(i, j) \, R(i, j)}{P(i, j) + R(i, j)} \]
For an original class i, the higher the F-measure value, the better the effect of the clustering algorithm and the better the result reflects the mapping of original class i. In other words, F-measure can be used as the evaluation score of original class i. For the overall clustering result, the total F-measure of the algorithm is obtained as the weighted average of the F-measures of the original classes, as shown in the following formula:

\[ F = \sum_{i} \frac{N_i}{N} \, F(i) \]

In the formula, \(N_i\) is the number of all objects in original class i, and N is the total number of objects.
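The weighted F-measure described above can be computed directly from two label lists (a plain-Python sketch for illustration; the toy label assignments are hypothetical):

```python
def f_measure(original, result):
    """Weighted overall F-measure of a clustering `result` against
    ground-truth `original` class labels (same length, aligned by index)."""
    classes, clusters, n = set(original), set(result), len(original)
    total = 0.0
    for i in classes:
        n_i = sum(1 for o in original if o == i)
        best = 0.0
        for j in clusters:
            n_j = sum(1 for r in result if r == j)
            n_ij = sum(1 for o, r in zip(original, result) if o == i and r == j)
            if n_ij == 0:
                continue
            p = n_ij / n_j  # precision of result class j w.r.t. original class i
            r = n_ij / n_i  # recall
            best = max(best, 2 * p * r / (p + r))
        total += (n_i / n) * best
    return total

orig = [0, 0, 0, 1, 1, 1]
perfect = [1, 1, 1, 0, 0, 0]   # same partition, different label names
mixed = [0, 0, 1, 1, 0, 1]     # one point of each class misplaced
```

A perfect partition scores 1.0 regardless of how the result classes are named, because F-measure matches each original class to its best-fitting result class.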
4. Economic Analysis Model Based on Improved Clustering Algorithm
This paper combines the improved clustering algorithm to build an economic analysis model and analyzes the economic development trend in the postepidemic era through this model. The model applies clustering to textual economic data. The specific description and process of the algorithm (Figure 3) are as follows:
(1) We segment the background technology and technical fields of the patent text, filter out stop words, and extract problem keywords that describe the background of the problem, the existing problems, the conditions under which the problems arise, and the existing solutions to the problems.
(2) We establish the index term-document matrix D of the documents.
(3) We use latent semantic analysis (LSA) to reduce the dimensionality of the matrix.
(4) We use the DBSCAN algorithm to cluster the dimension-reduced document matrix \(V_k^T\).
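Steps (2)-(4) can be sketched with standard scikit-learn components (an illustrative pipeline under assumed parameters, not the paper’s implementation; the toy English corpus is hypothetical, and step (1), word segmentation, would be language-specific and is folded into the vectorizer’s tokenization here):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import DBSCAN

docs = [
    "economic growth slows under epidemic pressure",
    "epidemic pressure slows economic growth sharply",
    "monetary policy supports small enterprises",
    "fiscal and monetary policy supports enterprises",
]

# (2) term-document matrix with TF-IDF weights (stop words filtered)
tfidf = TfidfVectorizer(stop_words="english")
D = tfidf.fit_transform(docs)

# (3) LSA: reduce dimensionality with a truncated SVD
lsa = TruncatedSVD(n_components=2, random_state=0)
V = lsa.fit_transform(D)

# (4) density-based clustering of the reduced document vectors
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(V)
```

The two “growth/epidemic” documents and the two “policy/enterprise” documents share almost no vocabulary across topics, so their LSA projections fall into two well-separated groups that DBSCAN recovers.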
The centroid points of all classes are summarized to form a new compressed data set, and the CF value of each class is stored as the attribute value of the centroid point of the class. This compressed data set will be used as the input data set for the next stage of the AV-DBSCAN algorithm. In this newly generated compressed data set, each data point represents a class of the clustering result. The process of generating compressed data sets is shown in Figure 4.
The category update process of the original data set using the clustering results of the AV-DBSCAN algorithm is shown in Figures 4–6. When AV-DBSCAN clusters the compressed data set, each data point in the clustering result of the compressed data set corresponds to a subcategory in the original data set. According to the category of the data point in the compressed data set, the corresponding subcategory in the original data set is found, and all the data points in the subcategory are updated to this category. Finally, the clustering result of the original data set is obtained.
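The final label-propagation step is mechanically simple once each original point records which compressed centroid absorbed it. A minimal sketch (the `membership` bookkeeping structure is an assumption for illustration, not the paper’s data layout):

```python
def update_original_labels(membership, compressed_labels):
    """Propagate cluster labels from the compressed data set back to
    the original points.

    membership[p]: index of the compressed point (subcluster centroid)
        that original point p was absorbed into during the BIRCH stage.
    compressed_labels[c]: label that the density-based stage (here,
        AV-DBSCAN in the paper) assigned to compressed point c.
    """
    return [compressed_labels[c] for c in membership]

# Five original points were compressed into three centroids (0, 1, 2);
# the density-based stage merged centroids 0 and 1 into cluster 0 and
# put centroid 2 into cluster 1.
membership = [0, 0, 1, 2, 2]
compressed_labels = [0, 0, 1]
labels = update_original_labels(membership, compressed_labels)
```

Every original point simply inherits the cluster label of its subcluster, so the update is a single linear pass over the original data set.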
On the basis of the above analysis, the constructed economic development trend analysis model based on the improved clustering algorithm is shown in Figure 6.
On this basis, the performance of the model can be verified and analyzed.
5. Analysis of Economic Development Trends in the Postepidemic Era Based on Improved Clustering Algorithms
At the end of 2019, cases of 2019-nCoV infection were detected in Wuhan, Hubei Province, China. The virus spreads quickly, is difficult to prevent and control, and is transmitted from person to person. Afterwards, confirmed cases of pneumonia caused by the 2019-nCoV virus gradually increased, and the COVID-19 epidemic broke out in Wuhan and spread rapidly across the country. On January 30, 2020, the World Health Organization announced that the COVID-19 epidemic constituted a “public health emergency of international concern.” In the face of the epidemic, the Chinese government attached great importance to it, acted quickly, achieved decisive results in about three months, and brought the epidemic under effective control. At the same time, the COVID-19 epidemic continued to spread in other countries, and the death toll increased rapidly. The World Health Organization pointed out in its declaration on a healthy recovery from the COVID-19 epidemic that the 2019-nCoV epidemic is the biggest shock the world has faced in decades, with hundreds of thousands of lives lost. Moreover, it pointed out that the world economy is likely to face the most serious economic recession since the 1930s, and the resulting unemployment and reduced income will be detrimental to livelihoods, health, and sustainable development.
In this context, economic development has suffered a major shock, so this paper uses the constructed improved clustering algorithm to analyze economic development trends in the postepidemic era and verifies the performance of the model.
The model constructed in this paper can effectively mine relevant data in the process of economic development. Therefore, in the process of model verification, this paper first analyzes the effect of factor mining, mines 78 sets of data, analyzes the effectiveness of these factors, and scores the factors. The results are shown in Table 1 and Figure 7.
From the above analysis, it can be seen that the clustering model constructed in this paper has a good performance in economic factor mining. On this basis, this paper analyzes the economic development trend analysis strategy and evaluates the output results. The results of the evaluation statistics are shown in Table 2 and Figure 8.
The summary of the output results is as follows.
At present, the Party Central Committee and the State Council attach great importance to achieving the annual economic and social development goals and tasks and have issued a series of supporting policies to coordinate COVID-19 epidemic prevention and control with economic and social development. All departments and governments at all levels must fully implement the central government’s decisions and deployments, coordinate and promote related work, and ensure that policies are implemented precisely and effectively.
We need to focus on fiscal policy, supplemented by monetary and credit policies, to effectively support economic recovery. At the same time, we need to strengthen the countercyclical adjustment of fiscal policy, raise the central fiscal deficit ratio, and increase central transfer payments to local governments. We should increase targeted support for weak areas and groups such as medical and health care; small, medium, and microenterprises; private enterprises; service enterprises; and residents who are unemployed or at risk of unemployment. Through tax reductions and exemptions, direct subsidies, and similar measures, the goal of protecting people’s livelihood can be achieved. Support should also be increased for short-term livelihood construction in areas such as medical and health care, education, public transportation, and pollution prevention to hedge against the economic decline and employment problems caused by the epidemic. The prudent monetary policy should be moderately relaxed at the margin, the monetary policy transmission mechanism unblocked, and the assessment of private-enterprise lending by financial institutions strengthened. Financial institutions should be encouraged to reduce interest rates, improve loan renewal policies, and increase credit loans and medium- and long-term loans, so as to ensure that loans to private enterprises maintain a certain growth, effectively alleviate the tight capital chains of private enterprises, and guard against a large-scale outbreak of private-enterprise credit risk.
We must firmly implement the spirit of the Fourth Plenary Session of the 19th Central Committee of the Communist Party of China, fully implement the new development concept, focus on supply-side structural reform, accelerate the construction of a modern economic system, and effectively solve the deep-seated structural and institutional problems facing the Chinese economy through comprehensively deepening reform, thereby injecting new impetus into economic growth.
In the postepidemic era, the negative impact of the COVID-19 epidemic on the economy has fully emerged, and there is also the possibility of secondary risks transmitted through high-risk companies and households. Therefore, we must combine epidemic prevention and control with economic support, give full play to countercyclical policies, and effectively reduce the negative economic impact of the epidemic.
This paper combines improved clustering algorithms to analyze economic development trends in the postepidemic era, constructs intelligent models through intelligent algorithms, evaluates economic development effects based on actual conditions, analyzes current problems through factor mining, and proposes corresponding strategies.
This paper designs experiments to test the performance of the algorithm model constructed in this paper. From the research results, we can see that the improved clustering algorithm proposed in this paper is effective in the analysis of economic development trends in the postepidemic era and can continue to play a role in subsequent economic analysis.
Data Availability
The labeled data set used to support the findings of this study is available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This study was sponsored by the Support Plan for Scientific and Technology Innovation Talents in Universities of Henan Province (Humanities and Social Sciences) (2019-cx-019); Major Research Projects of Philosophy and Social Sciences in Universities of Henan Province (2022-YYZD-05); Henan Philosophy and Social Science Planning Project (2021BJJ032); and National Social Science Foundation of China (20BGJ016).
J. Thomä and H. Chenet, “Transition risks and market failure: a theoretical discourse on why financial models and economic agents may misprice risk related to the transition to a low-carbon economy,” Journal of Sustainable Finance and Investment, vol. 7, no. 1, pp. 82–98, 2017.