The degree of matching between the supply of and demand for financial support policies is a key factor in policy effectiveness. In this paper, we use a policy text computing method that integrates topic mining, text classification, and training set prediction to study the supply and demand matching of China’s financial support policies for private enterprises. We find that supply and demand match for policies on diversified financing channels. However, there is a mismatch for financial service facilitation policies and local subsidy policies. Our research implies that China’s development of a multilayer financial market has promoted the diversification of financing channels, which has improved the financing conditions for private enterprises. However, the financial service network is still not convenient enough for private enterprises.

1. Introduction

Private enterprises play a pivotal role in China’s economic development. The private sector contributes more than 50 percent of the Chinese economy’s tax revenue, 60 percent of GDP, 70 percent of technological innovation, 80 percent of urban employment, and 90 percent of new firms (https://www.xinhuanet.com/comments/2019-03/09/c_1124214167.htm). However, most private enterprises in China are small and medium enterprises (SMEs) with a poor ability to withstand external shocks. Many have opaque financial data and insufficient credit records. Because of information asymmetry and risk aversion, banks are unwilling to lend to them, so private enterprises face financing difficulties and higher financing costs. To address these difficulties, the Chinese government has actively introduced policies that encourage financial institutions to increase credit support for private enterprises. The resulting series of policy texts covers fiscal policy, taxation, finance, social security, and other fields and provides a wealth of textual information about financial support policies for private enterprises.

Policy texts are the documents generated by policy activities, including official documents (such as laws and administrative rules or regulations), official files (such as research, consultation, hearings, or resolutions), and public opinion texts. Text mining is a general term for the technologies that obtain implicit knowledge from texts. Research on and application of text mining emerged in the 1990s. Its objects gradually expanded from unstructured texts to semistructured web pages and then to special types of unstructured texts such as patent documents and scientific reports. As text mining has been applied in many fields and has shown great value, the demand for quantitative analysis of policy texts has gradually increased. Some scholars have begun to apply text mining to the content analysis of policy texts [1].

From the perspective of research methods, previous policy text studies placed more emphasis on normative analysis than on empirical research and on qualitative analysis than on quantitative research, which left obvious deficiencies. With the development of data processing technologies such as natural language processing, data mining, and knowledge visualization, large-sample and fine-grained policy text analysis has become possible [2]. The concepts of computational social science that have emerged in recent years have been introduced into the field of policy research, and new research methods such as text computation, semantic analysis, and sentiment analysis of policies have been developed.

From the perspective of research questions, policy text research mostly focuses on topics such as policy definition and comparison, policy system construction, and policy evolution and development. There are few studies, especially empirical ones, on policy implementation effects and the supply-demand matching of policies. Policy demanders, as the targets of financial policy, play a key role in the actual effect of financial policies. It is necessary to include the demand side in policy text research, explore the actual demands of the policy demanders, and evaluate the degree to which policy supply matches policy demand.

“Private enterprise” is a concept put forward in China’s unique institutional context. Research outside China does not use this term; the relevant literature concerns SMEs instead. Chinese scholars have done much research on financial policy support for private enterprises but less on the effectiveness of financial policies for private enterprises. On the one hand, our study enriches the analytical framework of policy text research and expands the analytical perspective of policy matching research. On the other hand, it provides references for optimizing financial support policies for private enterprises and improving the accuracy of financial policies. In this paper, we study the supply of and demand for policies through the LDA model, identify the imbalance between supply and demand, and then propose a direction for improvement, which provides a more realistic and targeted reference for the future optimization of financial support policies for private enterprises.

2. Theoretical Background

2.1. Financial Policy Support for Private Enterprises

Information asymmetry makes it difficult for SMEs to obtain financing through traditional financial channels. Irwin argues that, because of information asymmetry in the lending market, banks require borrowers to have standard financial information records and sufficient collateral. However, SMEs often lack standard financial information records or suitable collateral, which makes banks reluctant to lend to them [3]. Levenson and Besley believe that it is difficult for SMEs to obtain financing through traditional financial channels and that SMEs mainly obtain financing through nontraditional financial channels such as small loans, Internet finance, and private equity funds [4].

The government’s financial policy can play an active role in promoting the financing of SMEs. By studying the US government’s experience in financial support for SMEs, Bradford and Chao Chen find that government credit guarantees can play an important role in improving the financing conditions of SMEs [5]. Klasen and Andreas think that a tough market environment is a severe challenge for SMEs and that the government can help SMEs improve their R&D and innovation capacity by introducing financial support policies [6]. Based on the World Bank’s survey data on enterprises in 119 developing countries, Wang finds that financing is the biggest obstacle to the development of SMEs and that government guarantees can increase banks’ willingness to lend to SMEs [7].

Chinese scholars believe that the government can effectively solve the financing problems of Chinese private enterprises by giving full play to the guiding role of policies. Luo [8] believes that the Chinese government can learn from the experience of western countries and solve SMEs’ financing problems by improving relevant laws and regulations, establishing a credit system, and constructing a multilevel financial market. Lv [9] suggests that China should comprehensively reform the financing system for SMEs to facilitate their credit supply. Hu [10] states that private enterprises face severe difficulties in accessing financial services and that the government should solve SMEs’ financing problems by giving full play to the guiding role of policies. Yao and Dong [11] conclude through empirical research that small financial institutions can provide better financial services to SMEs than large financial institutions and that the government should support the development of small financial institutions. Dai [12] studies the financing status of Chinese SMEs during the epidemic and proposes that financial institutions fully consider the characteristics of SMEs when setting credit indicators and implement classified policies. By appropriately extending or renewing loans, lowering loan interest rates, and increasing medium- and long-term loans, they can help SMEs tide over difficulties.

2.2. Quantitative Text Analysis

After years of development, quantitative text analysis has gradually formed several feasible research methods and basic paradigms, which can be summarized as the following four methods.

(1) Content analysis based on mathematical statistics: This approach starts from a theoretical tool, proposes a basic text analysis framework, formulates quantitative standards, and uses multiperson coding or expert scoring to measure the basic units and connotations of the texts along different dimensions. A typical example is the study by the Chinese scholar Sheng and others of the stakeholders in innovation policy, which introduces the policy classification tools of Rothwell and other scholars, formulates classification dimensions and content coding standards, and integrates sampling, multiple groups, and multiple rounds of coding. If the confidence level of the results meets the basic requirements, a research conclusion is drawn [13].

(2) Literature metrology: This approach applies traditional bibliometric methods to policy texts and innovates on that basis. Jiang et al. put forward the concept of “Policiometrics,” or policy metrology, which is regarded as a quantitative analysis method for studying the external and content structural attributes of policy texts [14].

(3) Social network analysis: Social network analysis of policy texts often merges with bibliometric methods. The difference is that social network analysis pays more attention to the integration of graph theory and communication knowledge and focuses on exhibiting, from a macro perspective, the relationship networks, linguistic relevance, and action relationships implied in policy texts. This method was first proposed by Girvan and Newman [15], and Wallace et al. effectively applied community discovery in complex networks to document-based research topic identification [16]. Zhang et al. [17] used social network analysis to depict the relationship map of the issuers of policy publications in related fields.

(4) Text mining: The previous quantitative analysis methods require a large amount of manpower, rely on oversimple information extraction, and easily overlook details. Some scholars have therefore begun to try text mining methods. For example, Conway et al. explore intermedia agenda setting in the 2012 US presidential election and compare the correlation between newspaper and Twitter texts [18]. Prior et al. conducted a comparative study of health policy documents in the UK; based on feature recognition of the narrative structure of the policy texts, they combined text mining strategies with semantic web analysis and revealed the basic elements of the policy text content [19]. Zhang et al. proposed a policy text calculation framework that integrates keyword extraction, topic analysis, and cooccurrence analysis [20]. Some scholars have tried to study policy sentiment, policy stance, and policy orientation based on the knowledge discovery of deep latent semantics. For example, Guo and Vargo examined the tweets related to the US presidential candidates in 2012 and used methods such as semantic network analysis to explore agenda setting in the network [21].

2.3. Research on the Applicability of LDA Topic Modeling

As an unsupervised machine learning technique, Latent Dirichlet Allocation (LDA) has become the most widely used topic model. Most topic models, such as CTM, Labeled-LDA, and PAM, are based on LDA [22]. The applications of LDA cover text sentiment analysis, Weibo topic mining, topic tracking, spam comment blocking, knowledge mining, computer vision, and biomedicine. It has gradually extended from academia to industry and has developed into a mature topic analysis method [23]. For special texts such as policies and regulations, the high dimensionality of the terms makes the results of traditional text mining techniques, such as classification and clustering based on similarity measurement, hard to interpret. At the same time, because these texts cover multiple topics, topic induction based on word frequency and content structure rules also loses practical value. In contrast, LDA topic modeling provides a new type of semantic dimensionality reduction and a new method of exploring topic structure, and it has become a key technology for solving these two core problems. LDA topic modeling has a clear hierarchical structure. First, it maps the high-dimensional “document-term” distribution to the low-dimensional “document-topic” and “topic-term” distributions from a semantic perspective. In this way, semantic-based “middle-level features” (i.e., topics) replace “low-level features” (i.e., terms) to achieve more meaningful text dimensionality reduction. Second, it quantifies the structure and distribution of text topics and unearths potential semantic relationships that are difficult to generalize from a qualitative perspective, so that the multitopic analysis of special texts such as policies and regulations can be carried out quantitatively. For these reasons, LDA topic modeling is suitable for this study of the financial support policies for private enterprises [24].

3. Research Methods

3.1. The Overall Research Framework

Based on the policy content analytical method proposed by scholars such as Zhang and Ma [25], we construct an analysis framework that integrates topic mining, text classification, and cooccurrence analysis and conduct an evaluation of policy supply and demand matching mainly through the two paths of topic distribution analysis and predictive topic analysis. The specific process is as shown in Figure 1.

First, we perform LDA modeling and topic extraction on the preprocessed policy texts, determine the topics of the policy texts, and classify each policy text by its LDA topic. Then, we visualize and analyze the topic intensity of the classified policy texts. Finally, we use the policy texts classified by LDA as the training set to predict the topics of the news texts, which represent the demand side, and thereby complete the assessment of policy supply and demand matching.

3.2. Text Mining Methods
3.2.1. LDA Topic Mining

Topic mining is a common method of text mining, which aims to discover abstract topics based on a series of documents. Taking into account the multitopic characteristics of policy texts and the high-dimensional characteristics of terms, we use the LDA model to analyze the topics of policy texts in this study.

LDA is a document generation model that is commonly used for topic modeling in text analysis. The model scales simply and well and has a solid and rich theoretical basis in mathematical statistics. It is a three-layer Bayesian model into which two hyperparameters, α and β, are introduced. These hyperparameters play the same role as the parameters α and β in the Beta distribution of a variable x [26]. In probability theory, the probability density function of the Beta distribution is defined as

\[ f(x; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad 0 \le x \le 1,\ \alpha > 0,\ \beta > 0. \]

Extending it to the multidimensional case is the Dirichlet distribution introduced by the LDA model.
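
For reference, the standard density of the Dirichlet distribution can be written as follows. The symbols $K$ (number of categories), $\theta = (\theta_1, \dots, \theta_K)$, and $\alpha = (\alpha_1, \dots, \alpha_K)$ are notation assumed here for illustration; they are not defined in the text above.

```latex
p(\theta \mid \alpha)
  = \frac{\Gamma\!\left(\sum_{i=1}^{K} \alpha_i\right)}{\prod_{i=1}^{K} \Gamma(\alpha_i)}
    \prod_{i=1}^{K} \theta_i^{\alpha_i - 1},
\qquad \theta_i \ge 0,\quad \sum_{i=1}^{K} \theta_i = 1 .
```

With $K = 2$, $\theta_1 = x$, and $(\alpha_1, \alpha_2) = (\alpha, \beta)$, this reduces to the Beta density.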

For each document d, the generation process of the LDA model can be expressed as follows:

(1) Draw the topic distribution of the document from the Dirichlet distribution with prior parameter α: $\theta_d \sim \mathrm{Dirichlet}(\alpha)$.

(2) For each word position n in document d, draw a random topic from the multinomial distribution over topics: $z_{d,n} \sim \mathrm{Multinomial}(\theta_d)$.

(3) For each topic k, draw the word distribution of the topic from the Dirichlet distribution with prior parameter β: $\varphi_k \sim \mathrm{Dirichlet}(\beta)$.

(4) Based on the obtained topic $z_{d,n}$, draw the corresponding word from the multinomial distribution over the words of that topic: $w_{d,n} \sim \mathrm{Multinomial}(\varphi_{z_{d,n}})$.
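
The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in this study; the function names, the sizes (4 topics, 10 vocabulary entries), and the prior values are assumptions chosen for the example. A Dirichlet sample is obtained by normalizing independent Gamma draws.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, alpha, topic_word, rng):
    """Generate one document as a list of (topic index, word index) pairs."""
    theta = sample_dirichlet(alpha, rng)            # step 1: document-topic distribution
    doc = []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)          # step 2: topic of this word
        w = sample_categorical(topic_word[z], rng)  # step 4: word drawn from that topic
        doc.append((z, w))
    return theta, doc


rng = random.Random(0)
n_topics, vocab_size = 4, 10      # illustrative sizes (assumed)
alpha = [0.5] * n_topics          # assumed symmetric prior on topics
beta = [0.5] * vocab_size         # assumed symmetric prior on words
# step 3: one topic-word distribution per topic
topic_word = [sample_dirichlet(beta, rng) for _ in range(n_topics)]
theta, doc = generate_document(20, alpha, topic_word, rng)
```

Fitting an LDA model runs this process in reverse: given only the observed words, it infers the document-topic and topic-word distributions.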

3.2.2. Text Classification

The main task of text classification is to assign each text in a given collection to one or more known categories. The core issues are text representation and the classification model. There are two main implementation methods: text classification based on traditional machine learning and text classification based on deep learning. The former combines manual feature engineering with shallow classification models, while the latter uses distributed word vectors to represent texts and relies on deep learning models for classification, which improves classification accuracy [27].
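
As an illustration of the traditional route (bag-of-words features plus a shallow classifier), the sketch below trains a multinomial Naive Bayes model on a few labeled token lists and predicts the category of a new text. Naive Bayes is chosen here only as a simple example; the text above does not state which classifier is used in this study, and the topic labels and tokens are made up.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-posterior for a tokenized text."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        total = sum(word_counts[label].values())
        score = math.log(count / n_docs)  # log prior of the class
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data: token lists with hypothetical topic labels.
train_docs = [
    ["bond", "equity", "listing"],
    ["subsidy", "fiscal", "reward"],
    ["bond", "fund", "equity"],
    ["subsidy", "tax", "reward"],
]
train_labels = ["financing_channels", "local_subsidy",
                "financing_channels", "local_subsidy"]
model = train_naive_bayes(train_docs, train_labels)
predicted = predict(model, ["equity", "fund", "news"])
```

In the setting of this paper, the training documents would be the policy texts labeled with their LDA topics, and the texts to classify would be the news reports.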

4. Experimental Analysis

4.1. Data Collection and Preprocessing

To study the supply of and demand for financial support policies for Chinese private enterprises and reveal how well the two match, this paper collects policy texts as the text mining data for policy supply and news reports as the text mining data for policy demand. Based on the principle of best-effort collection, we obtained 106 policy documents issued by the central government and local governments in support of private enterprises from 2003 to February 26, 2021. The policy documents include opinions, regulations, decisions, approvals, etc. Using Baidu News as the source of news data and “private” + “finance” as the keywords, we searched for reports on financial support for private enterprises in recent years and on the evaluation of the implementation of supportive policies for private enterprises. Taking into account the time lag of news reports, the search period is from August 24, 2018, to April 9, 2021. A Python web crawler combined with manual screening yielded 154 news report texts.

We then preprocessed the text data. Taking into account the particularity of the policy texts, we segmented the original texts into words, cleaned them with regular expressions, and removed stop words using the stop word list of Harbin Institute of Technology. After word segmentation, we computed statistics such as weight, word frequency, and information content for the corpus as a whole and for each individual text.
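
The preprocessing steps can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in this study: whitespace splitting stands in for a Chinese word segmenter, the stop word set is a small made-up sample instead of the Harbin Institute of Technology list, and the function names are hypothetical.

```python
import re
from collections import Counter

# Made-up sample; the study uses the Harbin Institute of Technology stop word list.
STOP_WORDS = {"the", "of", "and", "to", "for"}


def clean_text(raw):
    """Replace punctuation and digits with spaces, then collapse whitespace."""
    text = re.sub(r"[^\w\s]|\d", " ", raw)
    return re.sub(r"\s+", " ", text).strip().lower()


def tokenize(raw):
    """Clean the text, split it into tokens, and drop stop words."""
    # Whitespace splitting stands in for a Chinese word segmenter here.
    return [tok for tok in clean_text(raw).split() if tok not in STOP_WORDS]


def term_frequencies(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


sample = "Financing of private enterprises: 2019 opinions, financing channels!"
tokens = tokenize(sample)
tf = term_frequencies(tokens)
```

The same functions can be applied to each document separately or to the concatenated corpus to obtain per-text and overall statistics.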

4.2. Results Analysis and Discussion
4.2.1. Overview of the Policy Texts

(1) Time Distribution and Citations of the Policy Texts. As can be seen from Figure 2, from 2003 to 2017, the number of financial policies supporting the private economy was relatively small. In 2018 and 2019, the central government released a number of policies to support the private economy, and various provinces and cities also introduced a number of supporting policies.

Analyzing the citations among policy texts, we found that the three policy documents issued by the central government in Table 1 have the highest citation rates. It shows that the three financial support policies issued by the central government in 2018 and 2019 have received positive responses from local governments.

(2) Distribution of Policy Text Issuing Agencies. Before the LDA topic analysis, we examined the overall structure of the policy texts, as shown in Figure 3. In terms of the institutions issuing financial policies in support of private enterprises, policy documents from the central and local financial departments accounted for 9% of the total, and policy documents from provincial government offices and prefecture-level government offices accounted for 39% and 42%, respectively. This indicates that, in solving the problem of financial support for private enterprises, a linkage between the central and local authorities and collaboration among various departments have been formed.

4.2.2. Determination of Number of Topics Based on LDA Modeling

The LDA model is used to unearth the topics of the policy texts. Academia has not yet agreed on a set of guidelines for determining the number of topics for LDA. In this paper, the elbow method is used to determine the optimal number of topics, k.

The core idea of the elbow method is that as the number of clusters k increases, the samples are partitioned more finely and each cluster becomes more tightly aggregated, so the sum of squared errors (SSE) gradually decreases. When k is less than the true number of clusters, increasing k greatly increases the aggregation of each cluster, so the SSE drops sharply. Once k reaches the true number of clusters, the gain in aggregation from increasing k further diminishes rapidly, so the decline in SSE slows abruptly and then levels off as k continues to increase. In other words, the curve of SSE against k has the shape of an elbow, and the value of k at the elbow is taken as the true number of clusters in the data.
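
For reference, the SSE is usually defined as below. The notation is assumed here for illustration and does not appear in the text above: $C_i$ is the set of samples in cluster $i$, and $\mu_i$ is the centroid of that cluster.

```latex
\mathrm{SSE}(k) = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^{2},
\qquad
\mu_i = \frac{1}{\lvert C_i \rvert} \sum_{x \in C_i} x .
```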

It can be seen from Figure 4 that when k is greater than 4, the curve changes much more slowly. Therefore, 4 is the optimal value of k; that is, the number of topics in the policy texts is determined to be 4.
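
Reading the elbow off a plot can also be approximated numerically. The sketch below picks the k with the largest second difference of the SSE curve, which is one simple heuristic among several and not necessarily how Figure 4 was read. It assumes equally spaced values of k, and the SSE values are hypothetical numbers for illustration, not the data behind Figure 4.

```python
def find_elbow(ks, sse):
    """Return the k at which the SSE curve bends most sharply.

    The bend at position i is measured by the second difference
    sse[i-1] - 2*sse[i] + sse[i+1], which is largest where a steep
    drop is followed by a flat stretch. Assumes equally spaced ks.
    """
    if len(ks) != len(sse) or len(ks) < 3:
        raise ValueError("need at least three (k, SSE) pairs of equal length")
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(ks) - 1):
        bend = sse[i - 1] - 2 * sse[i] + sse[i + 1]
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k


# Hypothetical SSE values for illustration only; not the data behind Figure 4.
ks = [1, 2, 3, 4, 5, 6, 7]
sse = [1000.0, 700.0, 450.0, 250.0, 220.0, 200.0, 185.0]
elbow_k = find_elbow(ks, sse)
```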

4.2.3. Visualization Analysis of Topics and Feature Words

In this paper, after word segmentation, stop word removal, and data cleaning, we use the sklearn package to perform LDA modeling with 4 topics. Table 2 displays 20 feature words under each topic. Their probabilities decrease from left to right and from top to bottom; that is, in topic 1, “cities” has the highest probability and “growth” the lowest.
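
Ranking feature words by probability, as in Table 2, amounts to sorting each row of a topic-term weight matrix. The sketch below shows this with plain Python lists. The vocabulary and weights are made up for illustration and are not the values of Table 2, and `top_terms` is a hypothetical helper. In scikit-learn, a fitted `LatentDirichletAllocation` model exposes an (unnormalized) topic-term matrix as its `components_` attribute, which could be passed in the same way.

```python
def top_terms(topic_term, vocab, n=3):
    """For each topic, return the n terms with the largest weights, highest first."""
    ranked = []
    for weights in topic_term:
        order = sorted(range(len(vocab)), key=lambda j: weights[j], reverse=True)
        ranked.append([vocab[j] for j in order[:n]])
    return ranked


# Made-up vocabulary and weights; each row is one topic.
vocab = ["cities", "bond", "subsidy", "loan", "growth"]
topic_term = [
    [0.40, 0.05, 0.10, 0.15, 0.30],
    [0.05, 0.50, 0.05, 0.35, 0.05],
]
feature_words = top_terms(topic_term, vocab, n=2)
```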

The visualization in Figure 5 is created with the pyLDAvis package on the basis of the LDA model. It shows that topic 3 has the highest probability share in the policy texts, accounting for 44.5%. The 4 circles on the left represent the 4 topics. The number in each circle is the index of the topic, ranked by probability from high to low. The size of a circle represents how prevalent the topic is in the corpus. The distance between the circles represents the degree of topic relevance; if two circles overlap, the two topics are highly related. The right half of the figure shows the top 30 terms of the topics; the light gray bar represents the probability that the term belongs to the topic, and the dark bar represents the degree of relevance between the term and the topic.

From the feature words and a summary of the corpus, it can be seen that topic 1 concerns the performance evaluation of financial support policies, topic 2 concerns local government support subsidies, topic 3 concerns diversified financing channels, and topic 4 concerns financial service facilitation.

4.2.4. News Text Analysis

After initial merging and data processing of the news texts, 5 topics are obtained through the elbow method, and the machine learning classification methods in NLPIR-Parser are used to find the feature words and relevance probabilities under each topic. It can be seen in Table 3 that the news texts focus on government departments, with private enterprises as the main subject, and concentrate on topics such as “monetary policy,” “performance appraisal,” “conferences,” “relevant regulatory authorities,” and “pilots.” Most of the core vocabulary is consistent with the policy texts. Table 3 displays 10 feature words under each topic, with probabilities decreasing from left to right and from top to bottom; that is, in topic 1, “People’s Bank” has the highest probability and “insurance” the lowest.

The visualization in Figure 6 indicates that topic 1 accounts for the largest share of the news texts, 24.7%. Many core words are also consistent with the policy texts.

4.2.5. Analysis of Policy Supply and Demand Matching

From a macro-topic perspective, we assess the supply and demand matching of financial policies for private enterprises by comparing the distributions of policy texts and news texts over related topics. The distribution of policy texts across topics is taken from the LDA results: the summed distribution is calculated from the text-topic probability matrix. For the news texts, the topics predicted by the trained model are used, and the proportion of each topic is obtained by dividing the number of texts assigned to that topic by the total number of texts. In Figure 7, we can see that policy supply and policy demand have the following characteristics across the four types of topics.

(1) Compared with other topics, the shares of diversified financing channels (topic 3) in the policy texts and in the news texts are very similar, and both are relatively high. This shows that diversified financing channels are not only an important policy demand of private enterprises in recent years but also an important policy tool used by the government to address private enterprises’ financing difficulties and high financing costs. The policy supply and policy demand related to diversified financing channels are relatively balanced. In recent years, China has vigorously developed a multilevel financial market, which has objectively promoted the diversification of financing channels. The study of Levenson and Besley shows that nontraditional or informal financial channels play an important role in solving the financing problems of SMEs and Chinese private enterprises [4].

(2) The performance of financial support policies (topic 1) and local government support subsidies (topic 2) account for higher proportions in the policy texts than in the news texts. The proportions of these two topics differ considerably between the two types of texts, and both are very low in the news texts, indicating that government departments have focused more on policy performance evaluation and subsidy policies. However, these topics are not the focus of market attention in the short term, and they have not yet received enough attention and discussion from the market.

(3) Financial service facilitation or convenience (topic 4) accounts for a significantly higher proportion in the news texts than in the policy texts, and its share of the news texts is very high. On the one hand, this may indicate that policies on financial service facilitation arouse great market attention. On the other hand, it shows that private enterprises have the highest demand for financial service facilitation policies. Policy attention to and promotion of financial service facilitation are therefore still significantly lower than private enterprises’ demand for it. Strahan and Weston find that, because of economies of scale, banks gradually prefer loans to large enterprises and reduce loans to SMEs as the banks grow [28]. Lin argues that small and medium financial institutions have the advantage of geographic proximity in collecting and processing “soft information” and can provide relatively concentrated loans to SMEs. As China’s financial system is still dominated by large state-owned banks, small- and medium-sized banks account for a relatively low proportion of financial assets, and the financial system’s service facilitation network for private enterprises is still incomplete [29].
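
The comparison of supply and demand shares can be sketched as follows. The policy-side share of a topic is the sum of its probabilities over all policy texts, normalized by the total; the news-side share is the count of news texts assigned to the topic divided by the number of news texts. The function names and all numbers are hypothetical and do not reproduce Figure 7.

```python
from collections import Counter


def supply_shares(doc_topic):
    """Sum each topic's probability over all documents and normalize to shares."""
    n_topics = len(doc_topic[0])
    sums = [sum(row[t] for row in doc_topic) for t in range(n_topics)]
    total = sum(sums)
    return [s / total for s in sums]


def demand_shares(predicted_topics, n_topics):
    """Share of texts assigned to each topic: count per topic / total texts."""
    counts = Counter(predicted_topics)
    return [counts[t] / len(predicted_topics) for t in range(n_topics)]


# Hypothetical inputs: 3 policy texts x 4 topics, and 8 predicted news labels
# (topics are indexed from 0 here, so index 3 corresponds to the fourth topic).
doc_topic = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
predicted = [2, 3, 3, 2, 3, 0, 3, 2]

supply = supply_shares(doc_topic)
demand = demand_shares(predicted, n_topics=4)
# Positive gap: demand exceeds supply; negative gap: supply exceeds demand.
gap = [d - s for s, d in zip(supply, demand)]
```

A topic with a large positive gap is discussed much more in the news than in the policies, which is the pattern described for financial service facilitation.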

5. Conclusion

Based on policy texts and news texts, this paper analyzes the supply-demand matching of financial support policies for private enterprises. First, we show that the Chinese government has introduced many financial support policies covering diversified financing channels, policy performance evaluation, local government subsidies, and financial service facilitation to solve the financing problem of SMEs. Then, we use a policy text computing method that combines topic mining, text classification, and training set prediction to analyze the supply and demand matching of those financial support policies from the macro and micro perspectives through topic distribution analysis.

The following conclusions are drawn from the study in this paper.

First, policy performance evaluation has received much attention from the Chinese government, but it has not yet received market attention, and social supervision of policy performance is lacking. In recent years, private enterprises have faced rising challenges from the slowdown of economic growth, the transformation of the economic development mode, and the rise of international trade frictions. Because of information asymmetry and risk aversion, financial institutions may further tighten lending to private enterprises. Policy performance evaluation can help solve this incentive problem.

Second, local government subsidy is an important policy for easing private enterprises’ difficulties in accessing finance. Private enterprises often have problems such as irregular operations, opaque finances, and a lack of credit records, which make it very difficult for them to obtain loans from banks. They also have lower credit ratings and pay higher financing interest rates. With more comprehensive information, the government can set up a public guarantee mechanism to provide different degrees of support and subsidies to private enterprises with different qualifications. At present, the proportion of local government support subsidies in the news texts is much lower than that in the policy texts, indicating that there are still obvious shortcomings in the implementation and actual effects of the related policies.

Third, financial service facilitation or convenience will help private enterprises access financing more efficiently and easily. In response to private enterprises’ high demand for the facilitation of financial services, government departments need to address current problems in a targeted manner. It is recommended that they improve the financial system, vigorously develop small- and medium-sized financial institutions and venture capital institutions, enhance the facilitation of financial services, encourage the innovation of financial products, and actively explore and develop innovative products and services that fit the characteristics, production cycles, and industries of private enterprises.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


This study was funded by a 2019 project of the National Social Science Foundation (19BJL059).