Research on the Evaluation Model of Rural Information Demand Based on Big Data

Jin, Yanfeng; Li, Gang; Wu, Jianmin

doi:https://doi.org/10.1155/2020/8861207

Wireless Communications and Mobile Computing


Special Issue

Learning Methods for Urban Computing and Intelligence


Research Article | Open Access

Volume 2020 | Article ID 8861207 | https://doi.org/10.1155/2020/8861207


Yanfeng Jin,^1,2 Gang Li,^1 and Jianmin Wu^2

Academic Editor: Bingxian Lu

Received 17 Jun 2020

Revised 07 Jul 2020

Accepted 20 Aug 2020

Published 08 Sept 2020

Abstract

In recent years, the imbalance between rural information supply and demand has seriously hindered rural informatization. Rural information demand is the decisive factor in this relationship, so research on its influencing factors has attracted much attention. Traditional factor analysis of rural information demand does not consider the correlation between factors. The factors themselves carry a lot of repeated information, which seriously interferes with the objectivity of the results. Proceeding from the complexity and diversity of the influencing factors, and starting from selected subjective and objective factors, this paper constructs a probit discriminant model of the influencing factors of rural information demand, with partial correlation analysis applied beforehand and an ROC test applied afterwards, and identifies 24 factors that are significantly related to rural information demand in Lingshou County. The results show that this method eliminates factors that carry highly repetitive information or whose correlation is not significant, and makes the results more reliable. The study also finds that rural information supply is related to farmers’ information cognition ability, acceptance awareness, and acceptance ability. This study provides new methods and new ideas for solving related problems.

1. Introduction

With the rapid development of science and technology and the advent of the big data era, the construction of smart cities at home and abroad has made remarkable achievements. At the same time, rural informatization has ushered in new opportunities for development, and digital and intelligent rural areas have become research hot spots. The opportunities come with great challenges. Because regional economic development is unbalanced, rural informatization faces many problems. First, rural information is difficult to collect, process, integrate, and share. Second, data mining cannot be carried out effectively and does not provide the information farmers need. Third, there is a contradiction between the diversity of farmers’ information needs and the uniformity of platform information. Fourth, dynamic maintenance and update mechanisms are lacking, so data become outdated and cannot play their due role. Big data technology and big data thinking are needed to resolve these difficulties. The application of big data in various industries has achieved good results, and the idea of big data has gradually penetrated the process of rural informatization. With the help of big data technology, a comprehensive rural information platform can be built around farmers’ information needs.

Whether in developed regions such as Europe and the United States or in developing regions such as Asia and Africa, there are many studies on the information needs of rural residents. This research shows that farmers’ demand for information is increasingly extensive, and that the types of demand and the access channels are diverse [1]. Kaniki’s survey of two rural communities in South Africa found that the main information needs of farmers are information needed to seek jobs or increase income, vocational or skills training opportunities, information about grants, medical and health information, legal counseling services, and so on [2]. In Asia, Raju’s study found that the most common information needs of Indian farmers were medical and health information, infrastructure information, crop improvement and yield information, product sales and market information, and policy and service information [3]. Vevrek holds that the daily information needs of the rural population in the United States are information about local government decisions, information about health services, and local news [4–6]. Domestic researchers have found that farmers pay more attention to specialized information related to agricultural production and operation. Zhang Ying (2017), working from the rural information service platform and the perspective of farmers, found that farmers’ demand decreased in the order of labor market information, agricultural market information, agricultural policy information, and agricultural production information. Li Lu (2016) surveyed the demand for agricultural technology social services and found that farmers’ age, education level, and whether they had gone out to work affect that demand. Young and experienced farmers paid more attention to information services related to the circulation of agricultural products.
Zhou Fengtao (2016) studied farmers’ demand for information services and found that educational level and participation in rural cooperatives had a significant impact on their demand for technical services, agricultural services, and information services. Lu Xinru and Li Zhigang (2017) explored the distinctive information needs and behaviors of farmers through questionnaires. Farmers’ information demand had three characteristics: a preference for market purchase and sale information, the necessity of information on policies and regulations, and the particular importance of meteorological forecasts. Information behavior was restricted by educational level, and the overall channel was narrow. Ma Chunyan (2016) investigated poverty-stricken areas. Based on a questionnaire, and through the analysis of demand types, information access channels, personal literacy, and other aspects, the study offered suggestions and countermeasures for speeding up the development of local agriculture and reversing the backward development of remote areas. Pan Yuchen and Huo Yucan (2018) analyzed the concept of rural information consumption, the level of demand, and the motivation of consumption, especially in the field of emotional demand, which further reflects the hierarchy-of-needs theory. Their work provided guidance for society and related enterprises and helped enterprises improve the pertinence of information services and achieve steady growth. Wang Xiaoning and Wang Ming (2018) used questionnaires to analyze empirically the main channels through which farmers obtain information in the mobile Internet era. They concluded that mobile micromessaging, mobile QQ, and mobile microblogging are overwhelmingly dominant in information dissemination, while agricultural information website platforms were not widely known to farmers.
Guan Lili (2017) analyzed the information needs and constraints of farmers through questionnaires, highlighting five characteristics of local farmers: an increasing variety of demand categories, diversified access methods, deepening demand levels, strong internal motivation of demand, and a strong ability to research and judge information. The study provided experience for understanding the level of rural informatization and promoting the construction of information frameworks. Cui Kai and Feng Xian (2017) reviewed and analyzed the relevant literature at home and abroad and studied the significance of information dissemination, the information needs of rural residents, and the information supply in rural areas. From the perspective of information poverty alleviation, Li Gang and Qiao Haicheng (2017) built a model of rural poverty-stricken areas, analyzed the relevant data, and proposed that the government should pay attention to information poverty alleviation.

In summary, existing research shows that the information needs of farmers in China are increasingly strong and the demand structure is increasingly diversified, but specialized information related to agricultural production and development is still the most important component of farmers’ information consumption. Because of income levels and educational attainment, mass media such as television and broadcasting are still the main channels of information dissemination, but the share of computers and mobile phones is increasing, especially in economically developed areas [10–13]. Researchers have summarized and analyzed the influencing factors of farmers’ information demand from various angles, but there has been relatively little analysis of the correlation between the influencing factors, and the statistics and descriptions of the factors are not comprehensive enough [14–16]. At the same time, the significance of the impact of individual factors on rural information demand has not been sufficiently tested. To address these problems, a “source-flow-use” conceptual model of farmer information demand is put forward. Based on the discrete choice model of econometrics, a probit model of rural information demand is constructed. First, partial correlation analysis of the influencing factors of rural information demand is carried out, and factors carrying highly overlapping information are removed. Second, the probit model is used for a further test. Finally, the ROC curve is used for discrimination. Eight factors with no significant influence, such as the proportion of fixed-line administrative villages, are removed, and the 24 significant influencing factors are ranked according to their degree of influence. The results demonstrate the feasibility of the method.

2. Model Building

2.1. Evaluation of Influencing Factors Based on Partial Correlation Analysis

In a system consisting of multiple elements, partial correlation analysis studies the correlation between two elements while the influence of the other elements is held constant, i.e., the closeness of the relationship between the two elements is studied separately, without the influence of the other elements [17, 18]. The resulting measure is the partial correlation coefficient. The study of rural information demand involves many factors. Some of these factors may be correlated, so that two or more factors reflect duplicated information and the factor system becomes too complicated [19]. Through partial correlation analysis, factors with repeated information that affect rural information needs can be removed.

(1) Calculation of partial correlation coefficient

Suppose $x_{ki}$ is the data value of the $i$-th index of the $k$-th selected village in the region, $x_{kj}$ is the data value of the $j$-th index of the $k$-th selected village in the region, and $r_{ij}$ is the correlation coefficient between the $i$-th index and the $j$-th index. The formula is as follows:

$$r_{ij} = \frac{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)^2 \, \sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2}} \tag{1}$$

Among them, $n$ denotes the number of villages in the study area, $\bar{x}_i$ denotes the average value of the $i$-th factor, and $\bar{x}_j$ denotes the average value of the $j$-th factor.

Suppose $R$ is the correlation coefficient matrix composed of the correlation coefficients $r_{ij}$, where $m$ is the number of influencing factors; then

$$R = (r_{ij})_{m \times m} \tag{2}$$

Let $C$ be the inverse matrix of the correlation coefficient matrix $R$:

$$C = R^{-1} = (c_{ij})_{m \times m} \tag{3}$$

According to the formula of the partial correlation coefficient, the partial correlation coefficient $r'_{ij}$ between the $i$-th factor and the $j$-th factor can be obtained:

$$r'_{ij} = \frac{-c_{ij}}{\sqrt{c_{ii} \, c_{jj}}} \tag{4}$$

The greater the absolute value of the partial correlation coefficient $r'_{ij}$ is, the greater the correlation between the $i$-th and the $j$-th influencing factors is; the smaller $|r'_{ij}|$ is, the smaller the correlation between the $i$-th and the $j$-th influencing factors is.

(2) Calculation of $F$ value

When the correlation between two factors is high, the $F$ values of the two factors can be calculated in order to avoid subjectively deleting a significant factor. Let $F_i$ be the $F$ value of the $i$-th factor; it is calculated by Equation (5).

$F_i$ reflects the magnitude of the influence of the $i$-th factor on rural information demand: the greater $F_i$ is, the greater the impact is; conversely, the smaller $F_i$ is, the smaller the impact on rural information demand is.

In the multivariate analysis of rural information demand factors, simple correlation analysis cannot fully reflect the correlation between two factors, because other factors interfere with them, so partial correlation analysis is an effective way to solve this problem [20].

(3) Set the deletion criterion based on partial correlation analysis

If the absolute value of the partial correlation coefficient of two factors satisfies $|r'_{ij}| > 0.7$, the two factors are considered highly correlated and the information they reflect is highly repetitive, so one of them should be deleted. In that case, the factor with the smaller $F$ value is the one deleted.
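The screening step in Section 2.1 can be illustrated with a short sketch. This is not the authors’ SPSS procedure: it assumes the data are held as a list of rows (one row per village, one column per factor), and the function names `pearson_matrix`, `invert`, `partial_correlations`, and `redundant_pairs` are hypothetical. It builds the correlation matrix, inverts it, applies the standard inverse-matrix formula for the partial correlation coefficient, and flags pairs whose absolute partial correlation exceeds 0.7.

```python
import math


def pearson_matrix(data):
    """Correlation matrix R; rows of `data` are villages, columns are factors.

    Assumes no factor is constant across villages (its variance would be zero).
    """
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]

    def cross(i, j):
        # Sum of cross-deviations of factors i and j over all villages.
        return sum((row[i] - means[i]) * (row[j] - means[j]) for row in data)

    return [[cross(i, j) / math.sqrt(cross(i, i) * cross(j, j)) for j in range(m)]
            for i in range(m)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices only)."""
    m = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(m)]
           for i, row in enumerate(matrix)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def partial_correlations(data):
    """Partial correlation of each factor pair, controlling for all other factors."""
    c = invert(pearson_matrix(data))
    m = len(c)
    return [[1.0 if i == j else -c[i][j] / math.sqrt(c[i][i] * c[j][j])
             for j in range(m)] for i in range(m)]


def redundant_pairs(partial, threshold=0.7):
    """Index pairs (i, j), i < j, whose absolute partial correlation exceeds the threshold."""
    m = len(partial)
    return [(i, j) for i in range(m) for j in range(i + 1, m)
            if abs(partial[i][j]) > threshold]
```

For each flagged pair, the paper then compares the $F$ values of the two factors and deletes the one with the smaller value; that comparison is left out of the sketch because it depends on Equation (5).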

2.2. Analysis of Influencing Factors Based on Probit Regression

2.2.1. Discrete Probit Regression Model

The probit model is a generalized linear model based on the normal distribution [20]. In the simplest probit model, the explained variable $y$ is a 0-1 variable, and the probability of the event occurring depends on the explanatory variable $x$; that is, the probability of $y = 1$ is a function of $x$, $P(y = 1) = f(x)$, where $f$ is the standard normal distribution function. This paper uses the probit model to screen out the factors affecting the information demand in rural areas. When the value of the dependent variable $y$ is 1, the independent variable $x$ has an impact on rural information demand; when the value of $y$ is 0, $x$ has no effect on rural information demand.

(1) Introducing intermediate variables

Because the dependent variable $y$ takes only the values 0 and 1, it is a discrete variable and cannot be fitted directly by a linear regression equation. Therefore, an intermediate variable $y^*$ is introduced, and a linear regression equation is fitted between $y^*$ and the influencing factors. $y^*$ represents a state of rural information demand: when $y^* > 0$, the value of $y$ is 1, and the factor is considered to have an impact on rural information demand; when $y^* \le 0$, the value of $y$ is 0, and the factor is considered to have no impact on rural information demand. The linear regression equation is given below.

$$y_i^* = \beta_0 + \sum_{j=1}^{m} \beta_j x_{ij} + \varepsilon_i = \beta_0 + \beta^{T} X_i + \varepsilon_i$$

$y_i^*$ is an intermediate variable, representing the rural information demand state of the $i$-th village; $\beta_j$ represents the regression coefficient of the $j$-th influencing factor; $x_{ij}$ represents the observed value of the $j$-th influencing factor of the $i$-th village; $\beta_0$ is a constant term; $\varepsilon_i$ is a random variable that obeys the normal distribution $N(0, 1)$; $\beta$ is the regression coefficient vector, and $X_i$ is the vector composed of the influencing factors of the $i$-th village.

(2) Calculate the probability of rural information demand in each village

The intermediate variable $y_i^*$ of Equation (10) is used to calculate the probability of rural information demand in each village. Because $y_i = 1$ exactly when $y_i^* > 0$, it is concluded that

$$P(y_i = 1 \mid X_i) = P(y_i^* > 0) = P\big(\varepsilon_i > -(\beta_0 + \beta^{T} X_i)\big) = \Phi(\beta_0 + \beta^{T} X_i)$$

Similarly, it is possible to calculate the probability that information demand in rural areas is unaffected:

$$P(y_i = 0 \mid X_i) = 1 - \Phi(\beta_0 + \beta^{T} X_i)$$

Here $\Phi$ is the standard normal distribution function, and the coefficients can be solved from Equation (12) through maximum likelihood estimation.
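As a minimal sketch of the probability calculation in Section 2.2.1, the code below evaluates the standard normal distribution function with `math.erf` and computes the probit probability and log-likelihood for given coefficients. The function names and the argument layout (a constant term `beta0` plus a list of coefficients `beta`) are assumptions made for illustration; the sketch evaluates the likelihood that maximum likelihood estimation would maximize but does not perform the optimization itself.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def demand_probability(beta0, beta, x):
    """P(y = 1 | x) for one village: the normal CDF of the linear predictor."""
    z = beta0 + sum(b * v for b, v in zip(beta, x))
    return std_normal_cdf(z)


def log_likelihood(beta0, beta, xs, ys):
    """Probit log-likelihood of 0/1 observations ys for factor vectors xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = demand_probability(beta0, beta, x)
        # Clip to keep the logarithm finite when p is numerically 0 or 1.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total
```

In practice a statistics package would search for the coefficients that maximize `log_likelihood`; the paper reports using SPSS for its calculations.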

2.2.2. Testing Based on the Probit Model

Construct a probit model, establish the Wald statistic of each influencing factor, and apply the chi-square test [21, 22]. When the corresponding significance probability is greater than 0.01, the factor with the greatest significance probability is deleted. The specific steps are as follows:

(1) Calculate the regression coefficients of the probit model. The probit regression model is constructed according to Equations (9) and (12) from the factors affecting rural information demand $x_{ij}$ and the corresponding observed values of the rural information demand state $y_i$. The coefficients $\hat{\beta}_j$ and the corresponding standard errors $SE(\hat{\beta}_j)$ are solved, where $j = 1, 2, \ldots, m$.

(2) Calculate the significance probability $P_j$ of each factor, construct the Wald statistic of each factor, and test the hypothesis of the significance of each factor.

Suppose $H_0$: $\beta_j = 0$. If $\beta_j = 0$, the $j$-th factor has no significant impact on rural information demand.

Suppose $H_1$: $\beta_j \neq 0$. If $\beta_j \neq 0$, the $j$-th factor has a significant impact on rural information demand.

Let $W_j$ be the Wald statistic corresponding to the $j$-th influencing factor of rural information demand, $\hat{\beta}_j$ be the parameter estimate of the $j$-th influencing factor, and $SE(\hat{\beta}_j)$ be the standard error of $\hat{\beta}_j$; then

$$W_j = \left( \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)} \right)^2$$

The Wald statistic $W_j$ makes it possible to test whether the parameter of an influencing factor is significantly different from 0. If $\beta_j = 0$, $H_0$ is true and $W_j$ obeys the chi-square distribution with 1 degree of freedom, that is, $W_j \sim \chi^2(1)$; the corresponding significance probability $P_j$ is obtained from the chi-square distribution table.

(i) If $P_j \le 0.01$, the original hypothesis $H_0$ is rejected, which shows that this factor has a significant impact on rural information demand.

(ii) If $P_j > 0.01$, the original hypothesis $H_0$ is accepted, indicating that although $\hat{\beta}_j \neq 0$, this factor has no significant impact on rural information needs.

(3) Among all the influencing factors with significance probability $P_j > 0.01$, the one with the maximum $P_j$ is removed. $P_j > 0.01$ means that the hypothesis $H_0$ is accepted and the factor has no significant impact on rural information demand. Among all the factors that have no significant impact, the factor with the maximum $P_j$ is removed. Note that the factors with $P_j > 0.01$ cannot all be deleted at one time: each factor may be affected by several other variables, so after one variable is deleted, a factor that was originally nonsignificant may become significant.

(4) Repeat Steps (1)–(3) until the coefficients of all variables in the model meet $P_j \le 0.01$.

In short, the coefficients $\hat{\beta}_j$ of the probit regression equation between the rural information demand state variable and the influencing factors are solved together with their standard errors $SE(\hat{\beta}_j)$, the Wald statistics of the influencing factors are constructed to test the significance of the regression coefficients, and the factors that have little impact on rural information demand and whose regression coefficients are not significant are eliminated.
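The Wald test in Section 2.2.2 can be sketched as follows. The code assumes the coefficient estimates and standard errors have already been obtained from a fitted probit model and are supplied as a dictionary; the names `wald_statistic`, `chi2_df1_p_value`, and `factor_to_drop` are hypothetical. For a chi-square distribution with 1 degree of freedom, the upper-tail probability equals `erfc(sqrt(w / 2))`, which avoids a lookup table.

```python
import math


def wald_statistic(beta_hat, std_err):
    """Wald statistic: squared ratio of the estimate to its standard error."""
    return (beta_hat / std_err) ** 2


def chi2_df1_p_value(w):
    """Upper-tail probability of a chi-square(1) variable at w."""
    return math.erfc(math.sqrt(w / 2.0))


def factor_to_drop(estimates, alpha=0.01):
    """Name of the least significant factor if its p-value exceeds alpha, else None.

    `estimates` maps a factor name to (coefficient estimate, standard error).
    """
    p_values = {name: chi2_df1_p_value(wald_statistic(b, se))
                for name, (b, se) in estimates.items()}
    worst = max(p_values, key=p_values.get)
    return worst if p_values[worst] > alpha else None
```

Only one factor is returned per call, which mirrors the rule in Step (3) that nonsignificant factors are removed one at a time and the model is refitted in between.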

2.3. Validation of Influencing Factors Based on ROC Curve

2.3.1. ROC Curve

ROC stands for receiver operating characteristic. Each point on the ROC curve reflects the sensitivity to the same signal stimulus [23, 24]. According to the relationship between the predicted value and the true value, the sample can be divided into four parts: true positive (TP): the predicted value and the true value are both 1; false positive (FP): the predicted value is 1, and the true value is 0; true negative (TN): the predicted value and the true value are both 0; and false negative (FN): the predicted value is 0, and the true value is 1. The classification confusion matrix is shown in Table 1.

The vertical axis of the ROC curve represents the true positive rate (TPR), and the horizontal axis represents the false positive rate (FPR):

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{14}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{15}$$

The ROC curve is a plot of TPR against FPR under different thresholds. Given a threshold, the corresponding TPR and FPR values can be obtained, and by sweeping a large number of thresholds, the TPR-FPR curve is traced out. The AUC is the area under the ROC curve: the larger the area is, the better the classifier is, and the maximum value is 1.
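The construction of the ROC curve described above can be sketched in a few lines. The sketch assumes predicted probabilities (`scores`) and true 0/1 labels are available as lists and that both classes occur; the function names are hypothetical. It sweeps the threshold over the observed scores, records one (FPR, TPR) point per threshold, and integrates with the trapezoidal rule to obtain the AUC.

```python
def roc_points(scores, labels):
    """(FPR, TPR) points obtained by lowering the threshold over the observed scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For example, scores `[0.9, 0.8, 0.7, 0.6]` with labels `[1, 0, 1, 0]` rank three of the four positive-negative pairs correctly, and `auc(roc_points(...))` gives 0.75.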

2.3.2. Inspection of Influencing Factors of Rural Information Demand Based on ROC Curve

The AUC value of the ROC curve is used to determine whether the factors affecting rural information demand selected by the probit regression model are correct [25]. According to the classification confusion matrix, the number of factors correctly judged as influential is recorded as TP, the number of factors misjudged as influential is recorded as FP, the number of factors correctly judged as having no influence is recorded as TN, and the number of factors misjudged as having no influence is recorded as FN. The specific analysis results are shown in Table 2.

According to Equation (14), the correct discriminant rate is calculated: the number TP of factors discriminated as influential is divided by TP + FN, the actual number of all influential factors. It indicates the probability that factors which actually affect rural information demand are discriminated as influencing factors by the abovementioned probit model [26].

According to Equation (15), the misjudgment rate is calculated: the number FP of factors misjudged as influential is divided by FP + TN, the number of factors that actually have no influence. It indicates the probability that factors which have no influence on rural information demand are identified as influential factors by the abovementioned probit model.

The ROC curve is plotted with the correct discriminant rate on the vertical axis and the false discrimination rate on the horizontal axis [27]. When the abscissa is constant, the larger the ordinate is, the greater the impact of the factor on rural information demand is, and the larger the corresponding AUC value is. Therefore, the larger the AUC value is, the better the classifier is, which means the greater the impact of the factor on rural information needs; the maximum value is 1. When $\mathrm{AUC} = 1$, the classifier is ideal, and ideal prediction can be achieved with this model no matter what threshold is set. Below that, the grades in descending order of AUC are: good, where the model predicts well if the threshold is set properly; moderate, where the model has a certain predictive value; and poor, where there is basically no predictive value. When $\mathrm{AUC} < 0.5$, the discriminant effect of the model is very poor, but it is still better than random guessing as long as its predictions are always inverted.

Therefore, for each factor identified by the above probit regression model, if the AUC value is greater than 0.9, it is concluded that the factor has a significant impact on rural information demand. The research shows that the area under the ROC curve constructed for all the factors in this paper is higher than 0.9, which ensures the ability to distinguish the influence of the various factors on rural information demand.

3. Empirical Analysis of Rural Information Demand

3.1. Analysis of Influencing Factors of Rural Information Demand

Through a review of domestic and foreign literature, the factors affecting rural information demand are summarized into seven aspects: environmental factors, subject factors, family factors, economic factors, geographical factors, cognitive factors, and policy factors [28, 29].

3.1.1. Environmental Factors

At the micro level, Internet penetration, the number of computers and mobile phones, and television and radio coverage have become important factors affecting rural information needs. The first aspect is rural information infrastructure and technology, the basic resources of the rural information environment and an important premise for optimizing it. Their construction level is an important part of the rural information environment. The second is rural information talent. Optimizing the rural information environment requires a high-quality, professional talent team in order to keep raising the level of rural informatization. Rural scientists and technicians are an important force in building the rural information environment and an important guarantee for the continuous advancement of rural informatization. Rural college students have higher professional quality and ability and are an important force in the future construction and optimization of the rural information environment. The third is rural information network coverage, which reflects the application of rural information infrastructure. The fourth is the input and output of rural informatization.

3.1.2. Subject Factors

Individual characteristics mainly include gender, age, marital status, health status, educational level, occupation, personal income, and migrant work experience. Gender is an important factor affecting rural information needs. Generally speaking, men’s demand for information is stronger than women’s. From the perspective of information economics, the subjective desire for rural information differs considerably between age groups. Young people are more likely than the elderly to accept new information technology and information products. Research results on the impact of marital status on rural information needs are rare, and it is unclear whether there is a correlation. This paper explores this issue through the models that follow. Health status also has a major impact on rural information needs. Educational level affects the information literacy of rural subjects to a great extent. The traditional theory of rural informatization holds that farmers’ information literacy is positively correlated with their demand for and acceptance of informatization. People in rural areas are engaged in both agricultural and nonagricultural occupations, and this dual nature of occupation may also have an impact on rural information needs. In general, the higher the personal income is, the stronger the demand for information is. Farmers with migrant work experience have a wider horizon and a stronger sense of information needs.

3.1.3. Family Factors

Family factors mainly include the number of family members, the number of family laborers, the number of male family members, the number of female family members, the family happiness index, and family income sources. The theory of network externalities holds that as the number of users increases, the utility each user gains from the network increases. Therefore, the number of family members may also be an important factor affecting information needs in rural areas. Statistical studies have shown that gender is an important factor affecting Internet demand. For rural households, the more males there are, the stronger the rural information demand is. Similarly, the number of women in the family may also affect the family’s demand for rural information. The size of the household labor force is, to a certain extent, proportional to household income. The more laborers there are, the higher the household income is and the stronger the demand for information is. Conversely, the fewer laborers there are, the lower the household income is and the weaker the desire for information is. The family happiness index in a sense reflects the level of family income and indirectly affects farmers’ demand for information. At present, no relationship between the happiness index and the demand for information access has been established in academic and theoretical circles. However, the higher the family happiness index is, the higher the income tends to be, so the index indirectly affects farmers’ demand for information.

3.1.4. Economic Factors

Economic factors mainly include the per capita income of farmers, the source of farmers’ income, and the level of regional economy.

3.1.5. Geographical Factors

Geographical characteristics have a great influence on rural information demand. They are mainly reflected in the geographical location of rural areas, including distance from county-level roads, distance from provincial highways, distance from township centers, and distance from county centers.

3.1.6. Cognitive Factors

Cognitive factors have an important impact on rural information needs. Cognitive factors mainly include the cognitive level of rural subjects to information, the awareness of information acceptance, and the ability to receive information.

3.1.7. Policy Factors

Policy factors mainly refer to national policy information on rural informatization. Government informatization policies, such as rural revitalization strategies, rural e-commerce, digital rural areas, and smart rural areas, affect farmers’ perceptions of rural information needs.

In summary, rural information needs are affected by 38 factors in 7 aspects. This paper uses the partial correlation coefficient, the probit model, and the ROC curve to screen and identify the factors affecting rural information demand and finally to find the real key factors. The specific factors are shown in Table 3.

3.2. Sample Selection and Data Sources

3.2.1. Sample Selection

Because this paper studies rural information needs, the survey objects selected within the region are the villagers of natural villages. Considering the convenience of data acquisition and the homogeneity of sample division, and in order to cover the plain, hilly, and mountainous terrain of the region, this study selected 30 natural villages in 15 townships of Lingshou County, Hebei Province, as the sample. The specific distribution is shown in Table 4.

3.2.2. Data Source

The empirical data come mainly from two sources. The first is the statistical yearbook data of Lingshou County. The second is survey data, which mainly include data from interviews with relevant personnel and sample survey data. The specific data are shown in Table 5.

3.3. Data Standardization

3.3.1. Standardization of Data Indicators

The data indicators fall into three categories: positive, negative, and interval indicators. For each category, the corresponding formula is used to standardize the data to the 0-1 interval [30–32].
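The standardization formulas themselves are not reproduced in this text, so the following sketch is an assumption rather than the authors’ exact method: it applies ordinary min-max scaling, in which a positive indicator maps its largest value to 1 and a negative indicator maps its smallest value to 1. The function names are hypothetical, and interval indicators are omitted because their formula cannot be inferred from the text.

```python
def standardize_positive(values):
    """Min-max scaling for a positive indicator (larger raw value -> closer to 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def standardize_negative(values):
    """Min-max scaling for a negative indicator (smaller raw value -> closer to 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant; min-max scaling is undefined")
    return [(hi - v) / (hi - lo) for v in values]
```

Either function returns values in the 0-1 interval, so indicators measured in different units become comparable before the partial correlation analysis.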

3.3.2. Quantitative Processing of Qualitative Data

The qualitative data are quantified by using the Likert scale principle. The specific variable design and its meaning are shown in Table 6.

3.4. Evaluation of Influencing Factors of Rural Information Demand Based on Partial Correlation Analysis

Partial correlation analysis is carried out on the standardized data to avoid correlations between indicators that exist only in the data and lack economic significance [33]. Using the data in Table 7 and according to Equations (4)–(7), the partial correlation coefficients of each factor can be calculated by SPSS software. The results are shown in Table 8. According to the calculation results, the partial correlation coefficients of six pairs of factors are greater than 0.7, so the factors in each of these pairs are highly correlated and there is information redundancy. Therefore, it is necessary to further calculate the $F$ values of the six pairs of related factors. The six pairs are: the number of cable TV per 1000 people and the number of TV sets per 100 households in rural areas; the number of information talents per 10000 people and the number of college students per 1000 people; the number of computers per 100 households in rural areas and the number of Internet users per 10000 people; personal income and per capita income of farmers; family income sources and farmers’ income sources; and distance from the county highway and distance from the provincial highway. The specific results are shown in Table 7.

The $F$ values of the six pairs of related factors are calculated, and the results are shown in Figure 1. Within each pair, the $F$ values were compared, and the factor with the smaller $F$ value was deleted. From the data in Table 9, we can see that the $F$ value of the number of cable TV per 1000 people is less than that of the number of TV sets per 100 households in rural areas, the $F$ value of the number of college students per 10000 people is less than that of the number of information personnel per 10000 people, and the $F$ value of the number of Internet users per 10000 people is less than that of the number of computers per 100 households in rural areas. The $F$ value of personal income is smaller than that of the per capita income of farmers, and the $F$ value of the distance from the county highway is smaller than that of the distance from the provincial highway. Therefore, the six factors with the smaller $F$ values, X2, X6, X10, X20, X29, and X31, are deleted. The specific results are shown in Table 8.

3.5. Analysis of Influencing Factors of Rural Information Demand Based on Probit Regression

On the basis of partial correlation analysis, the remaining factors are screened by using the probit regression model to find out the factors that have a greater impact on rural information demand [34]. After regression analysis of the remaining 32 factors, the relevant regression parameters were calculated. The specific results are shown in Table 9.

The standard error of each factor reflects, to a certain extent, how much the sample mean varies from the population mean [35]. The differences among the standard errors of the factors show that the sample selection differs somewhat from factor to factor. However, the effect of these differences on the significance of each factor is acceptable.

Among the factors, the one with the largest significance probability value is deleted. According to this principle, we compare the values of all factors in Table 10 and delete the factor with the largest one. Probit regression is then performed on the remaining 31 factors and the corresponding regression parameters are recalculated, and this is repeated until the value of every factor is less than 0.01. For example, according to the results of the first regression, all values are less than 0.1, but the gender factor has the largest value, so the gender factor is deleted, and then probit regression is performed again until the value of all factors is less than 0.1. Finally, through the probit regression analysis, 8 factors that did not significantly affect rural information demand were deleted: the proportion of administrative village, gender, marital status, health status, number of family members, number of male family members, number of female family members, and family happiness index.
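
The refit-and-delete loop described above is a form of backward elimination, and its control flow can be sketched as follows. This is a minimal sketch, not the authors' SPSS procedure: the probit fit itself is abstracted behind a hypothetical callback `fit_pvalues` that returns one significance value per factor, the cutoff `alpha` is left as a parameter, and the toy significance values are invented.

```python
from typing import Callable, Dict, List


def backward_eliminate(
    factors: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float,
) -> List[str]:
    """Repeatedly refit and drop the factor with the largest significance value.

    Stops once every remaining factor has a value below alpha.
    """
    kept = list(factors)
    while kept:
        pvals = fit_pvalues(kept)  # hypothetical: one probit fit per round
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] < alpha:
            break  # all remaining factors pass the cutoff
        kept.remove(worst)
    return kept


def toy_fit(names: List[str]) -> Dict[str, float]:
    """Stand-in for a real probit fit; returns fixed, invented values."""
    table = {"income": 0.001, "gender": 0.40, "education": 0.03, "health": 0.20}
    return {name: table[name] for name in names}


print(backward_eliminate(["income", "gender", "education", "health"], toy_fit, alpha=0.1))
```

In a real analysis the values change after every refit, which is why the model is re-estimated each round rather than dropping all weak factors at once.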

3.6. Validation and Analysis of Factors Affecting Rural Information Demand Based on ROC Curve

The data of the 24 selected factors were substituted into Equations (9)–(12), and the probability of each village being affected by the relevant factors was calculated using the probit model. Each village was then judged to be obviously affected or not according to whether this probability reached the cutoff value.
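
As a hedged illustration of how a probit model turns factor values into a probability, the sketch below applies the standard normal CDF to a linear combination of the factors. Equations (9)–(12) are not shown in this section, so this generic probit form is an assumption, and the coefficients, inputs, and cutoff argument are all hypothetical.

```python
from statistics import NormalDist


def probit_probability(x, beta, intercept=0.0):
    """P(y = 1 | x) under a generic probit model: Phi(intercept + beta . x)."""
    z = intercept + sum(b * v for b, v in zip(beta, x))
    return NormalDist().cdf(z)  # standard normal CDF


def is_affected(probability, cutoff):
    """Classify a village given a caller-supplied cutoff (not taken from the paper)."""
    return probability >= cutoff


# Hypothetical standardized factor values and coefficients.
p = probit_probability(x=[0.5, -1.0], beta=[1.2, 0.3], intercept=0.1)
print(round(p, 4), is_affected(p, cutoff=0.5))
```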

The AUC value is a probability. If one positive sample and one negative sample are selected at random, the AUC value is the probability that the current classification algorithm, according to the calculated score, ranks the positive sample ahead of the negative sample. The larger the AUC value is, the more likely the algorithm is to rank the positive sample ahead of the negative sample, and so the better the two classes can be separated.

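Specifically, consider all M × N positive-negative sample pairs (M is the number of positive samples; N is the number of negative samples) and count the pairs in which the score of the positive sample is greater than the score of the negative sample. When the scores of the positive and negative samples in a pair are equal, the pair is counted as 0.5. The total is then divided by M × N. Writing s_i^+ for the score of the i-th positive sample and s_j^- for the score of the j-th negative sample, the formula for calculating the AUC value is as follows:

$$\mathrm{AUC} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} I\left(s_i^{+}, s_j^{-}\right)}{M \times N}, \qquad I\left(s^{+}, s^{-}\right) = \begin{cases} 1, & s^{+} > s^{-}, \\ 0.5, & s^{+} = s^{-}, \\ 0, & s^{+} < s^{-}. \end{cases}$$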
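
The pair-counting rule described above translates almost directly into code. The following Python sketch is illustrative only; the function name and the sample scores are hypothetical.

```python
def auc_by_pairs(pos_scores, neg_scores):
    """AUC as the fraction of positive-negative pairs ranked correctly.

    A pair scores 1 if the positive sample outranks the negative one,
    0.5 if the two scores are equal, and 0 otherwise.
    """
    m, n = len(pos_scores), len(neg_scores)
    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                total += 1.0
            elif p == q:
                total += 0.5
    return total / (m * n)


# Hypothetical scores: 3 positive samples and 2 negative samples (6 pairs).
print(auc_by_pairs([0.9, 0.8, 0.4], [0.7, 0.4]))  # 4.5 / 6 = 0.75
```

The double loop costs M × N comparisons, which is fine for a few hundred villages; rank-based formulas give the same value more cheaply on large samples.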

The ROC curves corresponding to the 24 factors and the area under the curve (AUC) values were obtained by calculation. The results show that all AUC values are greater than 0.9, indicating that all factors are significantly related to rural information demand. At the same time, the 24 factors are ranked according to the rule that the greater the AUC value is, the more significant the relationship with demand is. The number of mobile phones per 100 households in rural areas has the most significant impact. The AUC values of the 24 factors are shown in Table 10.

The ROC curve is composed of the points given by the FPR and TPR values corresponding to multiple critical values. Therefore, different threshold values can be used to obtain multiple points on the ROC curve, with the FPR value on the horizontal axis and the TPR value on the vertical axis. The ROC curve of the most significant factor, the number of mobile phones per 100 households in rural areas, was drawn with SPSS software and is shown in Figure 2.
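
To make the threshold sweep concrete, the sketch below builds ROC points by treating each distinct score as a critical value and then integrates the curve with the trapezoidal rule. It is an illustrative sketch rather than the SPSS procedure used in the paper; the function names and the sample scores and labels are hypothetical, and each point is stored as (FPR, TPR).

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0).

    labels: 1 for a positive sample, 0 for a negative sample.
    A sample is predicted positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the (FPR, TPR) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores with 3 positive and 2 negative samples.
scores = [0.9, 0.8, 0.7, 0.4, 0.4]
labels = [1, 1, 0, 1, 0]
curve = roc_points(scores, labels)
print(curve)
print(trapezoid_auc(curve))
```

Because tied scores share one threshold, the trapezoidal area here agrees with the pair-counting definition of AUC in which ties count as 0.5.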

The area under the ROC curve, that is, the AUC value, reflects how significantly the number of mobile phones per 100 households in rural areas affects rural information demand. In Figure 2, the AUC value is greater than 0.9, so the number of mobile phones per 100 rural households, a factor screened by the probit model, has a significant impact on rural information needs.

4. Conclusion

This paper analyzes the information demand problem caused by the lack of rural information supply as a whole and reaches the following conclusions:

(1) The traditional factor analysis of rural information demand does not consider the correlation between factors, so the factors themselves carry a lot of redundant information, which interferes with the judgment of their degree of impact. Taking Lingshou County as an example and using partial correlation analysis, the influencing factors that carry highly repetitive information are eliminated by comparing the calculated values, and the complexity of calculation is reduced. A probit regression model is constructed to test the influencing factors of rural information demand. By comparing regression coefficients and test probabilities, the factors that are not significantly correlated with rural information demand are deleted, and the ROC curve is introduced to test these results a second time, which improves the reliability of the factor correlations.

(2) The 24 influencing factors of rural information demand directly or indirectly affect the supply of rural information services. They provide a basis for the supply of rural information services from the seven aspects of objective environment, subject characteristics, family, economy, geography, cognition, and policy, for example by improving infrastructure construction, training information service talents, and providing differentiated services. At the same time, the research results also show that the supply of rural information is related to farmers’ information cognitive ability, acceptance awareness, and acceptance ability.

The innovation of this paper lies in the partial correlation analysis of influencing factors of rural information demand and the ROC secondary test. It provides a new idea and method to solve the related problems.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research is supported by the “Three Three Three Talent Project” funded by Hebei Province (Project No.: A202001064).

References

A. Antunes, D. Bonfim, N. Monteiro, and P. M. M. Rodrigues, “Forecasting banking crises with dynamic panel probit models,” International Journal of Forecasting, vol. 34, no. 2, pp. 249–275, 2018.
A. E. Elshaikh, X. Jiao, and S.-h. Yang, “Performance evaluation of irrigation projects: theories, methods, and techniques,” Agricultural Water Management, vol. 203, pp. 87–96, 2018.
A. M. Valente, H. Binantel, D. Villanua, and P. Acevedo, “Evaluation of methods to monitor wild mammals on Mediterranean farmland,” Mammalian Biology, vol. 91, pp. 23–29, 2018.
A. I. Bandos, B. Guo, and D. Gur, “Estimating the area under ROC curve when the fitted binormal curves demonstrate improper shape,” Academic Radiology, vol. 24, no. 2, pp. 209–219, 2017.
U. Benjamin and U. CLN, “Libraries and information in Nigerian rural development,” International Journal of Information Management, vol. 34, no. 1, pp. 14–16, 2014.
C. A. Damalas and M. Khan, “RETRACTED: Pesticide use in vegetable crops in Pakistan: insights through an ordered probit model,” Crop Protection, vol. 99, pp. 59–64, 2017.
G. Msoffe and P. Ngulube, “Farmers access to poultry management information in selected rural areas of Tanzania,” Library & Information Science Research, vol. 38, no. 3, pp. 265–271, 2016.
H. Hu, B. Tang, X. Gong, W. Wei, and H. Wang, “Intelligent fault diagnosis of the high-speed train with big data based on deep neural networks,” IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp. 2106–2116, 2017.
G. Fountas and P. C. Anastasopoulos, “A random thresholds random parameters hierarchical ordered probit analysis of highway accident injury-severities,” Analytic Methods in Accident Research, vol. 15, pp. 1–16, 2017.
G. Zhang, C. Zhang, and H. Zhang, “Improved K-means algorithm based on density canopy,” Knowledge-Based Systems, vol. 145, pp. 289–297, 2018.
H. S. Loh, Q. Zhou, V. V. Thai, Y. D. Wong, and K. F. Yuen, “Fuzzy comprehensive evaluation of port-centric supply chain disruption threats,” Ocean & Coastal Management, vol. 148, pp. 53–62, 2017.
J. A. Cook, “ROC curves and nonrandom data,” Pattern Recognition Letters, vol. 85, pp. 35–41, 2017.
J.-K. Park, S.-K. Lee, and J.-H. Kim, “Development of an evaluation method for nuclear fuel debris–filtering performance,” Nuclear Engineering and Technology, vol. 50, no. 5, pp. 738–744, 2018.
J.-F. Chen, H.-N. Hsieh, and Q. H. Do, “Evaluating teaching performance based on fuzzy AHP and comprehensive evaluation approach,” Applied Soft Computing, vol. 28, pp. 100–108, 2015.
J. J. C. Tambotoh, A. D. Manuputty, and F. E. Banunaek, “Socio-economics factors and information technology adoption in rural area,” Procedia Computer Science, vol. 72, pp. 178–185, 2015.
Y. Jin, G. Li, and H. Zhang, “Evaluation of regional rural information environment based on fuzzy method in the era of the Internet of things,” IEEE Access, vol. 6, pp. 78530–78541, 2018.
K. Kwon, J. W. Shin, and N. S. Kim, “Incremental basis estimation adopting global k-means algorithm for NMF-based noise reduction,” Applied Acoustics, vol. 129, pp. 277–283, 2018.
L. Zhang, Y. Feng, P. Shen et al., “Efficient finer-grained incremental processing with MapReduce for big data,” Future Generation Computer Systems, vol. 80, pp. 102–111, 2018.
K. Papangelis, N. R. Velaga, F. Ashmore, S. Sripada, J. D. Nelson, and M. Beecroft, “Exploring the rural passenger experience, information needs and decision making during public transport disruption,” Research in Transportation Business & Management, vol. 18, pp. 57–69, 2016.
M. de Figueiredo, C. B. Y. Cordella, D. J.-R. Bouveresse, X. Archer, J.-M. Bégué, and D. N. Rutledge, “A variable selection method for multiclass classification problems using two-class ROC analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 177, pp. 35–46, 2018.
M. F. M. Firdhous and P. M. Karuratane, “A model for enhancing the role of information and communication technologies for improving the resilience of rural communities to disasters,” Procedia Engineering, vol. 212, pp. 707–714, 2018.
M. Filippini, W. H. Greene, N. Kumar, and A. L. Martinez-Cruz, “A note on the different interpretation of the correlation parameters in the bivariate probit and the recursive bivariate probit,” Economics Letters, vol. 167, pp. 104–107, 2018.
P. Mozharovskyi and J. Vogler, “Composite marginal likelihood estimation of spatial autoregressive probit models feasible in very large samples,” Economics Letters, vol. 148, pp. 87–90, 2016.
P. Matous, “Complementarity and substitution between physical and virtual travel for instrumental information sharing in remote rural regions: a social network approach,” Transportation Research Part A: Policy and Practice, vol. 99, pp. 61–79, 2017.
R. Khajouei, S. H. Gohari, and M. Mirzaee, “Comparison of two heuristic evaluation methods for evaluating the usability of health information systems,” Journal of Biomedical Informatics, vol. 80, pp. 37–42, 2018.
R. Fattahi and M. Khalilzadeh, “Risk evaluation using a novel hybrid method based on FMEA, extended MULTIMOORA, and AHP methods under fuzzy environment,” Safety Science, vol. 102, pp. 290–300, 2018.
R. H. Lange, “The predictive content of the term premium for GDP growth in Canada: evidence from linear, Markov-switching and probit estimations,” The North American Journal of Economics and Finance, vol. 44, pp. 80–91, 2018.
S. T. Yen and E. M. Zampelli, “Religiosity, political conservatism, and support for legalized abortion: a bivariate ordered probit model with endogenous regressors,” The Social Science Journal, vol. 54, no. 1, pp. 39–50, 2017.
S. Han and E. J. Vytlacil, “Identification in a generalization of bivariate probit models with dummy endogenous regressors,” Journal of Econometrics, vol. 199, no. 1, pp. 63–73, 2017.
T.-t. Gao and S.-m. Wang, “Fuzzy integrated evaluation based on HAZOP,” Procedia Engineering, vol. 211, pp. 176–182, 2018.
W. Yang, K. Xu, J. Lian, L. Bin, and C. Ma, “Multiple flood vulnerability assessment approach based on fuzzy comprehensive evaluation method and coordinated development degree model,” Journal of Environmental Management, vol. 213, no. 1, pp. 440–450, 2018.
W. Li, W. Liang, L. Zhang, and Q. Tang, “Performance assessment system of health, safety and environment based on experts’ weights and fuzzy comprehensive evaluation,” Journal of Loss Prevention in the Process Industries, vol. 35, pp. 95–103, 2015.
W. Cai, L. Dou, M. Zhang, W. Cao, J.-Q. Shi, and L. Feng, “A fuzzy comprehensive evaluation methodology for rock burst forecasting using microseismic monitoring,” Tunnelling and Underground Space Technology, vol. 80, pp. 232–245, 2018.
X. Yu, W. Meng, and L. Xiang, “Comprehensive evaluation chronic pelvic pain based on fuzzy matrix calculation,” Neurocomputing, vol. 173, Part 3, pp. 2097–2101, 2016.
Y. Jin and G. Li, “Application of improved K-means algorithm in evaluation of network resource allocation,” Boletín Técnico, vol. 55, no. 5, pp. 284–292, 2017.

Copyright

Copyright © 2020 Yanfeng Jin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
