Abstract

The spatial and temporal distribution of the higher-education population (HEP) is a fundamental characteristic of the development level of higher education in a region or a country. Based on the annual population sampling statistics from 2000 to 2015, the spatiotemporal evolution pattern of the HEP in China is systematically analyzed. Nine driving factors describing natural and socioeconomic conditions at the municipal level in 2000 and 2010 are constructed: average slope, average elevation, city location, city size, high-speed railways, highways, gross domestic product (GDP) density, nonagricultural population, and population density. The factors driving the distribution of the HEP are then quantitatively analyzed using the geodetector model. The results show that the centroid of the HEP, which shifted from the northeast to the southwest from 2000 to 2010, is markedly different from that of the total population from 2000 to 2015 in China. Despite their different moving directions, the distance between the two centroids is decreasing, indicating both significant regional differences of the HEP in China and a narrowing gap between the HEP and the total population in recent years. The results of the factor detector for 2000 and 2010 suggest that the proportion of the nonagricultural population and the city location are the main driving factors of the distribution of the HEP, with driving forces between 0.494 and 0.627, followed by the city size, highways, and GDP density, with driving forces between 0.199 and 0.302. This indicates that urbanization levels and urban locations are the main factors affecting the spatial distribution of the HEP. The results of the interaction detection reveal that the interaction of the nonagricultural population and the GDP density can explain 92.7% of the spatial variation of the HEP in 2000, while that of the nonagricultural population and the population density can explain 97.6% of the spatial variation of the HEP in 2010, which reflects a more balanced development of the HEP. In addition, a large proportion of the HEP has shifted from economically developed areas to densely populated areas.

1. Introduction

With globalization and the progress in science and technology, high-tech has become the main driving force of the global economic growth and the focus of global trade disputes. Research shows that the higher-education population (HEP) is an important factor determining the level of science and technology in a region [1, 2]. Since the implementation of China’s reform and opening-up policies, China’s economy has been growing at a high speed, and the level of education has been rising rapidly. In particular, the growth of the HEP is considerably faster than that of the total population. However, China’s economic growth is extremely unbalanced in different regions [3]. Therefore, a scientific understanding of the spatiotemporal evolution pattern and the driving factors of the HEP is of great significance in promoting both the economic development and the level of science and technology in a region [4, 5].

Previous studies have examined various driving factors of the HEP, including natural conditions, economic level, traffic conditions, and educational services [6–8]. China publishes provincial-level national population sampling data annually and conducts a detailed national census every decade [9]. To date, six national censuses have been conducted. Based on these data, researchers have studied the distribution characteristics of the HEP in China in terms of quantity, regional imbalance, driving forces, etc. [10]. Hu et al. discussed the impact of social and economic transformation on the spatial distribution of the HEP in China, and their results show that intraregional spatial transformation affects the distribution of the population [4]. Qin analyzed the spatial structure and the evolution of education quality in China from 1982 to 2005 and concluded that the relation between the HEP and economic development differs among regions; for example, the education quality of the population is higher in northeastern China and lower in the southwest, and it has improved greatly, though to different degrees, in different provinces [11]. Chen & Lei, based on the data of the four censuses from 1982 to 2010, concluded that the regional differences in the size of the HEP among provincial administrative units in China show a trend of “expansion-reduction” [12]. Dong & Liu studied the spatial agglomeration and influencing factors of the HEP in the Beijing-Tianjin-Hebei region, concluding that compulsory education resources and the economic agglomeration level were the main factors affecting the HEP [13].

However, the previous research has three deficiencies. Firstly, the unit of analysis is coarse. Due to data limitations, most studies have used provincial-level data to examine the distribution of the HEP [14, 15]. Nevertheless, China’s provincial administrative units are very large [16]. For instance, 8 provinces had a population of over 50 million at the end of 2017. Hence, data aggregated at the provincial level cannot reflect the detailed distribution of the HEP. Secondly, though scholars have studied the spatial and temporal distribution of the HEP [17–19], comprehensive research from an integrated spatiotemporal perspective is seldom conducted, even though the HEP is characterized precisely by “space-time correlation”. Thirdly, most studies focus on qualitative analysis and lack quantitative descriptions of the influencing factors of the HEP, so they may omit some significant driving forces of the distribution and trend of the HEP [20].

Previous studies have indicated that the HEP interacts with various spatiotemporal factors [8, 21–23]. Considering previous studies and data availability, nine driving factors expressing natural and socioeconomic conditions are constructed: average slope, average elevation, city location, city size, high-speed railways, highways, gross domestic product density, nonagricultural population, and population density. This study is divided into three parts. In the first part, we use the provincial sample data to study the spatiotemporal evolution pattern of the HEP through the annual centroids of the HEP from 2000 to 2015 in China. The second part applies the geodetector model to analyze the distribution pattern and driving forces of the HEP, with prefecture-level cities as units of analysis, based on China’s population census data of 2000 and 2010. The third part summarizes the driving factors, research prospects, and policy recommendations for the HEP.

2. Data and Methodology

2.1. Data

The panel data of the population sampling of provincial units from 2000 to 2015 and the national census data of 2000 and 2010 were collected (China conducts a national population census every 10 years). According to the statistical calibre of China, education above the junior college level is regarded as higher education. This paper therefore defines the population with education above the junior college level as the higher-education population (HEP). The HEP distribution is influenced by many factors. Based on previous studies [14, 24], 10 types of socioeconomic and natural environment-related data were collected, including the total population, the GDP, topographic data, administrative division data, city data, and railway and highway data (Table 1).

Based on the multisource data, 9 factors reflecting the natural conditions, economic development, and traffic conditions were constructed, as shown in Table 2.

The municipal scale is the basic unit of national economic management, considered by researchers as an appropriate unit for population research [25, 26]. Since China’s administrative divisions had undergone considerable readjustment before 2000, the data of the fifth and sixth national census conducted in 2000 and 2010 are comparable in terms of the statistical calibre [25, 27]. Through standardization, 344 cities were selected as the analysis units. Considering different formats and accuracy of various data, GIS technology is used to process different data. Overlaying, zonal statistics, area-weighting method, and other methods are applied to the integration of various data into municipal units.

2.1.1. Processing of Gridded Data

Topographic data and GDP data are raster data with different resolutions. Therefore, the spatial statistical method was used to process the data. For example, the GDP dataset was aggregated with the city boundary data by GIS software. The slope of a city was calculated as the average value of the grids overlaid by the city boundary [28]:

$$\bar{v}_{ij} = \frac{1}{n}\sum_{k=1}^{n} v_{jk} \tag{1}$$

where i denotes the city, j is a raster dataset, k is a grid in j, n is the number of grids in j covered by the boundary of i, $v_{jk}$ is the value of grid k in j, and $\bar{v}_{ij}$ is the resulting average of j for city i.
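The zonal averaging described above can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in this study; it assumes that the raster and the city boundaries have already been rasterized onto the same grid, and the names `zonal_mean`, `slope`, and `city_id`, as well as the sample values, are hypothetical.

```python
import numpy as np

def zonal_mean(values, zones):
    """Average the raster cells that fall inside each zone (city).

    values: 2-D array of cell values (e.g. slope); NaN marks no-data cells.
    zones:  2-D integer array of the same shape giving the city id of each cell.
    Returns a dict mapping city id -> mean of its valid cells.
    """
    values = np.asarray(values, dtype=float)
    zones = np.asarray(zones)
    valid = ~np.isnan(values)
    result = {}
    for zone_id in np.unique(zones[valid]):
        mask = valid & (zones == zone_id)
        result[int(zone_id)] = float(values[mask].mean())
    return result

# Hypothetical 2 x 3 slope raster covering two cities (ids 1 and 2).
slope = np.array([[1.0, 3.0, 10.0],
                  [2.0, 2.0, 14.0]])
city_id = np.array([[1, 1, 2],
                    [1, 1, 2]])

city_slope = zonal_mean(slope, city_id)
print(city_slope)  # {1: 2.0, 2: 12.0}
```

In practice the two layers would first need to be resampled to a common resolution and projection, which is the step handled by the GIS software.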

2.1.2. Processing of Vector Data

Containing GIS layers of points, lines, and polygons, vector data were united with city boundary layers. We assigned city names to all the objects of the layers before summarizing the indicators of each city with the dissolve operation.

All the above operations were conducted in ArcGIS 10.3. Figure 1 shows an example of the data processing.

2.2. Methodology

2.2.1. Spatiotemporal Pattern of HEP Centroids on Provincial Scale

Centroid variation is an effective method for studying the evolution of massive spatiotemporal data [29]. The centroid coordinates of the HEP were calculated for each year from 2000 to 2015. Coordinates $X_a$ and $Y_a$ of the HEP centroid in a given year were calculated using equation (2), in which provincial-level administrative regions were used as basic units and the HEP as the weight:

$$X_a = \frac{\sum_{i=1}^{m} P_{ai}\, x_i}{\sum_{i=1}^{m} P_{ai}}, \qquad Y_a = \frac{\sum_{i=1}^{m} P_{ai}\, y_i}{\sum_{i=1}^{m} P_{ai}} \tag{2}$$

where a denotes the year, m represents the number of provincial-level administrative regions, $P_{ai}$ refers to the size of the HEP of provincial-level administrative region i in year a, and $x_i$ and $y_i$ indicate coordinates x and y of the geometric center of provincial-level administrative region i.

The method for calculating the centroid of the total population each year is similar to equation (2).
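As a minimal sketch of the weighted centroid in equation (2), the following example computes the centroid of three regions. The function name `weighted_centroid` and the coordinates and HEP counts are hypothetical and serve only to illustrate the arithmetic; projected coordinates are assumed.

```python
import numpy as np

def weighted_centroid(x, y, weight):
    """Population-weighted centroid of a set of regions.

    x, y:   coordinates of the geometric centers of the regions.
    weight: HEP (or total population) of each region in one year.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, weight))
    total = w.sum()
    if total <= 0:
        raise ValueError("the weights must sum to a positive value")
    return float((w * x).sum() / total), float((w * y).sum() / total)

# Hypothetical geometric centers and HEP counts of three regions.
x = [0.0, 10.0, 10.0]
y = [0.0, 0.0, 10.0]
hep = [1.0, 1.0, 2.0]

cx, cy = weighted_centroid(x, y, hep)
print(cx, cy)  # 7.5 5.0
```

Replacing `hep` with the total population of each region gives the centroid of the total population for the same year, so the two series can be compared directly.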

2.2.2. Exploration of Spatial Correlation between Factors and HEP

The HEP distribution is related to factors such as the social economy, the natural environment, and the infrastructure. In this study, the geographic detector model is applied to the calculation of the spatial correlation to quantitatively evaluate the influences of the driving factors on the HEP, which is based on our former research result (Liu et al.) [30].

The geographical detector model was developed based on the theory of geographical spatial differentiation [30]. This tool is widely used in spatial analysis, and it is valuable for identifying the association between a dependent variable Y and independent variables X according to the consistency of their spatial distributions [31]. The tool includes a factor detector and an interaction detector, both of which were used in the present study. For the analysis, continuous factor data were discretized into 10 levels using the natural breaks classification method, based on the results of previous studies and experiments [31, 32].

(1) Factor detector: the association between the HEP and each variable is determined using the power of determinant (PD), as shown in the following equation:

$$PD = 1 - \frac{1}{N\sigma^{2}}\sum_{h=1}^{L} N_h \sigma_h^{2} \tag{3}$$

where PD represents accountability, which is the degree to which an influential factor accounts for the spatial distribution of the density or the ratio of the HEP; h = 1, …, L indexes the strata, and L refers to the number of strata for a variable factor; $N_h$ and N, respectively, represent the number of units in stratum h and in all the strata; and $\sigma_h^{2}$ and $\sigma^{2}$, respectively, denote the variance of the HEP index in stratum h and in all the strata. The value range of PD is [0, 1], with larger PD values indicating a greater correlation between the spatial distribution of the factor and the index of the HEP.

(2) Interaction detector: the interaction detector reveals whether the factors X1 and X2 (and more X) have an interactive influence on a target Y. GIS software was used to stack the X1 and X2 geographical layers and obtain a new geographical layer E (Figure 2).

By comparing the values of PD between X1, X2, and layer E, we are able to determine the influence of the interaction, and the interaction relationship is determined by the location of PD (X1∩X2) in the 5 intervals (Table 3) [30].
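The two detectors can be sketched as follows. This is a minimal illustration rather than the software used in this study. Because Table 3 is not reproduced in the text, the five interval rules in `interaction_type` follow the commonly published geodetector formulation and should be read as an assumption; all function names and sample values are hypothetical.

```python
import numpy as np

def power_of_determinant(y, strata):
    """Factor detector: PD = 1 - sum(N_h * var_h) / (N * var)."""
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    total_ss = y.size * y.var()  # N * sigma^2, population variance
    if total_ss == 0:
        raise ValueError("y has zero variance")
    within_ss = 0.0
    for h in np.unique(strata):
        y_h = y[strata == h]
        within_ss += y_h.size * y_h.var()  # N_h * sigma_h^2
    return 1.0 - within_ss / total_ss

def overlay(strata_1, strata_2):
    """Layer E: every distinct pair of classes becomes one new stratum."""
    labels = {}
    pairs = zip(list(strata_1), list(strata_2))
    return np.array([labels.setdefault(p, len(labels)) for p in pairs])

def interaction_type(pd1, pd2, pd12, tol=1e-9):
    """Classify PD(X1 ∩ X2) into five intervals (assumed standard rules)."""
    low, high, total = min(pd1, pd2), max(pd1, pd2), pd1 + pd2
    if abs(pd12 - total) <= tol:
        return "independent"
    if pd12 > total:
        return "nonlinear enhance"
    if pd12 > high:
        return "bi-factor enhance"
    if pd12 >= low:
        return "uni-factor weaken"
    return "nonlinear weaken"

# Hypothetical data: neither factor explains y alone, but their overlay does.
y = [1, 1, 5, 5, 5, 5, 1, 1]
x1 = [0, 0, 0, 0, 1, 1, 1, 1]
x2 = [0, 0, 1, 1, 0, 0, 1, 1]

pd1 = power_of_determinant(y, x1)
pd2 = power_of_determinant(y, x2)
pd12 = power_of_determinant(y, overlay(x1, x2))
print(pd1, pd2, pd12, interaction_type(pd1, pd2, pd12))
```

In this constructed example each factor alone has a PD of 0 while the overlay has a PD of 1, so the pair is classified as nonlinearly enhanced, the category reported for most factor pairs in this study.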

3. Results and Discussion

3.1. Spatiotemporal Evolution of HEP

Figure 3 shows the changes of the HEP density in all provinces from 2000 to 2015. Figure 4 displays the changes of the total population density in all provinces in the same period.

Figure 3 shows that the HEP has increased in all provinces, despite the markedly different growth rate. Overall, the HEP shows a trend of concentration in southeastern China, with the ten provinces along the eastern coast increasing remarkably by more than 25%. The seven central provinces and Ningxia adjacent to them increase by more than 10%. The remaining provinces are mainly located in the northwest, with an increase of 1–10%. In contrast, although the total population of most provinces has also increased, that of Sichuan, Chongqing, and Hubei provinces has decreased. It is found that the HEP in these three provinces is not affected by population reduction, indicating that the population reduction in these three provinces is mainly due to the non-higher-education population. The comparison between Figures 3 and 4 shows that the growth rate of the HEP is significantly lower than that of the total population in Yunnan, while in central provinces such as Henan, Hubei, Hunan, Chongqing, and Liaoning and Jilin in northeast China, the growth rate of the HEP is significantly higher than that of the total provincial population. It reveals that the inflow of the HEP or the outflow of the non-higher-education population in these provinces is higher than that in other provinces.

Figure 5 maps the centroids of the HEP and the total population over the years, revealing their unbalanced evolution. The results (Figure 5) are consistent with those in Figures 3 and 4. Figure 5 displays three prominent features. (1) The locations of the HEP centroids differ considerably from those of the total population, indicating the regional imbalance of the HEP. Meanwhile, the centroids of the HEP and the total population gradually draw near, indicating that the imbalance is shrinking. (2) The centroids of the HEP move more drastically (points on the map are more dispersed), indicating the greater mobility of the HEP. (3) The centroids of the HEP generally move from the northeast to the southwest, whereas those of the total population move from the northwest to the southeast, which is related to the economic development models of different regions and the demands of the HEP. Interestingly, the centroid of 2011 was an outlier. We checked the data and found that this is because the growth rate of the HEP in Guangxi, Guizhou, and Southwest Yunnan increased by nearly 20% in 2011 compared with the previous year, whereas the growth rate in 2012 was relatively low across China. The reason may be related to local talent attraction policies.

3.2. Spatial Correlation of Factors and HEP

We also calculated the ratio of the HEP in each city in 2000 and 2010 (Figure 6). As can be seen in Figure 6, the spatial pattern of the HEP did not change much between the two years. The HEP ratio has two hotspots: northern and northeastern China. Although the economic development level of these two areas is not the highest and their population density is not large, their HEP ratio is very high. The main reason is that the urbanization ratio of these two areas is high [32, 34]. The difference in the HEP ratio between the provincial capitals and other cities is conspicuous. Statistics show that the average HEP ratios of the provincial capitals and other cities are 8.63% and 2.72% in 2000 and 18.48% and 6.85% in 2010, respectively. In addition, the global Moran’s I of the total population and of the HEP in 2000 and 2010 was calculated [35]. The indexes of the total population are 0.36 and 0.31, and those of the HEP are 0.10 and 0.13, respectively. Moran’s I shows that both the total population and the HEP are spatially aggregated and that the aggregation degree of the HEP is lower than that of the total population. In terms of time variation, the aggregation degree of the HEP increased slightly in 2010, while that of the total population decreased.
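For reference, the global Moran’s I reported above can be computed as in the following minimal sketch. The spatial weight matrix used in this study is not specified in the text, so the binary adjacency matrix below, the function name `global_morans_i`, and the sample values are all assumptions made for illustration.

```python
import numpy as np

def global_morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weight matrix.

    values:  1-D array with one observation per spatial unit.
    weights: square matrix, weights[i, j] > 0 if units i and j are neighbors.
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                      # deviations from the mean
    w = np.asarray(weights, dtype=float)
    n = z.size
    return float((n / w.sum()) * (z @ w @ z) / (z @ z))

# Hypothetical example: four units in a row, each adjacent to its neighbors.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

clustered = global_morans_i([1, 2, 3, 4], w)    # similar values are adjacent
alternating = global_morans_i([1, 4, 1, 4], w)  # dissimilar values are adjacent
print(clustered, alternating)
```

Positive values indicate that similar values cluster in space and negative values indicate alternation, which is why the indexes of 0.10 to 0.36 reported here are read as weak to moderate spatial aggregation.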

The geographic detector model was used to calculate the power of determinant (PD) of each factor on the HEP in 2000 and 2010. The results are shown in Tables 4–6. Table 4 displays the results of the single factor detector on the HEP in 2000 and 2010. Tables 5 and 6 present the results of the interaction detection in 2000 and 2010, respectively.

Table 4 shows that the largest driving forces on the HEP in 2000 and 2010 are nonagricultural population and city location, both of which are close to or over 0.5, followed by city size, highway, and GDP, which are all between 0.2 and 0.3. Population density, high-speed railway, slope, and elevation have the smallest impact, close to or below 0.2. Nonagricultural population is closely related to the level of urbanization, and the results show that the level of urbanization directly affects the HEP. Figure 6 shows that the highest HEP ratio appears in cities in the north and northeast of China, consistent with the distribution of the urbanization level [32]. However, in 2010, the PD between these two declined by 10%, indicating a larger HEP shift toward regions with lower urbanization. The PD of city location reaches 0.5, which indicates that the HEP tends to concentrate in central cities, particularly provincial capitals (Figure 6). For example, statistics show that the proportion of the HEP in Beijing was the highest of all the cities in 2000 and 2010, exceeding 10% and 30%, respectively. Highway has a greater PD than high-speed railway; moreover, the comparison of the data for 2010 and 2000 shows that the PD of high-speed railway increased only slowly. A possible reason is that the construction of high-speed railways had started only about 10 years earlier, so its influence has not yet been fully reflected.

Tables 5 and 6 present the results of the interaction detection of all the factors. Twenty-one and twenty-three pairs of factors were nonlinearly enhanced in 2000 and 2010, respectively, showing that the spatial pattern of the HEP is the result of the combined influence of many factors, and more so in 2010. In 2000, the interaction PD of nonagricultural population and GDP was the highest, while in 2010, that of nonagricultural population and population density (TPD) ranked at the top, which is related to the “quantity to quality” policy for economic development in China. Despite the low PDs of natural conditions (slope and elevation), the PDs of their interactions with other factors were considerably enhanced, indicating that natural conditions are a basic factor. Other studies have drawn similar conclusions: natural conditions cannot determine the level of economic development, but good natural conditions are necessary for rapid economic development.

3.3. Discussion

The effect of the HEP on economic development is much greater than its proportion in the total population would suggest. There are obvious spatial differences in the proportion of the HEP between regions. The GDP factor shows the impact of economic development on the HEP. In 2000 and 2010, the PD of GDP on the HEP was only about 0.2; however, the PD increases significantly when GDP interacts with other factors, especially slope and elevation, which represent natural conditions. This shows that the economic level is an important factor affecting the HEP among regions with little difference in natural conditions.

In this paper, the natural environment and human activities are integrated, and the spatial analysis methods of GIS are used to quantify the influence of various factors on the spatial distribution of the HEP. The method and conclusions of this study differ considerably from those of previous quantitative analyses and offer a new perspective in the fields of demography and geography.

The method of spatial statistics is employed to analyze the spatiotemporal evolution pattern of the HEP in 2000 and 2010 and quantitatively evaluate the relationship between the related factors and the HEP. Although the statistical data of different years have some differences in the sampling proportion, the analysis results can well reflect the actual situation of the HEP because of the effective quality control of the sampling data. In addition, the date of data acquisition is also a problem to be considered. For instance, the relationship between the HEP and the GDP is complex. It could be the case that the development of the GDP precedes the aggregation of the HEP or that the latter fuels the former. Therefore, follow-up studies should consider similar effects.

Although the results show a strong statistical correlation between the HEP and the factors, the interaction between the HEP and factors such as natural conditions and the social economy is complex, so the logical connection between them may not be direct. It needs to be assessed comprehensively, taking into account other factors such as serious air pollution and the restrictions of the household registration system.

During the research period, China experienced rapid development of social economy and higher education, including the expansion of enrollment and the merger of colleges and universities since 1999 and the rapid growth of China’s Internet industry. These events have a significant impact on HEP. In this paper, nine factors such as natural conditions, economic development, traffic conditions, and others are adopted to analyze their influences on the HEP. However, some other important factors are not included. For example, in recent years, the fog and haze in northern China and the national macro policies such as the Belt and Road Initiative, the construction of Hainan Free Trade Zone, the construction of high-speed railways, the household registration policy, and other policies will definitely affect the choice of employment locations of the HEP and therefore need to be considered in subsequent research.

4. Conclusions

Based on the annual population sampling statistics from 2000 to 2015 and the population census data of China in 2000 and 2010 at municipal level, this paper systematically analyzed the spatiotemporal evolution pattern of the HEP and the effects of the influencing factors on the spatial distribution of the HEP. Results show that the yearly centroids can effectively reveal the temporal and spatial evolution of the HEP, and the geographic detector model can quantitatively evaluate the impact of various factors on the HEP.

From 2000 to 2015, the centroids of the HEP in China are markedly different from those of the total population, with the centroids of the HEP mainly in the northeast. However, from 2000 to 2015, the centroids of the HEP have started to shift from the northeast to the southwest. Although this direction is inconsistent with the shifting direction of the centroids of the total population, the distance between the two is decreasing, indicating that the large regional gap of the HEP in China has been closing in recent years. The analysis of the influencing factors on the spatial distribution of the HEP shows that the nonagricultural population (NAP) and the city location (CL) are the main factors affecting the distribution of the HEP, with driving forces between 0.494 and 0.627, followed by the city size (CS), highways (HW), and the GDP, with driving forces between 0.199 and 0.302. The interaction of the NAP and the GDP can explain 92.7% of the spatial variation of the HEP in 2000, and that of the NAP and the population density (TPD) can explain 97.6% in 2010, indicating a more balanced proportion of the HEP in 2010.

With the progress of society, the HEP has become one of the main driving forces of economic development, and its choice of places to work and live involves various social aspects such as the environment, transportation, job opportunities, and even housing prices and compulsory education. The main objective of this paper is to provide a quantitative analysis of the factors influencing the HEP. This study can serve as an important reference for regions formulating economic plans or HEP attraction plans.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported and funded by the National Key Research and Development Program of China (grant nos. 2016YFC0401404 and 2017YFB0503005) and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant no. XDA23100301).