Special Issue: Artificial Intelligence for Wireless Communications and Control Networks
Correlation Analysis between Atmospheric Environment and Public Sentiment Based on Multiple Regression Model
In recent years, as public demand for ecological and environmental services has grown, changes in environmental quality, especially atmospheric environmental quality, have drawn great public concern. Public environmental emotions belong to psychological space information. Quantifying the changes in public environmental emotions caused by changes in environmental quality, comprehensively analyzing the atmospheric physical and chemical factors that have a key impact on those emotions, and making public environmental emotions quantifiable and predictable are current difficulties in research on the public environment. Based on the public participation perception method, this paper uses collected public environmental satisfaction data to analyze the relationship between the atmospheric environment and public environmental sentiment and, on that basis, proposes a public environmental sentiment prediction model. Taking the data of one city as an example, a multiple regression model (OLS) was established from atmospheric environmental factors and public environmental satisfaction, and PM2.5, PM10, temperature (TMP), and humidity (HUM) were selected as key factors for Pearson correlation analysis with public environmental satisfaction. The results showed that PM2.5 and PM10 had a strong negative correlation with public environmental satisfaction (-0.82 and -0.67), while TMP and HUM had a weak positive correlation with it (0.3 and 0.19). Therefore, reducing the concentrations of PM2.5 and PM10 in the city should help improve public environmental satisfaction.
1. Introduction
In recent years, with the continuous acceleration of China’s industrialization, resource utilization and productivity have improved significantly. However, the resulting ecological and environmental problems have become increasingly prominent and constantly disturb people’s normal living environment. Long-term exposure to a combination of ecological and environmental problems has affected people’s physical and mental health to varying degrees. As a result, deteriorating environmental conditions have become a pain point for the public. Environmental issues are now among the social issues that the public is most concerned about, and they continually affect people’s psychological feelings about, and behavior toward, the environment [3, 4]. In-depth research on environmental issues must therefore start from how the public perceives the environment that affects it.
Perception is the process by which people respond to external information through feeling, attention, and awareness. It mainly describes the consciousness and feelings formed in the human brain after objective things pass through the sensory organs [5, 6]. Perception is the premise for understanding the external environment, and it is also an important reference that guides individual and social group behavior. When a person is in a given environment, the sensory organs obtain sensory information from that environment, the mind processes it, and the person then forms a specific evaluation of environmental protection management, environmental quality, and environmental behavior [8, 9].
Environmental perception is the conscious feeling and image that people form after their perceptual organs take in information about the ecological environment and the brain processes and analyzes it. Because the perceiving subjects differ, the content of perception also differs. People are exposed to a constantly changing ecological environment for long periods, so their perception of their own environment is both accurate and continually changing. Environmental perception is the basic psychological guide for people’s environmental awareness and behavior [11, 12]. Whether correct environmental behavior can be formed depends on whether the person carrying out that behavior has correct environmental awareness.
Environmental emotional relationship analysis examines the relationship between environmental emotions and the main variables that affect changes in them; it is essentially the same as general relationship analysis. Relational analysis is a statistical method that studies two or more random variables of the same standing (same level) by analyzing the dependencies among them.
At home and abroad, current research on relational analysis is mostly divided into static and dynamic relational analysis. Public environmental emotion is a dominant factor in social emotion, and predicting it remains a difficult problem in the public environment field. Most scholars at home and abroad instead predict factors of air quality itself [16, 17]. For example, some scholars have pointed out that air pollution in Macau often exceeds the levels recommended by the World Health Organization. To help people take preventive measures and avoid the further health risks of exposure to high pollutant levels, they built statistical models based on multiple linear regression (MR) and classification and regression tree (CART) analysis to predict next-day NO2, PM10, PM2.5, and O3 concentrations. Some scholars have used the Weather Research and Forecasting model coupled with Chemistry and the Data Assimilation Research Testbed (WRF-Chem/DART), a chemical weather forecasting and data assimilation system, together with multicomponent data assimilation to study how air quality forecasts in eastern China can be improved. Some scholars have used a simplified Lagrangian particle dispersion modeling system and a Bayesian and multiplicative correction optimization (Bayesian-RAT) method to evaluate predictions of PM10 and PM2.5 mixing ratios, and thereby the prediction of atmospheric particulate matter (PM) concentrations at the regional scale. Other scholars have used Box-Jenkins modeling theory to establish an ARIMA model that fits the changes in PM2.5 concentration in the study area and predicts the PM2.5 concentration over the next four days [18, 19].
Existing forecasts of atmospheric environmental factors are rather narrow: they often use a model to learn the values of a single time series variable, and few studies build public environmental sentiment prediction models using deep learning ideas [20, 21]. Drawing on social perception computing and the participatory perception method, this paper uses public environmental sentiment data collected by the platform to measure the public’s subjective environmental satisfaction. It then establishes a multiple regression model and a correlation analysis model to analyze the relationship between public environmental satisfaction and the atmospheric environment and to compare the influence of various atmospheric environmental factors on public environmental sentiment.
2. Method and Theory
2.1. Data Source
This paper takes Baiyin City as an example to comprehensively analyze the relationship between atmospheric environmental factors and public environmental satisfaction in the city. The analysis results serve as the data and theoretical basis for building a public environmental sentiment prediction model, which can help the government manage environmental problems and improve regional human-land relations. Data were collected with a questionnaire survey. In the early stage, users scored their perception of the public environment directly on a 100-point scale. In total, 40,000 records of “public environmental satisfaction” were collected from thousands of local citizens over a time span of one year. The users who submitted data span different age groups, genders, educational levels, and occupations, so the data are locally representative. The atmospheric environmental data in this study include air pollution indicators and meteorological indicators; they correspond in time to the public environmental satisfaction data and have the same sample size of 40,000 records. The air pollution indicators are AQI, PM2.5, PM10, SO2, NO2, O3, and CO, derived from hourly pollutant concentration data at urban monitoring sites. The meteorological indicators are apparent temperature (FEELST), temperature (TMP), humidity (HUM), and wind speed (WINDSP), all obtained from the China Weather Network.
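The time correspondence between satisfaction scores and hourly pollutant readings can be illustrated with a small sketch. It assumes that each score carries a timestamp and is matched to the reading of the same clock hour; the record layout, the made-up values, and the function name `align_by_hour` are hypothetical and are not taken from the study.

```python
from datetime import datetime


def align_by_hour(scores, hourly_readings):
    """Pair each satisfaction score with the reading for the same clock hour.

    scores: list of (datetime, score) tuples (hypothetical layout).
    hourly_readings: dict mapping an hour-truncated datetime to a dict of
        factor values such as {"PM2.5": ..., "PM10": ...}.
    Scores with no reading for their hour are skipped.
    """
    paired = []
    for ts, score in scores:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        reading = hourly_readings.get(hour)
        if reading is not None:
            paired.append((score, reading))
    return paired


# Made-up values for illustration only.
readings = {
    datetime(2020, 1, 1, 8): {"PM2.5": 35.0, "PM10": 60.0},
    datetime(2020, 1, 1, 9): {"PM2.5": 80.0, "PM10": 120.0},
}
scores = [
    (datetime(2020, 1, 1, 8, 15), 82),
    (datetime(2020, 1, 1, 9, 40), 55),
    (datetime(2020, 1, 1, 11, 5), 70),  # no reading exists for this hour
]
pairs = align_by_hour(scores, readings)
```

Matching on the truncated hour keeps one row per satisfaction record, so the paired data set has the same number of rows as the matched scores.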
2.2. Research Methods
In this study, the multiple linear regression model was used to initially screen the atmospheric environmental factors, and then, the Pearson correlation analysis was used to analyze the correlation between the screening results and public satisfaction.
2.2.1. Reverse Elimination Rule for Multiple Regression
Ordinary least squares (OLS) estimates the parameters of a linear regression by minimizing the sum of squared differences between the observed sample values and the values fitted by the model; the parameter values that minimize this squared error give the best match to the data. Principal component analysis (PCA) has a goal similar to that of reverse elimination in the OLS model: a good model should cover as much information as possible with as few features as possible. However, PCA learns from data in an unsupervised way, so the response variable does not take part in guiding the construction of the principal components. PCA therefore cannot guarantee that the directions it extracts explain the predictor variables well, and it is limited in extracting the main variables. OLS serves as a supervised alternative to PCA. It takes multiple principal components as a new variable set and performs least squares regression on them, so the response variable takes part in adjusting the parameters of the principal components, which removes these drawbacks of the PCA method.
The process of constructing a multiple regression model using OLS is as follows. Suppose there is a regression model with $p$ independent variables, as shown in

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon. \tag{1}$$

Assume that the sample contains $n$ groups of data $(x_{i1}, x_{i2}, \ldots, x_{ip}, y_i)$, $i = 1, 2, \ldots, n$. The least squares method is used to calculate the regression coefficients, recorded as $\beta_j$, and their estimated values are recorded as $\hat{\beta}_j$, $j = 0, 1, \ldots, p$. The sum of squared errors is calculated as shown in

$$Q = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 = \sum_{i=1}^{n} \left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_p x_{ip}\right)^2. \tag{2}$$

The input of the OLS model consists of four parts: the dependent variable, the independent variable, the missing item, and the constant item. For the construction of the multiple regression model, the first two parts are considered first. The first is the dependent variable, which acts as the response variable in the multiple regression; in the regression model assumed above, its input data is an array of length $n$. The second is the independent variable, which acts as the regressor in the regression model and in the OLS model. In the beginning, it is not assumed that the multiple regression model has a constant term, so the assumed regression model is shown in

$$y = \beta_0 x_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon. \tag{3}$$

In the research data, the value of $x_{i0}$ is set to 1 for all $i$, whereby the input of the independent variables becomes $p + 1$ sets of data. In stepwise regression under linear conditions, the data samples are analyzed to determine which combinations of independent variables explain the largest share of the variance in the dependent variable, and those combinations are retained. This study uses the principle of reverse (backward) elimination. First, all variables are put into the model, separately for the two groups (air pollution indicators with public environmental satisfaction, and meteorological indicators with public environmental satisfaction). The independent variable that deviates most from the threshold is then deleted, and the model is checked to see whether it still explains the variance of the dependent variable effectively. The elimination is carried out iteratively until no variable meets the elimination condition, at which point the remaining combination of independent variables has the maximum explanatory power for the dependent variable.
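The reverse elimination procedure described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the elimination criterion is taken to be the p-value of each coefficient, the threshold of 0.05 is a hypothetical value, p-values use a normal approximation that is only reasonable for large samples, and the data are synthetic.

```python
import math

import numpy as np


def ols_fit(X, y):
    """Least-squares fit; returns coefficients, two-sided p-values, and R^2.

    p-values use a normal approximation to the t distribution, which is an
    assumption of this sketch and is reasonable only for large samples.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)           # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    t_stats = beta / np.sqrt(np.diag(cov))
    p_values = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, p_values, r2


def backward_eliminate(X, y, names, threshold=0.05):
    """Drop the least significant regressor until all p-values <= threshold.

    A column of ones (x0 = 1) is prepended as the constant term and is never
    dropped. `threshold` is a hypothetical value chosen for this sketch.
    """
    keep = list(range(X.shape[1]))
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        _, p_values, r2 = ols_fit(design, y)
        p_vars = p_values[1:]                  # skip the constant term
        worst = int(np.argmax(p_vars))
        if p_vars[worst] <= threshold:
            return [names[i] for i in keep], r2
        keep.pop(worst)
    return [], 0.0


# Synthetic data: the response depends on the first two columns only.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))
y = 70.0 - 5.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=1.0, size=n)
selected, r2 = backward_eliminate(X, y, ["x1", "x2", "x3", "x4"])
```

Refitting the model after every removal matters because the p-values of the remaining regressors change once a correlated variable is dropped.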
2.2.2. Pearson Correlation Coefficient
The Pearson correlation coefficient is a linear correlation coefficient that measures the degree of linear correlation between two variables. It is used here because it can reflect the degree of linear correlation between atmospheric environmental factors and public environmental satisfaction. Its value is usually denoted by $r$ or $\rho$ and is also known as the “correlation coefficient.” The Pearson correlation coefficient between two variables is their covariance divided by the product of their standard deviations. For example, if variable one is $X$ and variable two is $Y$, the Pearson correlation coefficient between the two variables is shown in

$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}. \tag{4}$$

The above formula defines the overall correlation coefficient of the two variables: $n$ represents the total number of data samples, $\bar{X}$ and $\bar{Y}$ represent the average values of the two variables, and $\sigma_X$ and $\sigma_Y$ represent the standard deviations of the two variable data samples, respectively. As the formula shows, $r$ can also be regarded as the cosine of the angle between the two vectors formed by the mean-centered values of $X$ and $Y$. The value of $r$ is between -1 and +1. If $r > 0$, the two variables are positively correlated; that is, the greater the value of one variable, the greater the value of the other. If $r < 0$, the two variables are negatively correlated; that is, the larger the value of one variable, the smaller the value of the other. The larger the absolute value of $r$, the stronger the correlation. It should be noted that correlation does not imply a causal relationship. If $r = 0$, there is no linear correlation between the two variables, but other forms of correlation (such as curves) are possible.
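The definition of the Pearson coefficient can be checked numerically. The sketch below computes it as the cosine of the angle between the mean-centered vectors and compares the result with NumPy's `np.corrcoef`; the function name `pearson_r` and the toy data are illustrative assumptions, not values from the study.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson coefficient as the cosine between mean-centered vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(dx @ dy / (np.linalg.norm(dx) * np.linalg.norm(dy)))


x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# A perfectly linear, decreasing relationship.
r_linear = pearson_r(x, -2.0 * x + 12.0)

# A purely quadratic relationship on symmetric x: the variables are clearly
# dependent, yet the linear correlation vanishes.
r_curve = pearson_r(x, x ** 2)

# Cross-check against NumPy's built-in correlation matrix.
r_numpy = float(np.corrcoef(x, -2.0 * x + 12.0)[0, 1])
```

The quadratic case illustrates the caveat in the text: a coefficient of zero rules out a linear relationship only, not a curved one.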
3. Results and Analysis
AQI (Air Quality Index) describes how clean or polluted the air is and the impact of air pollutants on human health. The human comfort index is a common way of expressing human comfort in daily life, and its final evaluation depends on the apparent temperature (FEELST) (Table 1). The apparent temperature depends mainly on three indicators: TMP, HUM, and WINDSP. TMP is the main indicator of how the human body perceives climate temperature, and HUM and WINDSP are auxiliary indicators.
3.1. Screening of Influencing Factors of Public Environmental Emotions
Taking air pollutant indicators and meteorological indicators as independent variables and public environmental satisfaction as the dependent variable, an OLS multiple regression model was constructed, and multiple regression analysis was performed on it. Table 2 shows the OLS model constructed from the air pollution indicators and public environmental satisfaction. The model applies a judgment threshold for reverse elimination and retains two air pollution indicators, PM2.5 and PM10. Table 3 shows the OLS model that contains only PM2.5 and PM10. The experimental results show that the degree of fit is significantly improved. Together, Tables 2 and 3 identify PM2.5 and PM10 as the strong influencing factors among the air pollution indicators.
Table 4 shows the OLS model constructed from the meteorological indicators and public environmental satisfaction. The model also applies a judgment threshold for reverse elimination, and it is determined that the two meteorological indicators TMP and HUM have strong explanatory power for public environmental satisfaction; the remaining indicators are eliminated. Table 5 shows the results of the OLS model that contains only the TMP and HUM indicators. Compared with Table 4, the fitting accuracy of Table 5 is significantly improved. TMP and HUM are thus identified as the strong influencing factors among the meteorological indicators.
3.2. Correlation Analysis of Factors Influencing Public Environmental Sentiment
Based on the analysis results of the OLS model, a Pearson correlation analysis is carried out between the air pollutant indicators, the meteorological indicators, and public environmental satisfaction. The premise of this correlation analysis is that public environmental satisfaction follows a normal distribution. The distribution of public environmental satisfaction is shown in Figure 1, where the abscissa is the public environmental satisfaction score and the ordinate is the number of people who chose a given score. In view of the data distribution in Figure 1, a normality test is carried out on the public environmental satisfaction data, and the test results are shown in Figure 2. Taken together, Figures 1 and 2 show that public environmental satisfaction conforms to a normal distribution, so Pearson correlation analysis can be performed on it.
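A normality check of this kind can be sketched numerically. The text does not name the test behind Figure 2, so the example below uses the Jarque-Bera statistic purely as one illustrative option; the function name, the synthetic sample, and the 5% significance level are assumptions of this sketch.

```python
import numpy as np

# 95th percentile of the chi-square distribution with 2 degrees of freedom.
CHI2_2DF_95 = 5.991


def jarque_bera(sample):
    """Return the Jarque-Bera statistic, skewness, and kurtosis of a sample.

    Under normality the statistic is approximately chi-square distributed
    with 2 degrees of freedom, so large values speak against normality.
    """
    x = np.asarray(sample, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    kurt = np.mean(d ** 4) / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Synthetic, strongly skewed sample: normality should be rejected.
rng = np.random.default_rng(1)
skewed = rng.exponential(size=5000)
jb_skewed, _, _ = jarque_bera(skewed)
rejects_normality = jb_skewed > CHI2_2DF_95

# A small symmetric sample has zero skewness by construction.
_, skew_symmetric, _ = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

With real score data, a statistic below the critical value would mean that this particular test finds no evidence against normality; it does not prove that the data are normal.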
The analysis results of PM2.5 and PM10 are shown in Figures 3 and 4. In Figure 3, the correlation coefficient between PM2.5 and public environmental satisfaction reaches -0.82, a strong negative correlation. PM2.5 harms the human body most directly and can damage the respiratory system. The nose and throat cannot block PM2.5 fine particles, which can enter the body’s bronchi, blood cells, and capillaries and finally spread through the entire blood circulatory system. In addition, PM2.5 can act as a carrier for many harmful substances, such as bacteria, carcinogens, polycyclic aromatic hydrocarbons, and heavy metal particles. A large number of PM2.5 particles entering the lungs can block local lung tissue and affect the ventilation of local bronchial tubes. Because PM2.5 has such a large impact on the public’s health, it also affects the public’s “environmental satisfaction.” When people feel physically uncomfortable because of PM2.5, their satisfaction decreases, which explains the negative correlation.
As shown in Figure 4, the correlation coefficient between PM10 and public environmental satisfaction reaches -0.61, which is also relatively high. Unlike PM2.5, PM10 consists mainly of floating dust, a substance that can remain suspended in the atmosphere for a long time. Because the pollutants stay in the atmosphere for a long time, they are easily transported over long distances, which further expands the polluted area and makes the atmosphere a place where various substances in the environment undergo chemical reactions. Because PM10 particles are relatively large, they are easily deposited; their concentration is highest near the ground and decreases with increasing altitude. Although PM10 can be inhaled directly into the respiratory tract, some of it is blocked by the hairs in the nasal cavity and some can be expelled from the body. Compared with PM2.5, PM10 is less harmful to human health, which is consistent with PM10 having a smaller negative impact on public environmental satisfaction than PM2.5.
The analysis results of TMP and HUM are shown in Figures 5 and 6. The correlation coefficients of temperature (TMP) and humidity (HUM) with public environmental satisfaction are 0.3 and 0.19, respectively. These are weak positive correlations, but they should not be underestimated. From the perspective of TMP, a lower temperature is generally conducive to better environmental emotions and improves the public’s satisfaction with the current environment. At higher temperatures, or when the temperature rises, the public’s environmental emotional state is prone to fluctuations or abnormalities. In summer, the public is often prone to irritability; in serious cases this is called “emotional heatstroke,” a form of emotional disturbance in response to summer weather. Most cases are caused by high TMP and HUM in regions with long sunshine hours. Under these conditions the effect of TMP and HUM on the hypothalamus is significantly enhanced, which affects emotional regulation, so that emotions easily get out of control.
4. Conclusions
(1) This paper takes atmospheric environmental factors as the main object of research on the influencing factors of public environmental emotions and conducts experimental analysis on public environmental satisfaction from the perspectives of air pollution indicators and meteorological indicators. It is concluded that public environmental satisfaction has a negative linear relationship with air pollution indicators and a positive linear relationship with meteorological indicators.
(2) Among the air pollution indicators, PM2.5 and PM10 play a key role in the negative impact on the public’s environmental satisfaction. Among the meteorological indicators, TMP and HUM play a key role in the positive impact on the public’s environmental satisfaction. The public’s environmental satisfaction obeys a normal distribution, indicating that most of the public’s evaluation of the environment is moderate. Public environmental satisfaction has a strong negative correlation with PM2.5 and PM10, with correlation coefficients of -0.82 and -0.67, respectively. Public environmental satisfaction has a weak positive correlation with both TMP and HUM, with correlation coefficients of 0.3 and 0.19, respectively.
Data Availability
The figures and tables used to support the findings of this study are included in the article.
Conflicts of Interest
The author declares that there are no conflicts of interest.
Acknowledgments
This work was supported by the Employment and Entrepreneurship Research Projects in Henan Province “the impact of solution-focused brief therapy on the study habits, employment pressure, and mental health of students in financial difficulties in medical colleges” (JYB2019141).
References
S. Phumsathan, “Environmental value orientation and environmental impact perception of visitors to Khao Yai National Park,” Kasetsart Journal of Social Sciences, vol. 34, no. 3, pp. 534–542, 2013.
G. Corani, “Air quality prediction in Milan: feed-forward neural networks, pruned neural networks and lazy learning,” Ecological Modelling, vol. 185, no. 2-4, pp. 513–529, 2005.