Street greening, an indispensable element of urban green space, plays an important role in beautifying the environment, alleviating the urban heat island effect, and improving residents’ comfort. Vegetation coverage is a common index for measuring street greening, but traditional methods of evaluating it have shortcomings: some vegetation, such as shrubs and green walls, cannot be captured from a two-dimensional, top-down perspective. In this paper, a Sentinel-2 image was used to extract the street fractional vegetation cover (SFVC), and Baidu street view panoramas were used to extract the green view index (GVI). To overcome the limitation of evaluating street vegetation coverage from a single perspective, the two indices were merged to construct a comprehensive street greening evaluation index (CSGEI). The study area is the Longhua District of Shenzhen in southern China. All three indices were divided into five classes using the natural breaks method, based on previous research experience. The results showed that Baidu street view panoramas could effectively identify shrubs and green walls that were missed in the Sentinel-2 image, so the GVI supplements the street vegetation coverage. The SFVC and GVI classes run from L1 to L5, indicating a gradual increase in the percentage of vegetated area. For the SFVC, L1 accounted for 53.68% of the results; after the indices were merged, the proportion in L1 decreased to 31.29%. The multiperspective CSGEI can comprehensively measure the distribution of street greening and guide the planning and management of urban green landscapes.

1. Introduction

A forest is an ecosystem of trees and countless other forms of biodiversity [1]. Rapid urbanization and land-use change near cities have altered forest structure and composition [2], so the urban forest influences the forest near cities. As an essential element of the urban forest, street greening purifies the air, divides traffic routes, prevents fire, and beautifies cities [3]. It is very important for improving citizens’ satisfaction with their living environment and for promoting sustainable urban development [4]. Urban street green spaces serve four functions: beautification, ecological environment, leisure activity, and landscape culture. Through these functions, urban street green spaces directly or indirectly provide all city-related services [5]. They are also closely related to mental health, air pollution, and travel behavior [6–8]. Qualitative and quantitative analyses of city street greening from multiple perspectives are therefore of great significance.

With the development of high-resolution remote sensing technology, images contain rich information on ground objects and complex spatial relationships and are characterized by high dimensionality, high resolution, and large data volumes. Remote sensing has thus become a method for extracting urban street green spaces [9]. A vegetation index is calculated from a linear or nonlinear combination of multispectral data obtained by remote sensors. Different vegetation indices, such as the ratio vegetation index (RVI), difference vegetation index (DVI), and normalized difference vegetation index (NDVI), are obtained from different combinations of measured values in different bands. The vegetation index is a simple and effective tool for quantitative and qualitative evaluation of vegetation cover, vitality, and growth dynamics, among other applications [10], and it plays an important role in vegetation extraction and monitoring [11]. Compared with traditional urban green space measurement methods such as field surveys, questionnaire surveys, and statistical analysis, the vegetation index is highly efficient. It also avoids the high cost of LIDAR data and 3D laser point cloud data [12, 13]. The NDVI is a good indicator of vegetation growth status and vegetation coverage, but it saturates easily and is affected by noise in areas of low vegetation coverage. Fractional vegetation cover (FVC) is generally defined as the percentage of the observation area covered by the vertical projection of vegetation. It is one of the essential indices describing surface vegetation cover [14] and can be quantified by combining the NDVI with the pixel dichotomy model. FVC plays a vital role in monitoring vegetation and ecosystem change [15].
High-resolution remote sensing images and other ground observation data have been applied to the extraction of large-scale urban green spaces, such as urban forests and parks. Although image resolution is constantly improving, the extraction accuracy for individual plants on both sides of a street is still low, and green walls, shrubs, and lawns hidden under the tree canopy cannot be observed.

The concept of the green view index (GVI) originated in Japan as a physical quantity for measuring the level of urban greening. It became one of the conventional greening evaluation indices identified by the Japanese government in 2004 [16]. Unlike vegetation cover, the GVI measures greenery from the pedestrian’s perspective and has been widely used in fields such as urban traffic, socioeconomics, and residents’ health [17–20]. With the development of the Internet, street view images have become publicly available. Street view images from Google, Baidu, and Tencent [21–26] have been used to study spatial changes in urban streets. Because they offer wide coverage and directly reflect urban facade information, they are an important data source for extracting the GVI [27]. The efficiency and accuracy of the GVI extracted from street view images in measuring street greenery has led more and more scholars to study it [28]. Traditional GVI extraction methods include manual overlay, HSL color space, and interband calculations. Semantic segmentation assigns a semantic label (e.g., car or person) to each pixel of an image [29]. As semantic segmentation has achieved good results in many fields, GVI extraction has shifted from traditional methods [30, 31] to semantic segmentation [32]. Chen et al. [33] studied the GVI together with vegetation coverage extracted from remote sensing images, realizing multidimensional observation of urban street greening; the NDVI, leaf area index (LAI), and GVI were used to evaluate street greening at the district and block scales. Kumakoshi et al. [34] proposed the standardized green view index (sGVI) and combined it with the NDVI to analyze the distribution of greening. Based on spatial domain interpolation, Cao et al. [35] combined street view images with aerial images for urban land use classification. Yu et al. [36] studied urban street greening and found that the correlation between the GVI and NDVI decreased as the buffer radius increased. However, methods that measure greening from a vertical (top-down) view, such as the vegetation index, describe greening from only one perspective, whereas the GVI measures greening in a horizontal (street-level) view and describes the greenery around each sample point. LIDAR data and 3D laser point cloud data are better suited to measuring three-dimensional greening, but their high cost restricts them to small study areas. Previous studies have considered only the correlation between the GVI and other vegetation indices and have shown that the correlation decreases with distance; they have not yet built a three-dimensional observation model of urban street vegetation that quantitatively describes street greening from multiple views.

Street view images and remote sensing data therefore complement and verify each other in the scope, scale, precision, and dimension of greening quantification. This research proposes a comprehensive street greening evaluation index (CSGEI) to measure street greening from multiple views, so that the impact of urban street green space on the management of forests near cities can be explored.

2. Study Area and Materials

2.1. Study Area

Shenzhen, an important economic and political center in the Guangdong-Hong Kong-Macao Greater Bay Area, is one of the fastest-urbanizing areas in South China. Figure 1 shows the location of the Longhua District of Shenzhen. Shenzhen lies at a low latitude and has a typical subtropical marine climate with abundant rainfall, mild temperatures, and long hours of sunshine, so seasonal change has little effect on green vegetation. Its average altitude is 70 to 120 m above sea level. The annual mean temperature is 22.3°C, the maximum temperature is 36.6°C, and the minimum temperature is 1.4°C. The rainy season lasts from May to September, and the average annual rainfall is 1,924.7 mm. It is therefore a suitable location for analyzing urban street greening with street view images. Longhua District is located in the geographical center of Shenzhen, south of the Tropic of Cancer, and lies on the central axis of Shenzhen’s development. It is a large industrial district with a total area of 175.6 km². In recent years, Longhua District, with the goal of becoming “a modern, international and innovative new city with the central axis,” has continued to improve the quality of its ecological environment and built the first “Talent Greenway” demonstration section in China. With its high-density urban construction space and green space, Longhua District is typical of Chinese cities.

2.2. Materials
2.2.1. Baidu Street View Panoramas

As a kind of data that stores spatial information, street view images emphasize human perception while expressing the local characteristics of the streetscape. In contrast to the top-down observation of remote sensing images, street view images quantitatively measure the effect of street facades [37]. Street view services are electronic maps based on the actual landscape; they provide rich and extensive images containing a wealth of information about city streets. As a result, street view images have become important data for assessing the visual perception of city streets. Street view already covers the streets of most cities in China, and Baidu street view panoramas were selected for this study because of their high coverage. The Baidu Map Street View metadata application programming interface (API) provides access to panoramic images covering street sites at 360° horizontally and 180° vertically, and it is freely accessible online to everyone. To depict the street greenery in the study area, the road network data of Longhua District were downloaded from OpenStreetMap (OSM), and the centerlines of the road network were extracted in ArcGIS. Sampling points were placed at 50 m intervals along the centerlines, and the Baidu Map Street View API was used to download the panorama at each point. A Baidu street view panorama is a 360° surround image generated by stitching together pictures taken by horizontal and vertical cameras, as shown in Figure 2. The direction indicated by the arrow is the forward direction, and the 360° image surrounds a simulated person at the observation point. A total of 7,466 Baidu street view panoramas were downloaded in this study; Figure 3 shows the distribution of their sampling points.
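The 50 m sampling along road centerlines can be illustrated with a minimal sketch. It assumes that each centerline is a polyline whose vertices are already in a projected, metre-based coordinate system; the function name `sample_along_polyline` is hypothetical and is not part of ArcGIS or the Baidu API.

```python
import math


def sample_along_polyline(vertices, interval=50.0):
    """Return points spaced `interval` metres apart along a polyline.

    ASSUMPTION: `vertices` is a list of (x, y) tuples in a projected,
    metre-based coordinate system (OSM data in degrees would need to be
    reprojected first).
    """
    points = [vertices[0]]       # always keep the start of the line
    dist_to_next = interval      # distance left until the next sample
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0                # distance already consumed on this segment
        while seg_len - pos >= dist_to_next:
            pos += dist_to_next
            t = pos / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            dist_to_next = interval
        # carry the unused remainder of this segment over to the next one
        dist_to_next -= seg_len - pos
    return points


if __name__ == "__main__":
    centerline = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0)]
    print(sample_along_polyline(centerline, 50.0))
```

Carrying the remainder across segments keeps the spacing at 50 m along the line itself, not 50 m within each straight segment, which matters on curved roads.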

However, because panoramic images are distorted, a Python program was used to extract the low-distortion part of each panorama that corresponds to the pedestrian’s viewpoint. With this method, the cropping of all 7,466 images was completed in one day. The Baidu street view panoramas provide the longitude, latitude, and date of each image. Because the imagery is updated every 2–3 years, the panoramas used in this study were taken from 2013 to 2019. Images collected in 2017–2019 were used as the primary data, and images from other years were used as supplementary data.

2.2.2. Sentinel Data

Sentinel-2 is the second satellite mission launched in the Copernicus program and offers high resolution, a wide swath width, and a short revisit period. It is well suited to global change monitoring and the analysis of emergent events [38]. Sentinel-2 carries a multispectral imager (MSI) covering 13 spectral bands, from visible through near infrared to shortwave infrared, with ground resolutions of 10 m, 20 m, and 60 m depending on the band. Among optical satellites, Sentinel-2 is the only one with three bands in the red-edge range, which makes it effective for monitoring vegetation. Level-2A Sentinel-2 data were downloaded from the Google Earth Engine (GEE) platform. The Baidu street view panoramas date from 2013 to 2019. To reduce the error when fusing the two types of data, candidate images within this period were restricted to those with less than 5% cloud cover, and the image from October 2018 was used to measure urban street greening from the vertical (top-down) perspective.

3. Methodology

The Sentinel-2 image was used as the horizontal-plane (top-down) data. The fractional vegetation cover was extracted by combining the NDVI with the pixel dichotomy model, and the threshold between vegetation and nonvegetation was determined automatically using the Otsu method. In this way, the oversaturation problem of the NDVI is avoided. The proportion of vegetation pixels within a buffer around each sampling point, whose width was determined by the road class, gave the street fractional vegetation cover (SFVC). The Baidu street view panoramas were used as the vertical-plane (street-level) data, and semantic segmentation with the FCN-8s network was used to extract the GVI. The two indices were each graded into five classes, and the comprehensive street greening evaluation index (CSGEI) was constructed by fusing them. The research route of this paper is shown in Figure 4.

3.1. Extraction of Fractional Vegetation Coverage Based on the Otsu Method

Vegetation reflects strongly in the near-infrared (NIR) band and absorbs strongly in the red band. In this research, the NDVI was calculated from the NIR band and the red (R) band of Sentinel-2 to quantify urban green space. Before the NDVI calculation, the Sentinel-2 data should be preprocessed with steps such as atmospheric correction and radiometric calibration. The NDVI uses the spectral information of ground objects received by remote sensing sensors to reflect the condition of surface vegetation. In this paper, the FVC was estimated quantitatively from the NDVI with the pixel dichotomy model, in which the NDVI value of a pixel is expressed as the information contributed by the green vegetation part and the uncovered (bare soil) part. The formulas for calculating the NDVI and the FVC are shown in equation (1) as follows:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - R}{\mathrm{NIR} + R}, \qquad \mathrm{FVC} = \frac{\mathrm{NDVI} - \mathrm{NDVI}_{soil}}{\mathrm{NDVI}_{veg} - \mathrm{NDVI}_{soil}}, \tag{1}$$

where NIR is the near-infrared band and $R$ is the red band, NDVI is the normalized difference vegetation index, with a value between −1 and 1, $\mathrm{NDVI}_{soil}$ is the NDVI value of the area completely covered by bare soil or no vegetation, $\mathrm{NDVI}_{veg}$ is the NDVI value of the area completely covered by vegetation, and FVC is the fractional vegetation cover.

For most types of land, $\mathrm{NDVI}_{soil}$ represents the theoretical value of the bare soil surface and $\mathrm{NDVI}_{veg}$ is the maximum value of the entire vegetation image. However, these values change with time and space, so $\mathrm{NDVI}_{soil}$ and $\mathrm{NDVI}_{veg}$ cannot be chosen as fixed values. Instead, they are determined by the minimum and maximum values within a given confidence interval. By analyzing the Sentinel-2 NDVI data and considering the actual condition of vegetation cover in the study area, the NDVI value with a cumulative frequency of 5% in the annual maximum synthetic NDVI cumulative frequency table was taken as $\mathrm{NDVI}_{soil}$, and the NDVI value with a cumulative frequency of 95% was taken as $\mathrm{NDVI}_{veg}$.
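The pixel dichotomy calculation can be sketched in a few lines. The sketch below assumes that NDVI is (NIR − R)/(NIR + R), that the bare-soil and full-vegetation NDVI values are the 5% and 95% cumulative-frequency values described above, and that results outside [0, 1] are clipped; the clipping and all function names are assumptions made for illustration, not details taken from the processing chain used in this study.

```python
def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty sorted list."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    rank = (len(sorted_values) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (rank - lo)


def fvc_from_ndvi(ndvi_values, low_q=5.0, high_q=95.0):
    """Pixel dichotomy model: FVC = (NDVI - NDVI_soil) / (NDVI_veg - NDVI_soil).

    NDVI_soil and NDVI_veg are taken at the `low_q` and `high_q` cumulative
    frequencies. ASSUMPTION: values are clipped to [0, 1].
    """
    ordered = sorted(ndvi_values)
    ndvi_soil = percentile(ordered, low_q)
    ndvi_veg = percentile(ordered, high_q)
    span = ndvi_veg - ndvi_soil
    if span == 0:
        return [0.0 for _ in ndvi_values]
    return [min(1.0, max(0.0, (v - ndvi_soil) / span)) for v in ndvi_values]


if __name__ == "__main__":
    pixels = [ndvi(nir, red) for nir, red in [(0.5, 0.1), (0.3, 0.3), (0.2, 0.25)]]
    print(fvc_from_ndvi(pixels))
```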

The Otsu algorithm [39] is an adaptive threshold segmentation algorithm based on the principles of probability and statistics, proposed by the Japanese scholar Nobuyuki Otsu in 1979. The basic idea is to divide the gray values of an image into two classes, background and target, and to select as the optimal threshold the gray value that maximizes the between-class variance. The larger the between-class variance, the smaller the probability of misclassifying the two classes. Because of its good segmentation performance, the algorithm is widely used in image thresholding. Here, the optimal FVC threshold was selected with the Otsu algorithm to extract street vegetation information in the Longhua District.
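The threshold selection described above can be sketched in pure Python as follows. The sketch assumes that the input values (for example, FVC values) lie in a known range and are binned into a 256-bin histogram; the bin count, the range, and the function name `otsu_threshold` are illustrative choices and are not taken from this study.

```python
def otsu_threshold(values, bins=256, lo=0.0, hi=1.0):
    """Return the threshold in [lo, hi] that maximises the between-class variance.

    ASSUMPTION: `values` lie (approximately) in [lo, hi]; values outside the
    range are placed in the first or last bin.
    """
    hist = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        hist[idx] += 1

    total = len(values)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_score, best_t = -1.0, lo
    w0 = 0        # number of values in the background class so far
    sum0 = 0.0    # sum of bin centres in the background class so far
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # w0 * w1 * (mu0 - mu1)^2 is proportional to the between-class variance
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, lo + (i + 1) * width
    return best_t


if __name__ == "__main__":
    sample = [0.1] * 50 + [0.9] * 50
    print(otsu_threshold(sample))
```

For a clearly bimodal input such as the sample above, any threshold between the two modes gives the same between-class variance; this sketch returns the first such threshold it encounters.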

In this paper, circular buffer zones were created with the sampling points as their centers, and the buffer radius depended on the road width of each road class. The maximum width of a single lane and the number of two-way vehicle lanes were taken as the reference for establishing the buffer zone of each sampling point. The maximum width of the vehicle lanes and the radius of the buffer area are shown in Table 1. Because the buffers were created according to the width of the vehicle lanes, 25 m was used as the minimum buffer radius for roads less than 25 m wide. This minimum is 1/2 of the sampling interval: because the scenery in two adjacent Baidu street view panoramas partially overlaps, it ensures that the vegetation between two adjacent sampling points in the Sentinel-2 image is the same as that in the Baidu street view panoramas. Based on the percentage of vegetation in the buffer area, the street fractional vegetation coverage (SFVC) was constructed. The SFVC calculation formula is shown in equation (2):

$$\mathrm{SFVC} = \frac{N_{veg}}{N_{total}} \times 100\%, \tag{2}$$

where SFVC is the street fractional vegetation coverage at the sampling point, $N_{veg}$ is the number of green vegetation image pixels in the buffer zone, and $N_{total}$ is the total number of image pixels in the buffer zone.
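The buffer statistic can be illustrated with a minimal sketch. It assumes a binary vegetation mask on a 10 m grid (1 = vegetation after thresholding), a sampling point given in pixel coordinates, and a pixel counted as inside the buffer when its center lies within the radius; these choices and the function name `sfvc` are assumptions made for illustration, and the per-class radii of Table 1 are passed in as a parameter instead of being reproduced.

```python
def sfvc(veg_mask, center_rc, radius_m, pixel_size_m=10.0, min_radius_m=25.0):
    """Percentage of vegetation pixels inside a circular buffer.

    veg_mask  : 2-D list of 0/1 values (1 = vegetation).
    center_rc : (row, col) of the sampling point in pixel coordinates.
    radius_m  : buffer radius for the road class, in metres.
    ASSUMPTION: a pixel belongs to the buffer when its centre is within the
    radius; radii below `min_radius_m` are raised to that minimum.
    """
    radius_px = max(radius_m, min_radius_m) / pixel_size_m
    r0, c0 = center_rc
    veg = total = 0
    for r, row in enumerate(veg_mask):
        for c, value in enumerate(row):
            if (r - r0) ** 2 + (c - c0) ** 2 <= radius_px ** 2:
                total += 1
                if value:
                    veg += 1
    return 100.0 * veg / total if total else 0.0


if __name__ == "__main__":
    # 7 x 7 mask with vegetation in the three left-hand columns
    mask = [[1 if c < 3 else 0 for c in range(7)] for _ in range(7)]
    print(round(sfvc(mask, (3, 3), radius_m=25.0), 2))
```

With a 10 m pixel and a 25 m radius the buffer contains only about 20 pixels, which illustrates why single street trees are easily missed at this resolution.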

The SFVC was classified into five levels, L1 to L5, using the natural breaks grading method; higher levels indicate a larger vegetated area in the buffer zone. The classification criteria are shown in Table 2.

3.2. Extraction of the GVI Based on FCN-8s

A good GVI attracts pedestrians, and the primary factors affecting urban street greening are the canopy size, the type of trees, the arrangement of street trees, and the arrangement of plants along the pedestrian path. This research used the Baidu street view panoramas to extract vegetation for the GVI calculation. Because street view images have only three bands (red, green, and blue), color alone cannot accurately separate vegetation from artificial green objects, and extracting green vegetation from street view images quickly and accurately remains difficult. Semantic segmentation is an advanced pixel classification method that divides an image pixel by pixel into several classes (e.g., buildings, sky, and greenery) as required, and it can extract numerous elements from the Baidu street view panoramas. Combining the Baidu street view panoramas with semantic segmentation has also been used to describe street green landscapes from a pedestrian perspective. A fully convolutional network (FCN) is based on a convolutional neural network (CNN): it removes the fully connected layers, adds deconvolution layers, and introduces a skip structure to solve the image semantic segmentation problem. The FCN-8s variant was used for semantic segmentation of the Baidu street view panoramas. It is more efficient than patch-based classification and avoids the repeated computation and wasted storage caused by classifying each pixel from its neighborhood. Previous studies have shown that FCN performs well in street view image segmentation [40, 41]. FCN-8s has five groups of convolutional layers. Unlike traditional convolutional layers that apply a single convolution with a large kernel, each group in FCN-8s applies several convolutions with small kernels. The FCN-8s network used in this paper was trained on the ADE20K dataset; the FCN architecture has also performed well on the Pascal Visual Object Classes (VOC) benchmark. The settings of FCN-8s are shown in Table 3 [42]. Applied to the Baidu street view panoramas, the network predicts the semantic class of each pixel in the image.

The Japanese scholar Yoji Aoki proposed that when the GVI is higher than 25%, pedestrians perceive the street greenery favorably, and when it is higher than 50%, they perceive the greenery as excellent. On this basis, Natsuhi Origahara divided the GVI evaluation into five classes, as shown in Table 4 [43]; from L1 to L5, pedestrians perceive increasingly dense green vegetation. In recent years, the GVI has been widely used in many aspects of street greening calculation and evaluation. Because of limitations on the sampling angle and the number of samples, the Baidu street view panoramas were used directly to extract and measure the GVI, so the calculation differs slightly from the conventional one. The calculation of the GVI in this paper is shown in equation (3) as follows:

$$\mathrm{GVI} = \frac{P_{green}}{P_{total}} \times 100\%, \tag{3}$$

where GVI is the green view index of the sampling point, $P_{green}$ is the total number of green vegetation pixels in the Baidu street view panorama, and $P_{total}$ is the total number of image pixels in the panorama.
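Once a panorama has been segmented, the GVI reduces to a pixel ratio. The sketch below assumes that the segmentation output is a 2-D grid of class labels and that the set of labels counted as vegetation is supplied by the caller; the label names used in the example ("tree", "grass", "plant") are placeholders and are not the class list used in this study.

```python
def green_view_index(label_map, vegetation_labels):
    """GVI (%) = vegetation pixels / all pixels * 100 for one segmented image.

    label_map         : 2-D list of per-pixel class labels.
    vegetation_labels : set of labels treated as green vegetation
                        (ASSUMPTION: chosen by the caller).
    """
    total = veg = 0
    for row in label_map:
        for label in row:
            total += 1
            if label in vegetation_labels:
                veg += 1
    return 100.0 * veg / total if total else 0.0


if __name__ == "__main__":
    segmented = [
        ["sky", "tree", "tree", "road"],
        ["building", "grass", "road", "road"],
    ]
    print(green_view_index(segmented, {"tree", "grass", "plant"}))
```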

3.3. Constructing the Comprehensive Street Greening Evaluation Index

This paper obtained the distribution map of street vegetation coverage in the buffer zones from Sentinel-2 data. However, the green vegetation on walls and the lawn under the tree canopy cannot be captured by remote sensing images. The GVI extracted from the Baidu street view panoramas describes the vegetation distribution at the sampling points from the pedestrian’s perspective, and combining it with the vegetation coverage extracted from the Sentinel-2 data compensates for the vegetation missed from a single viewpoint. The relationship between the SFVC and the GVI at the sampling points was analyzed with Pearson’s correlation coefficient, calculated as shown in equation (4):

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{4}$$

where $n$ is the number of sampling points (7,466), $x_i$ is the GVI value at the $i$th sampling point, $y_i$ is the SFVC value at the $i$th sampling point, $\bar{x}$ and $\bar{y}$ are the mean values of the two variables (GVI and SFVC), and $r$ is Pearson’s correlation coefficient between the GVI and the SFVC.
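For reference, Pearson’s correlation coefficient between two paired series can be computed as in the following minimal sketch; the function name `pearson_r` and the sample values are illustrative only and are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    gvi_values = [12.0, 25.5, 40.1, 8.3, 33.0]    # illustrative values only
    sfvc_values = [5.0, 30.2, 35.7, 2.1, 60.4]    # illustrative values only
    print(round(pearson_r(gvi_values, sfvc_values), 3))
```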

When the coefficient $r > 0$, the two indices are positively correlated; when $r < 0$, they are negatively correlated. The larger the absolute value of $r$, the stronger the correlation between the two indices. The correlation coefficient between the GVI and the SFVC was calculated to be 0.52, which indicates a relatively weak correlation. On this basis, the GVI and SFVC levels were fused, and the result was converted to the range [0, 1] with the maximum-minimum normalization formula to construct the CSGEI. The expression is shown in equation (5), whose inputs are the level of the GVI and the level of the SFVC at each sampling point and whose output is the comprehensive street greening evaluation index (CSGEI).

The SFVC and GVI were each divided into five levels, from L1 to L5, and each level takes an integer value from 1 to 5. The higher the level, the better the greening effect. The final CSGEI values range from 0 to 100.
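The maximum-minimum normalization step can be illustrated with a minimal sketch. The sketch assumes, purely for illustration, that the two integer levels are fused by simple addition before normalization; this fusion rule is a hypothetical stand-in and is not necessarily the form of equation (5). Only the normalization to a 0 to 100 range follows the description above.

```python
def csgei_sketch(gvi_level, sfvc_level):
    """Illustrative fusion of two 1-5 levels into a 0-100 score.

    ASSUMPTION: the levels are fused by simple addition. This is a
    hypothetical rule for illustration, not the paper's equation (5).
    """
    for level in (gvi_level, sfvc_level):
        if level not in (1, 2, 3, 4, 5):
            raise ValueError("levels must be integers from 1 to 5")
    fused = gvi_level + sfvc_level      # hypothetical fusion rule
    fused_min, fused_max = 2, 10        # smallest and largest possible sums
    # maximum-minimum normalization, scaled to the 0-100 range
    return 100.0 * (fused - fused_min) / (fused_max - fused_min)


if __name__ == "__main__":
    print(csgei_sketch(4, 2))
```

Under this assumed rule, a street with a high GVI level and a low SFVC level receives the same score as one with the opposite pattern, so the fused value should be read together with the two original levels.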

4. Results

4.1. Analysis of the Distribution and Characteristics of SFVC

The combination of the NDVI and the pixel dichotomy model was used to extract the fractional vegetation coverage of the Longhua District, and the FVC threshold determined by the Otsu method was 0.48. The resulting fractional vegetation coverage is shown in Figure 5. The SFVC of each sampling point was then obtained within the buffer defined for its road class, and the spatial distribution of the SFVC index is displayed in Figure 6. The street fractional vegetation coverage in Longhua District was inadequate: L1 accounted for 53.68% of the SFVC index. The lowest street SFVC was found for Tao Yuan road, whose average SFVC was 0%, although its GVI was 10.91%. Huan Guan Nan road, also among the streets with the lowest SFVC, had a mean SFVC of 4.40%. By contrast, the SFVC was as high as 93.24% for Yang Tai Shan Greenway, which also has a high street GVI. Thus, the SFVC, which is observed from the top-down angle, cannot identify features beneath large vegetation cover, such as roads and houses.

4.2. Analysis of the Distribution and Characteristics of GVI

The FCN-8s network was used for the semantic segmentation of the 7,466 Baidu street view panoramas. The FCN-8s model obtained 81.44% accuracy on the training dataset and 66.83% accuracy on the test dataset. To further verify the accuracy of the semantic segmentation results, 100 images were randomly selected and green vegetation was extracted by three methods: manual recognition, the K-means algorithm, and FCN-8s semantic segmentation. For manual recognition, the green vegetation was extracted with Photoshop’s Magic Wand tool. Two evaluation metrics were used to compare the green vegetation extraction. The mean pixel accuracy (MPA) measures the proportion of correctly labeled pixels among all pixels, as shown in equation (6), and the mean intersection over union (MIoU) measures the accuracy with which vegetation pixels are correctly labeled, as shown in equation (7):

$$\mathrm{MPA} = \frac{1}{n}\sum_{i=1}^{n}\frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i}, \tag{6}$$

$$\mathrm{MIoU} = \frac{1}{n}\sum_{i=1}^{n}\frac{TP_i}{TP_i + FP_i + FN_i}, \tag{7}$$

where $n$ is the number of images, $TP_i$ is the number of true positive (correctly identified vegetation) pixels of image $i$, $TN_i$ is the number of true negative (correctly rejected) pixels of image $i$, $FP_i$ is the number of false positive (wrongly identified vegetation) pixels of image $i$, and $FN_i$ is the number of false negative (wrongly rejected vegetation) pixels of image $i$.

These measures are commonly used to evaluate the accuracy of classification results. Table 5 shows the accuracy of the semantic segmentation results compared with the reference data, and Figure 7 compares the results of semantic segmentation with those of the other methods. For the scattered branches and leaves of trees, there was some misclassification at the edges of the segmentation results. The K-means algorithm produced many misclassifications between buildings and vegetation. The red boxes in the figures mark the locations where the two segmentation results differed. In general, the segmentation method used in this study meets the needs of this experiment and saves the time required for manual recognition.
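The two accuracy measures can be sketched for binary vegetation masks as follows. The sketch assumes one common per-image formulation, in which pixel accuracy is (TP + TN) divided by all pixels and intersection over union is TP / (TP + FP + FN), each averaged over the images; the function names are hypothetical, and an image with no vegetation in either mask is given an IoU of 1 by convention.

```python
def confusion_counts(pred, truth):
    """Count TP, TN, FP, FN for two equally sized binary masks (1 = vegetation)."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def mpa_and_miou(pred_masks, truth_masks):
    """Mean pixel accuracy and mean IoU over a list of non-empty image masks.

    ASSUMPTION: per-image scores are averaged with equal weight.
    """
    pa_sum = iou_sum = 0.0
    for pred, truth in zip(pred_masks, truth_masks):
        tp, tn, fp, fn = confusion_counts(pred, truth)
        pa_sum += (tp + tn) / (tp + tn + fp + fn)
        union = tp + fp + fn
        iou_sum += tp / union if union else 1.0   # no vegetation in either mask
    n = len(pred_masks)
    return pa_sum / n, iou_sum / n


if __name__ == "__main__":
    predicted = [[[1, 1], [0, 0]], [[1, 0], [0, 1]]]
    reference = [[[1, 0], [0, 0]], [[1, 0], [0, 1]]]
    print(mpa_and_miou(predicted, reference))
```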

The analysis of the GVI among the streets in Longhua District showed that the minimum street GVI was 0, while the maximum was as high as 82.66%, so the GVI differed noticeably among streets. (Streets with fewer than 10 sampling points were not analyzed in this paper.) The spatial distribution of the GVI in Longhua District is shown in Figure 8. The street with the highest GVI was Cui’an road, whose average GVI was as high as 50.83%, followed by Yangtaishan Greenway, South Sili road, Lanjing road, and Kesheng road. However, the overall green distribution of the Yangtaishan Greenway was higher than that of Cui’an road; as a cycle path, it provides a good visual experience for cyclists. The lowest GVI was found for Shiqing avenue, with an average street GVI of 5.42%. The street GVI in Longhua District was low in the center and high on the periphery, influenced by the greenways covering the southwestern part of the study area, where the GVI was higher than in the city center.

4.3. Analysis of the Distribution and Characteristics of the CSGEI

The normalized CSGEI enabled the observation and analysis of street greening in Longhua District from multiple perspectives. The normalized CSGEI was divided into five classes, and the result is shown in Figure 9; from L1 to L5, the distribution of street vegetation becomes progressively better. The statistical analysis of the three indices for each street showed that the GVI was higher than the SFVC on Guanle road, Si Li south road, Lanqing first road, Longhua Square second road, Jinlong road, and Feng Guan road, whereas the SFVC was higher than the GVI on Minglang road, Yangtaishan Greenway, Nanping Express, Crushed Stone road separate interchange, and Fulong road. By integrating the two indices, the CSGEI avoided the overgeneralization that arises when street greening is described with a single index.

The integrated index showed that the comprehensive greening of Yangtaishan Greenway, Cui’an road, Lanjing road, Kesheng road, and Minglang road was relatively high, and the difference between the two indices was relatively small on Cui’an road, Lanjing road, and Kesheng road. The greening of Longhua District was low in the center and high on the periphery. Figure 10(a) shows the distribution of the SFVC index for each class of road. For urban secondary roads, urban main roads, and urban feeder roads, which are widely distributed in the inner city, the lower SFVC levels accounted for a higher proportion. There are two reasons: vegetation in the inner city is planted in a scattered pattern, and the resolution of Sentinel-2 is too low to identify a single plant. As shown in Figure 10(b), the GVI levels for urban trunk roads, urban secondary roads, and urban feeder roads were evenly distributed, while L5 accounted for more than 80% of the GVI on the cycle path, in line with the design characteristics of cycle paths. The sampling points were divided into five road categories according to the OSM road network data. As shown in Figure 10(c), the CSGEI was higher for the cycle path, which serves service and entertainment purposes; it provides a good visual experience for pedestrians and helps maintain the ecological environment. In the correlation analysis of the GVI and SFVC index by road class, the lowest correlation coefficient was 0.38 for the cycle path, followed by 0.39 for elevated roads and expressways and 0.64 for urban feeder roads, and the overall correlation was low. A random sampling method was used to evaluate the CSGEI, and the result is shown in Figure 11.

5. Discussion

The multiview data from the horizontal and vertical views provide a new direction for extracting urban street greening. The two viewing directions capture different visual effects in green perception, which helps to realize the quantitative analysis of urban street greening. Because the roads were divided into five classes, three pictures from each of the five classes were randomly selected to verify the CSGEI. Combining the pictures with the data analysis showed that the GVI was higher than the SFVC index on urban secondary roads, urban main roads, and urban feeder roads. Figure 11 shows that the vegetation on urban secondary roads was mainly planted in rows or as single plants on both sides of the street. Because of the limited resolution, part of this vegetation was missed when vegetation coverage was extracted from the Sentinel-2 data. On elevated roads and expressways, the SFVC index was significantly higher than the GVI. The same was observed for the cycle path, where the SFVC index was higher than the GVI by as much as 95.83%. In the Sentinel-2 data, tree crowns shade and hide the ground objects beneath them, so the two-dimensional description overstates the actual greening. The 53.68% of SFVC results at the L1 level (less than 9.02%) were almost all on urban main roads and urban secondary roads. The GVI extracted from the Baidu street view panoramas could effectively identify shrubs and green walls and complement street vegetation coverage. The SFVC in L1 showed a significant increase of approximately 76.67%. After the indices were merged, the proportion in L1 decreased to 31.29%, and the distribution of street greening was in line with the actual situation.

The results were compared with similar studies [26, 36, 44]. As in those studies, a correlation was found between the GVI and the NDVI. Unlike the previous work, this paper used a sample buffer determined by road width as the unit for extracting the NDVI. Sentinel-2 imagery is also much easier to obtain than LiDAR data. Another dimension, the vertical view, was added to the measurement of urban street green space. The CSGEI enables quantitative evaluation of greening in the urban street space. In addition, the results confirmed that semantic segmentation performs well in extracting greenery from street view images.
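As a rough illustration of the two quantities compared above, the sketch below computes a GVI as the share of vegetation pixels in a segmentation mask and then the Pearson correlation between GVI and NDVI values. The pixel-share definition, the label id `VEGETATION`, the function names, and all numbers are assumptions made for illustration; they are not taken from this paper's processing chain.

```python
import math

VEGETATION = 1  # hypothetical label id for the vegetation class


def green_view_index(mask):
    """Percentage of pixels labelled as vegetation in a segmentation mask."""
    total = sum(len(row) for row in mask)
    green = sum(row.count(VEGETATION) for row in mask)
    return 100.0 * green / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy 2 x 4 mask: 3 of the 8 pixels are vegetation.
mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
print(green_view_index(mask))

# Hypothetical paired values: GVI per sample point (percent) and the mean
# NDVI inside the road-width buffer around the same point.
gvi_values = [5.0, 12.0, 20.0, 33.0, 41.0]
ndvi_values = [0.10, 0.18, 0.25, 0.40, 0.52]
print(round(pearson_r(gvi_values, ndvi_values), 3))
```

A coefficient near 1 would indicate that the two views rank sample points similarly, while a lower value points to greening that only one of the views captures.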

However, this paper still has shortcomings. For horizontal street greening extraction, the 10 m resolution of Sentinel-2 cannot resolve young trees. For vertical street greening extraction, there is no guarantee that all street view data were acquired on the same day. In addition, the acquisition time span of the Baidu street view panoramas cannot be guaranteed to match the acquisition time of the Sentinel-2 image. Future studies will evaluate street greening at multiple scales, from point to line to polygon.

6. Conclusions

This study combined Baidu street view panoramas and Sentinel-2 imagery to extract and analyze the three-dimensional greening information of streets. The results showed that the Baidu street view panoramas and Sentinel-2 provide two perspectives on the distribution of street greening, and that combining the two kinds of data is an effective method for evaluating and analyzing street greening. The higher polarization of the SFVC compared with the GVI was mainly due to the low resolution of Sentinel-2 images and the uneven distribution of vegetation on both sides of the street. The spatial distribution of the GVI was low in the middle and high in the periphery because buildings and sidewalks in the central city occupy part of the greenery space. The CSGEI evaluated street greening from two perspectives, reducing the workload of a large-scale questionnaire survey. Combining the panoramas with the Sentinel-2 image compensated for the single-angle observation of vegetation in the Baidu street view panoramas and for the inconsistency of their acquisition times. The results showed that street view images could identify lawns, shrubs, and roads that lie under the tree canopy in remote sensing images. Remote sensing images, in turn, could describe the lateral growth of the trees seen in the Baidu street view panoramas, so that the greening level of the street at the observation point could be evaluated objectively. The CSGEI proposed in this paper merges the street greening evaluation levels from the two perspectives. As the street view images are updated, the latest data can be used to measure the urban green space, enabling monitoring of three-dimensional changes in street greening. The index helps to identify streets that lack greening and to formulate targeted corrective measures. It provides an important reference for street greening planners.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.


Acknowledgments

This work was supported by the Education Department of Liaoning Province Fund (no. lnfw202013), National Natural Science Foundation of China (4210010785), and Humanities and Social Science Foundation of the Ministry of Education (CN) (21YJC790129).