Abstract

To study the classification and evolution of key technologies in the transportation field, data from 36 authoritative SCI journals in the transportation field were collected from the Web of Science Core Collection database for the period 2001 to 2020. Based on the bibliometric method, this study used Python to process and visualize the data, with the bibliometric software VOSviewer assisting the visualization. First, a data preprocessing algorithm was designed to deduplicate the collected data, merge synonyms, and extract key technologies. Then, the paper records that contained terms from the key technology lexicon were selected. Next, the annual number of publications and the distribution of key technologies over time were counted. The least squares method was used to fit the annual proportion of the publications, and the slope k1 of the fitted linear regression equation was used to determine the research interest trend of each key technology. The key technologies were divided into “hot technology,” “cold technology,” and “other technologies,” according to the research heat trend. To further explore the research hotspots, the least squares method was also used to fit the citations of all technologies to obtain the slope k2. The Gaussian mixture model (GMM) algorithm was used to cluster k1 and k2 of each technology. As a result, the 144 technologies were divided into 13 super-key technologies, 60 key technologies, 59 relative key technologies, and 12 lower-key technologies. Then, the evolution of key technologies was analyzed from the two perspectives of weighted evolution and cumulative evolution, and the technology evolution trend in the transportation field over the past 20 years was explored.
Finally, the cooccurrence clustering method was adopted to divide key transportation technologies into five categories: vehicle technology and control, optimization algorithms and simulation techniques, artificial intelligence and big data, Internet of Things and computing, and communication technology. The research results can provide references for different people in the transportation field, including but not limited to researchers, journal editors, and funding agencies.

1. Introduction

The economy and society are in an unprecedented stage of rapid development, and disruptive technologies are emerging. Burgelman et al. define technology as “theoretical and practical knowledge, skills, and artifacts that can be used to develop products and services and their production and delivery systems” [1]. Technologies are the most productive means; they permeate all aspects of social production and are a decisive factor in the transformation of various industries. They have changed the way people travel, from walking and horse-drawn carriages to the first steam cars, planes, high-speed trains, subways, shared travel, and driverless vehicles. In addition, technologies will have a profound impact on the structure and operation of the various transport modes and on the rules of transport. Although it is difficult to accurately judge future transportation development trends, the evolution and maturity of the technologies under research will significantly affect the transportation discipline and industry. In this paper, we refer to technology applied in the transportation field as transportation technology.

In terms of the application of technology, recurrent neural networks have been applied to GPS trajectory mining to reconstruct a complete public transportation network [2], promoting the development of public transportation. Multi-input multioutput (MIMO) communication technology is used to increase communication capacity in rail transit [3]. Cloud computing combined with fog computing is used to mitigate the performance degradation of fog computing during rush hours [4]. Augmented reality (AR) is used to simulate how self-driving vehicles communicate on-screen about road hazards and their intended behavior, in order to study the trust, usability, and experience of self-driving vehicle users [5]. The Internet of Things can be applied to random early detection for vehicle dynamics at signalized intersections to proactively detect incipient congestion and set the best cycle and phases of traffic lights [6]. Deep learning is used for crowdedness prediction [7], short-term prediction of traffic flow [8], pedestrian behavior recognition [9, 10], driving policy for autonomous road vehicles [11], and license plate segmentation and recognition [12–14]. In addition, mobile-edge computing [15], railway traffic conflict control [16], and federated learning [17] have also been applied to transportation.

Bibliometrics is an interdisciplinary subject that combines information science, philosophy, and statistics to study a journal or a specific field [18]. With the help of bibliometric indexes and tools, the characteristics, keywords, and hot topics of journals can be explored [19]. There have been some attempts at topic extraction and research trend analysis using bibliometrics in transportation. Based on the literature data published in IEEE Transactions on Intelligent Transportation Systems from 2000 to 2009, Cobo et al. [20] used coword analysis to detect, visualize, and evaluate ITS concepts and ITS subject areas. Wang et al. [21], based on articles of IEEE Transactions on Intelligent Transportation Systems (2010–2013), studied the productivity and collaboration models of Intelligent Transportation Systems. Tang et al. [22] classified the subject categories of different research fields of core articles in IEEE Transactions on Intelligent Transportation Systems (2010–2013) by coword analysis, including vehicle control technology, modeling, simulation, and image processing. Moral-Muñoz et al. [23] used highly cited literature to study scientific participants who have made significant contributions to the development of Intelligent Transportation Systems. Davarzani et al. [24] used bibliometrics and network analysis tools to identify key researchers, cooperative models, research clusters, and relationships in green ports and maritime logistics. Arunachalam et al. [25] conducted a literature review of big data technology capabilities in the supply chain and established a technology capability maturity model. Tian et al. [26] used network analysis and cluster analysis to identify the trends and characteristics of transportation carbon emissions. Based on selected books or book chapters and 162 studies published in 48 academic journals between 1979 and 2018, Alexandridis et al. [27] surveyed all published research in the area of shipping finance and investment.
Zhou et al. [28] analyzed 704 papers published in Transport from January 2007 to June 2019 and investigated the development of the journal in terms of the current situation and emerging trends. Through bibliometric analysis, Li et al. [29] made a statistical analysis of all journal publications, influential papers, main contributors, and main subjects of bottleneck model research in the past half-century. Meyer [30] systematically and quantitatively reviewed the literature on the decarbonization of road freight transportation with literature coupling and cocitation analysis. Abduljabbar et al. [31] analyzed 328 journal papers from the Scopus database from 2000 to 2020 to explore the changing trends of micromobility research.

At the same time, bibliometric software such as VOSviewer and CiteSpace also helps in tracking research progress and identifying the characteristics of papers. For example, the papers of TR-Part B from 1979 to 2019 were analyzed with VOSviewer [32]. Liu et al. [33] used CiteSpace and VOSviewer for a bibliometric analysis of the research progress and trends in traffic prediction. Based on the data of 1045 documents from Computer-Aided Civil and Infrastructure Engineering from 2000 to 2019, Wang et al. [34] analyzed the characteristics of these documents with VOSviewer and CiteSpace.

Most of the research has focused on specific areas, such as ITS, big data analytics, micromobility, and traffic forecasting, instead of systematically reviewing all technologies that affect transportation development. As many technologies are being applied to transportation, the classification and evolution of transportation technology deserve study. However, few studies analyze which technologies are becoming more or less popular, how many categories they can be grouped into, and how they have evolved over time. Therefore, this paper collected data from 36 authoritative SCI journals in transportation over the last 20 years for bibliometric analysis. It then explored the distribution of key transportation technologies over the past 20 years, the research hotspots over time, the changing trend of research hotspots, and the evolution and classification of key transportation technologies. This paper reveals the classification and evolution trend of the key technologies of transportation and identifies the hotspot technologies in transportation research. It is hoped that this study helps researchers of key transportation technologies better understand the state of development.

2. Dataset and Preprocessing

2.1. Dataset

The data used in this study come from the Web of Science Database, which is the largest, most interdisciplinary, authoritative, and influential comprehensive academic information resource. It contains more than 8,700 academic journals worldwide, covering natural science, social science, biomedicine, engineering technology, arts and humanities, and other fields.

The 36 SCI journals in the Transportation Science and Technology category of the 2019 JCR Report, published by Clarivate on June 29, 2020, were searched for the period 2001 to 2020. Generally speaking, technologies applied in the transportation field are published in these 36 authoritative SCI transportation journals. A total of 56,451 records were retrieved on April 24, 2021, and each was exported as a full record. The fields are shown in Table 1, and the journals retrieved and their abbreviations are shown in Table 2.

2.2. Data Preprocessing

As this paper focused on key technologies, the technical keywords were filtered from the keywords. The data preprocessing algorithm is shown in Figure 1. First, each record was checked for empty author keywords; if they were empty, they were replaced with the additional keywords, which also represent the article well. If both the author keywords and the additional keywords were empty, the record was deleted, leaving 51,457 records after this step. For the retained records, the high-frequency keywords of each year were counted (the top 100 in this article), and the nontechnical keywords, such as “environment,” “resource,” “vehicle,” and other such nouns, were manually deleted. This gave the initial key technology word library. Then, synonyms, such as singular and plural forms and abbreviations, were merged to obtain 144 key technology words, which formed the final key technology word library. Papers with keywords in the library were then selected. The nontechnical keywords were deleted from the selected records, and 15,449 records were finally obtained.
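The preprocessing steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm of Figure 1: the field names `author_keywords` and `additional_keywords`, the synonym map, the two-term lexicon, and the sample records are all hypothetical, and the manual removal of nontechnical keywords is represented only by the lexicon itself.

```python
# Hypothetical synonym map: variant spelling -> canonical technology term.
SYNONYMS = {
    "electric vehicle": "ev",
    "electric vehicles": "ev",
    "ga": "genetic algorithm",
    "genetic algorithms": "genetic algorithm",
}


def choose_keywords(record):
    """Prefer author keywords, fall back to additional keywords, None if both are empty."""
    keywords = record.get("author_keywords") or record.get("additional_keywords")
    if not keywords:
        return None
    # Normalize case and whitespace, then map each variant to its canonical form.
    normalized = [k.strip().lower() for k in keywords]
    return [SYNONYMS.get(k, k) for k in normalized]


def preprocess(records, lexicon):
    """Keep records that contain at least one lexicon term, with only those terms."""
    kept = []
    for record in records:
        keywords = choose_keywords(record)
        if keywords is None:
            continue  # both keyword fields are empty: delete the record
        tech = sorted(set(keywords) & lexicon)  # drop nontechnical keywords
        if tech:
            kept.append({"year": record["year"], "keywords": tech})
    return kept


# Hypothetical sample records and lexicon.
records = [
    {"year": 2019, "author_keywords": ["Electric Vehicles", "Environment"], "additional_keywords": []},
    {"year": 2020, "author_keywords": [], "additional_keywords": ["GA", "vehicle"]},
    {"year": 2020, "author_keywords": [], "additional_keywords": []},
    {"year": 2018, "author_keywords": ["resource"], "additional_keywords": ["EV"]},
]
lexicon = {"ev", "genetic algorithm"}
cleaned = preprocess(records, lexicon)
```

In this sketch the fourth record is dropped because its author keywords are not empty and contain no lexicon term, so the additional keywords are never consulted, which mirrors the fallback rule in the text.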

3. Methodology

Based on bibliometrics, this paper used Python to process the data. VOSviewer and Python were used for the data visualization.

3.1. Constructing Cooccurrence Matrix

It is generally believed that keywords given in the same paper are correlated, and the frequency of cooccurrence can express this correlation. The more often a pair of words appears in the same paper, the closer the relationship between the two words. The cooccurrence matrix was constructed from the frequency with which words appear together in the same paper. As shown in Figure 2, one record represents a document. Figure 2 (left) shows the keywords in each document. Figure 2 (right) shows the frequency of keyword cooccurrence, with the diagonal entries giving the number of times a keyword appears across all the documents.
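A cooccurrence matrix of this kind can be built as in the following minimal sketch. The three sample documents are hypothetical, and the function name `cooccurrence` is chosen only for illustration.

```python
from itertools import combinations


def cooccurrence(documents):
    """Return (terms, matrix) for a list of keyword lists.

    matrix[i][j] counts the documents containing both term i and term j;
    the diagonal matrix[i][i] counts the documents containing term i.
    """
    terms = sorted({kw for doc in documents for kw in doc})
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for doc in documents:
        unique = sorted(set(doc))  # count each keyword once per document
        for term in unique:
            matrix[index[term]][index[term]] += 1
        for a, b in combinations(unique, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1  # keep the matrix symmetric
    return terms, matrix


# Hypothetical documents, each given as its list of technology keywords.
docs = [["5g", "deep learning"], ["5g", "ev"], ["5g", "deep learning", "ev"]]
terms, matrix = cooccurrence(docs)
```

Here "5g" appears in all three documents, so its diagonal entry is 3, while "deep learning" and "ev" share only one document, so their off-diagonal entry is 1.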

3.2. Cooccurrence Clustering

Cooccurrence clustering was used to cluster the key technologies. It is based on the distance relation between the items in the statistical database, and its main idea is to minimize the sum of the weighted Euclidean distances between all individuals within each category in the matrix formed by the database. In this study, the constructed cooccurrence matrix was converted into the .net format that VOSviewer can read.
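The conversion step can be illustrated with the sketch below. It assumes that the .net file is a Pajek-style network file (a `*Vertices` section with 1-based numbered labels, followed by an `*Edges` section with one weighted pair per line); the source does not describe the file layout, so this layout and the helper name `to_pajek` are assumptions.

```python
def to_pajek(terms, matrix):
    """Serialize a symmetric cooccurrence matrix as Pajek-style .net text."""
    lines = [f"*Vertices {len(terms)}"]
    # Vertices are numbered from 1 and labeled with the technology term.
    lines += [f'{i} "{term}"' for i, term in enumerate(terms, start=1)]
    lines.append("*Edges")
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):  # upper triangle only, no self-loops
            if matrix[i][j] > 0:
                lines.append(f"{i + 1} {j + 1} {matrix[i][j]}")
    return "\n".join(lines) + "\n"


# Hypothetical terms and cooccurrence counts.
terms = ["5g", "deep learning", "ev"]
matrix = [[3, 2, 2], [2, 2, 1], [2, 1, 2]]
pajek_text = to_pajek(terms, matrix)
```

The diagonal counts are not written as edges in this sketch, since they describe how often a term occurs rather than a link between two terms.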

3.3. Degree Centrality

The index used to measure the position of a node in a network is called degree centrality. If a node is associated with many others, it is probably at the center of the network. In bibliometrics, the number of points directly connected to the node is usually used to measure the degree of centrality. For example, the more the keyword appears together with other keywords, the more central the keyword is.
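As a minimal sketch, degree centrality can be read directly from a cooccurrence matrix by counting, for each keyword, how many other keywords it cooccurs with at least once. The terms and counts below are hypothetical, and the unnormalized count is used here as one common convention.

```python
def degree_centrality(terms, matrix):
    """Number of distinct other terms that each term cooccurs with."""
    n = len(terms)
    return {
        terms[i]: sum(1 for j in range(n) if j != i and matrix[i][j] > 0)
        for i in range(n)
    }


# Hypothetical cooccurrence matrix (diagonal entries are occurrence counts).
terms = ["optimization", "simulation", "ev", "blockchain"]
matrix = [
    [4, 2, 1, 1],
    [2, 2, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]
degrees = degree_centrality(terms, matrix)
```

In this example "optimization" is linked to all three other terms and is therefore the most central node, while "blockchain" is linked to only one.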

3.4. Annual Proportion

Annual proportion is the frequency of occurrence of a keyword divided by the total frequency of occurrence of all keywords in the current year. The larger the annual proportion, the more popular the research of the keyword in that year, and vice versa.
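The definition above can be sketched as follows. The input layout (pairs of year and keyword list) and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def annual_proportion(records):
    """Map each year to {keyword: share of all keyword occurrences that year}."""
    counts = defaultdict(Counter)
    for year, keywords in records:
        counts[year].update(keywords)
    proportions = {}
    for year, counter in counts.items():
        total = sum(counter.values())  # all keyword occurrences in this year
        proportions[year] = {kw: n / total for kw, n in counter.items()}
    return proportions


# Hypothetical records: (publication year, technology keywords of one paper).
records = [(2019, ["ev", "5g"]), (2019, ["ev"]), (2020, ["5g"])]
proportions = annual_proportion(records)
```

For 2019 the keyword "ev" occurs twice out of three keyword occurrences, so its annual proportion is 2/3.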

3.5. Least Squares Fitting

To obtain each technology’s research heat trend over time, the least squares method was used to fit each technology’s annual proportion over the 20 years. The linear regression equation was as follows:

$$y = kx + b, \tag{1}$$

where $x$ is the year, $y$ is the percentage of the study, $k$ is the slope, and $b$ is the intercept.
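A minimal sketch of the fit uses the closed-form ordinary least squares solution for a straight line. The annual proportions below are synthetic values constructed to lie exactly on a line, so the recovered slope is known in advance; they are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = k*x + b; returns (k, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx            # slope
    b = mean_y - k * mean_x  # intercept
    return k, b


# Synthetic example: a proportion that grows by 0.0005 per year from 0.010.
years = list(range(2001, 2021))
shares = [0.010 + 0.0005 * (year - 2001) for year in years]
k, b = fit_line(years, shares)
```

Centering the years on their mean before summing keeps the arithmetic well conditioned even though the raw year values are large.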

3.6. Weighted and Cumulative Evolutionary Path

A weighted and cumulative evolutionary path method was used to analyze the evolution trend of key technologies from different angles. Formula (2) was used to calculate the weighted average occurrence time of each key technology along the time axis of the study:

$$T_w = \frac{\sum_{i} t_i f_i}{\sum_{i} f_i}, \tag{2}$$

where $T_w$ is the weighted average occurrence time, $t_i$ is the year in which the key technology occurred, and $f_i$ is the frequency with which the key technology occurred in that year.

For the cumulative evolutionary path, the statistical approach was as follows:

$$T_c = t_{\mathrm{first}}, \tag{3}$$

where $T_c$ is the time at which the keyword appears in the cumulative evolution diagram and $t_{\mathrm{first}}$ is the time the keyword first appeared in the study.

For both methods, the frequency of occurrence of each keyword needed to be counted. The statistical formula was as follows:

$$F = \sum_{i} f_i, \tag{4}$$

where $F$ is the total frequency of occurrence of the keyword over all years of the study.
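The three quantities can be sketched as follows, assuming that each technology is summarized by a mapping from year to its frequency in that year. The function names and the sample frequencies are hypothetical.

```python
def weighted_time(freq_by_year):
    """Frequency-weighted average year of occurrence."""
    total = sum(freq_by_year.values())
    return sum(year * freq for year, freq in freq_by_year.items()) / total


def first_year(freq_by_year):
    """Year in which the keyword first appeared (cumulative evolution)."""
    return min(year for year, freq in freq_by_year.items() if freq > 0)


def total_frequency(freq_by_year):
    """Total frequency of the keyword over all years."""
    return sum(freq_by_year.values())


# Hypothetical yearly frequencies of one technology.
freq = {2016: 0, 2017: 1, 2018: 3, 2019: 6, 2020: 10}
t_weighted = weighted_time(freq)
t_first = first_year(freq)
f_total = total_frequency(freq)
```

In this example the technology first appears in 2017, but because most of its papers come later the weighted time is 2019.25, which shows how weighting shifts a fast-growing technology toward recent years.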

3.7. Gaussian Mixture Model (GMM) Algorithm

The GMM is a linear combination of multiple Gaussian distribution functions. Theoretically, a GMM can fit any type of distribution. It is usually used when the data in one set come from several different distributions. Given the random variable X, the GMM can be expressed as follows:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \tag{5}$$

where $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ is the k-th component of the Gaussian mixture model, $\mu_k$ and $\Sigma_k$ are the mean and variance of the k-th Gaussian model, respectively, and $\pi_k$ is the mixing coefficient, namely, the weight. The weights need to meet the following conditions:

$$\sum_{k=1}^{K} \pi_k = 1, \qquad 0 \le \pi_k \le 1. \tag{6}$$
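The sketch below illustrates how a mixture with known parameters assigns a two-dimensional point, such as a pair of slopes, to a cluster: each component's weighted density is computed, and the point goes to the component with the largest posterior share. It does not fit the model; in practice the parameters would be estimated, typically with the expectation-maximization algorithm in a statistics library. The weights, means, variances, and points are hypothetical, and a diagonal covariance is assumed for simplicity.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def component_density(point, means, variances):
    """Diagonal-covariance Gaussian density: product of per-dimension normals."""
    density = 1.0
    for x, mean, var in zip(point, means, variances):
        density *= gaussian_pdf(x, mean, var)
    return density


def responsibilities(point, weights, means, variances):
    """Posterior probability of each component for one point."""
    # The mixing weights must be valid probabilities that sum to one.
    assert abs(sum(weights) - 1.0) < 1e-9 and all(0 <= w <= 1 for w in weights)
    joint = [w * component_density(point, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)  # value of the mixture density at this point
    return [j / total for j in joint]


def assign(point, weights, means, variances):
    """Index of the component with the highest responsibility."""
    resp = responsibilities(point, weights, means, variances)
    return max(range(len(resp)), key=resp.__getitem__)


# Hypothetical three-component mixture over (k1, k2) pairs.
weights = [0.5, 0.3, 0.2]
means = [(0.5, 1.0), (0.0, 0.0), (0.4, -1.0)]
variances = [(0.04, 0.25), (0.01, 0.04), (0.04, 0.25)]
points = [(0.5, 1.2), (0.02, -0.05), (0.45, -0.9)]
labels = [assign(p, weights, means, variances) for p in points]
```

Each of the three sample points lies close to a different component mean, so they receive three different labels.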

4. Data Visualization

4.1. Paper Volume Analysis

A bibliometric analysis of 15,549 preprocessed papers was conducted in this study. To some extent, the change in paper volume represents the research status of a field. Figure 3 shows the number of papers published in all journals every year. Figure 3(a) illustrates the annual volume, and Figure 3(b) represents the cumulative volume of publications. As shown in Figure 3(a), the overall volume of publications increased slowly from 2001 to 2014. However, it increased relatively rapidly from 2014 onwards, and after 2017 in particular it increased almost exponentially. Figure 3(b) shows an exponential rise in the cumulative curve of publications. Since 2014, research on applying key transportation technologies to solve traffic problems has developed rapidly, and more and more scholars have paid attention to it.

In addition, the annual volume for each journal and the volume for each country were also calculated, as shown in Figure 4. Figure 4(b) displays only the top 30 countries. As can be seen, the journal IEEE T Veh Technol has the largest volume (4,951), more than three times that of IEEE T Intell Transp, indicating that it pays more attention to the study of key transportation technologies. In terms of country, the US is the largest contributor (3,477), followed by China (2,903), nearly three times as many as the third largest, Canada, showing that the US and China have made significant contributions to this area.

4.2. Distribution of Key Technology Hotspots over Time

To explore the research focus of key technologies each year, this paper counted the publication volume of each technology by year, as shown in Figure 5, which shows the top 10 technologies. Optimization, vehicle technology, and transportation technology were popular in the past 20 years. Optimization and vehicle technology remained in the top 10 throughout the last 20 years, making them the main technologies applied in transportation.

Figure 6 gives a more direct view of the percentage of each technology studied each year. Figure 6(a) shows the percentage of each technology over time; the area represents the percentage, so a larger area means more papers and more popular research. Figure 6(a) shows only the top 10 technologies, and Figure 6(b) shows the annual percentage change. As shown in Figure 6, the popularity of some technologies, like optimization and vehicle technology, changed little and stayed mostly near its peak. Others, such as CDMA (code division multiple access), decreased year by year.

On the contrary, some technologies, like EV (electric vehicle), were increasing year by year. And some technologies first increased and then decreased, such as OFDM (orthogonal frequency division multiplexing), with significant fluctuation. In addition, some techniques such as GA (genetic algorithm) were fluctuating, but the overall proportion was relatively stable.

4.3. Classification of Key Technology Change Trends

To distinguish trends in technology, this paper defined the technology categories as “hot technology,” “cold technology,” and “other technologies.” A hot technology is one that is becoming more popular, a cold technology is one that is becoming less popular, and the rest are other technologies. This paper used the least squares method to fit the percentage change of each technology over the 20 years and assigned the category according to the slope of the fitted line. The fitting result is shown in Figure 7, where the curve represents the original data and the straight line represents the fitted linear regression equation. Based on the experiment, a technology with a slope k1 greater than 0.00005 was classed as hot, one with a slope less than −0.00005 as cold, and the rest as other technologies. As shown in Figure 8, the 144 technologies were divided into 60 hot technologies, 46 cold technologies, and 38 other technologies.
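The threshold rule can be written as a short function. The thresholds of ±0.00005 come from the text; the function name and the sample slopes are hypothetical.

```python
THRESHOLD = 0.00005  # slope threshold stated in the text


def classify_trend(k1, threshold=THRESHOLD):
    """Label a technology by the slope k1 of its fitted annual proportion."""
    if k1 > threshold:
        return "hot technology"
    if k1 < -threshold:
        return "cold technology"
    return "other technologies"


# Hypothetical slopes for three technologies.
labels = [classify_trend(k1) for k1 in (0.0005, -0.0002, 0.00001)]
```

A slope that falls exactly on either threshold is treated as "other technologies" in this sketch, which matches the strict inequalities in the text.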

To a certain extent, the trend in the number of publications reflects the research enthusiasm for a technology. However, the citation rate of papers is usually used as an important reference index for research hotspots. To further explore the research hotspots, the least squares method was also used to fit the citations of all technologies to obtain the slope k2. The GMM algorithm was then used to cluster the publication volume slope k1 and the citation slope k2 of each technology. The clustering results are shown in Figure 9(a). Because the publication volume and citation rate of some technologies, such as 5G, deep learning, and EV, are far larger than those of the rest, the clustering effect is not very good. Therefore, the technologies with slope k1 greater than 3 and citation slope k2 greater than 21 were defined as super-key technologies. The remaining technologies were grouped into 3 categories, as shown in Figure 9(b), where the red category is defined as key technology, the green category as relative key technology, and the purple category as lower-key technology.
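The two-stage procedure can be sketched as a rule-based split followed by clustering of the remainder. Only the split is shown here; the cut-offs of 3 and 21 come from the text, while the technology names and slope values are hypothetical placeholders.

```python
def split_super_key(slopes, k1_min=3, k2_min=21):
    """Separate super-key technologies from those left for clustering.

    slopes maps a technology name to its (k1, k2) pair.
    """
    super_key = {
        name for name, (k1, k2) in slopes.items() if k1 > k1_min and k2 > k2_min
    }
    remaining = {name: pair for name, pair in slopes.items() if name not in super_key}
    return super_key, remaining


# Hypothetical (k1, k2) pairs.
slopes = {
    "tech_a": (5.2, 40.0),  # both slopes above the cut-offs
    "tech_b": (0.8, 3.0),   # both below
    "tech_c": (4.0, 10.0),  # k1 above, k2 below
}
super_key, remaining = split_super_key(slopes)
```

Both conditions must hold, so "tech_c" stays in the remaining set and would be passed on to the clustering step together with "tech_b".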

By further cluster analysis using the GMM algorithm, the 144 technologies were divided into 13 super-key technologies, 60 key technologies, 59 relative key technologies, and 12 lower-key technologies. All the technologies and their classification are shown in Table 3. Among them, k1 and k2 of super-key technology are much larger than the others. For key technology, k1 is greater than 0.3 and most k2 is greater than 0. Both k1 and k2 of the relative key technology are near 0. It is worth mentioning that k2 of all lower-key technologies is less than 0, even though k1 of some lower-key technologies is larger, which shows the importance of citation rate in research hotspots.

4.4. Analysis of Key Technology Evolution

To further explore the evolution trend of key transportation technologies, this paper analyzed the weighted evolution and the cumulative evolution trends, as shown in Figure 10. Figure 10(a) represents the weighted evolutionary trend, calculated by formulae (2) and (4). Figure 10(b) illustrates the cumulative evolutionary trend, calculated by formulae (3) and (4). Figure 10 shows few key technologies before 2012, although many technologies had already been applied to transportation 20 years ago, which appears contradictory. The reason is that few articles were published before 2012, so weighting shifts the overall research time zone toward later years. Figure 10(a) shows that the weighted time zones for key technologies were mainly concentrated in 2014–2018, and the technologies with high volume were mostly “vehicle technology,” “optimization,” “EV,” and “simulation.” The later the weighted time zone, the later the technology became popular, so late time zones represent the current research hot topics; for example, “digital twin,” “blockchain,” and “federated learning” appeared close to 2020, indicating that these technologies were the latest in transportation research.

The abscissa in Figure 10(b) shows the year the technology first appeared. As can be seen, most of the technologies were already present in the field of transportation 20 years ago. During 2009–2016, few new technologies appeared, and only one new technology emerged in 2012 and 2013. The emergence of new technologies increased over the period 2017–2020. It is worth mentioning that “deep learning,” “5G,” “NOMA (nonorthogonal multiple access),” and “edge computing” appeared in the field of transportation only recently. Still, the volume of papers on these technologies was relatively large, showing that these technologies were both new and popular in transportation research.

4.5. Cooccurrence Clustering Analysis of Key Technology

Based on the above research, the key technologies of transportation were divided into five categories using VOSviewer. As shown in Figure 11, the size of each circle represents the degree centrality. A solid line represents the cooccurrence of two technologies, and a dotted line indicates the same category. The five categories of technologies are as follows:

(1) Vehicle technology and control. It contains vehicle technology, electric vehicle technology, automated driving technology, and vehicle control technology, such as optimal control and adaptive control technology.

(2) Optimization algorithms and simulation techniques. It mainly includes optimization, genetic algorithm, particle swarm optimization, integer programming, and simulation technology.

(3) Artificial intelligence and big data. It mainly includes artificial intelligence, deep learning, neural network, big data, data mining, and data analysis.

(4) Internet of Things and computing. It mainly includes Internet of Things technology, vehicle network technology, privacy protection technology, cloud computing, and edge computing technology.

(5) Communication technology. It mainly includes wireless communication technology, 5G technology, multi-input multioutput technology, LTE technology, and CDMA access technology.

5. Conclusion

This paper made a bibliometric analysis of the papers published by 36 authoritative SCI journals in the field of transportation in the last 20 years. The volume of papers increased slowly from 2001 to 2014, grew rapidly after 2014, and increased almost exponentially after 2017. As for national publication volume, the US was the largest (3,477), followed by China (2,903), nearly three times as many as Canada, the third largest. In the time distribution of research hotspots, some technologies, such as optimization and vehicle technology, were popular for nearly 20 years and were the leading technologies applied in transportation. Some technologies, such as CDMA, became less and less popular, while others, such as EV, became more popular. The research heat of some technologies, such as OFDM, first increased and then decreased, with significant fluctuation. Others, such as GA, fluctuated, but their overall proportion was relatively stable. To distinguish trends in technology, the 144 technologies were classified by the slope of the least squares regression line into 60 hot technologies, 46 cold technologies, and 38 other technologies. To further explore the research hotspots, the least squares method was also used to fit the citations to obtain the slope k2. The GMM algorithm was used to cluster k1 and k2 of each technology. As a result, the 144 technologies were divided into 13 super-key technologies, 60 key technologies, 59 relative key technologies, and 12 lower-key technologies. This paper analyzed the evolution of technology from the perspectives of weighted evolution and cumulative evolution. The latest popular technologies in transportation were “digital twin,” “blockchain,” “federated learning,” and so on, and “deep learning,” “5G,” “NOMA,” and “edge computing” were hot technologies with significant publication volume in recent years.
The key technologies of transportation are divided into five categories by cooccurrence cluster analysis: (1) vehicle technology and control class; (2) optimization algorithm and simulation class; (3) artificial intelligence and big data class; (4) Internet of Things and computing class; and (5) communication technology.

This paper classified and analyzed the key technologies in transportation, which is essential for application research. It should be noted that this study did not reveal the evolutionary nature of key transportation technologies, which is beyond the scope of this paper and could be explored in future research.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this study.

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant no. 2020YFB1600400).