Abstract

Identifying disruptive technologies makes it possible to cultivate and incubate them in advance, and is an important means of accelerating the upgrading of industrial structure, transforming the mode of development, and seizing the commanding heights of future development. After summarizing the major existing methods for identifying disruptive technologies, this paper derives the rule that “disruptive technologies are always at the root node of a certain classification in the deeply classified technology development network”. It then proposes a new algorithm that uses term frequency-inverse document frequency (TF-IDF) and the Subject-Action-Object (SAO) structure of patent text to extract technical features, builds a technology development network from a similarity matrix, and identifies disruptive technologies with a depth classification model. Experiments on patent data show that the technology development network generated by the proposed algorithm fits the patent citation relationship and can therefore effectively show the trajectory of technology development. With this algorithm, we identified technologies that have had disruptive effects in the field, which verifies the effectiveness of the algorithm.

1. Introduction

Technological innovation is the primary driving force of economic development. Developing innovative technology is an important strategic direction for accelerating the upgrading of industrial structure, transforming the mode of development, and seizing the commanding heights of future development. Disruptive technology is an important part of technological innovation. It breaks the original technology lifecycle and builds a new technology track, replacing the mainstream technology and even changing the technology paradigm. The Fourteenth Five-Year Plan for National Economic and Social Development of the People’s Republic of China and the Outline of Vision, Goals for 2035 points out that self-reliance in science and technology should be taken as the strategic support for national development, that the initiative in science and technology development should be seized, that the future industry incubation and acceleration plan should be implemented in frontier science and technology and industrial reform fields, and that several future industries should be planned and laid out. Early identification of disruptive technologies can accelerate the realization of this goal. However, disruptive technologies are extremely complex and uncertain, and they are often difficult to predict before they have a significant impact on the market. Given this, it is of great practical significance to strengthen research on methods for identifying disruptive technologies, establish a reasonable and effective identification model, and carry out early identification, prediction, and early warning of potentially disruptive technologies.

This paper takes a large body of technical texts as its research object and treats the technical texts retrieved for a given field as the scope within which disruptive technologies are identified. Taking the radio frequency identification (RFID) field as an example, RFID-related data retrieved from the Engineering Index are used as the identification object. The data are segmented and extracted by a model to obtain the technology type, application scope, technical description, functional description, and other text for the technology involved in each article. The technological development context of the field is then constructed by calculating the similarity of the text content, and it is cut according to the degree of technological similarity to form technology groups of related technologies. For example, radio wave technology and data storage technology in RFID each form a technology group that is independent of the others; subsequent technologies build on them, so they have a profound impact on later work and become disruptive technologies. Such technologies are usually the main targets of disruptive technology identification. In addition, some technologies, such as that of Liu et al. [1], form a separate technology group in the technological development context. A technology of this kind draws on many earlier technologies while modifying them substantially, and it may become a new disruptive technology if research on it continues; such technologies are the main targets of disruptive technology prediction.

The rest of the paper is organized as follows: In Section 2, we introduce related works and our contribution. In Section 3, we introduce our research framework. In Section 4, we select data to validate and illustrate the relevant parameters of the algorithm. Finally, the conclusion and further work are given.

2.1. Related Works

Focusing on the identification of disruptive technologies, this paper first reviews the basic concepts of disruptive technologies, including the research progress, the relevant research fields, and the perspectives taken. It then sorts out the identification system for disruptive technologies and summarizes the identification algorithms from the subjective and objective aspects.

At present, the concept of disruptive technology is mainly defined along two dimensions: technology and market impact. At the technical level, Nagy et al. defined disruptive technologies, on the basis of the three technical attributes of functionality, technical standards, and ownership, as technologies that provide new functions, discontinuous new technical standards, or new forms of ownership, and that can change market standards and consumer expectations [2]. Kivimaa et al. emphasized the importance of disruptive practices and low-technology solutions, as well as disruptive technologies and policies, from the perspectives of markets and business models, regulations and policies, and participants and networks [3]. Kumaraswamy et al. argued that systemic features should be considered in the effective identification of disruptive technologies [4]. Adner introduced two new constructs, preference overlap and preference symmetry, to characterize the relationship between the preferences of different market segments [5]. Tabbah and Maritz argued that disruptive innovation has a positive impact on the economy, consumers, and society [6]. Müller et al. consulted experts to build a risk model for assessing the disruptive potential of a technology [7]. Danneels argued that the basis of competition can be changed by changing the enterprise’s competitive performance indicators [8].

Subjective judgment was the main direction of early research on disruptive technology. Its main feature is that it depends on the judgment standards of experts and is therefore strongly subjective. Such identification methods mainly include the Delphi method, the technology roadmap method, and the scenario analysis method. For example, Sommarberg and Mäkinen used the wisdom of industry experts to predict potentially disruptive technologies in the industry through visual simulation scoring [9]. Guo et al. built a measurement framework from the perspectives of technical characteristics, market dynamics, and external environment, and invited experts to assess the potential of disruptive technologies [10]. Kim et al. identified early disruptive technologies based on keyword clustering graphs, keyword intensity graphs, and keyword relationship graphs [11]. Benzidia et al. used customer relations and channels to analyze customer satisfaction and thereby judge how disruptive a technology is [12]. Pandit et al. investigated how dynamic capabilities at the enterprise level can promote disruptive technology performance [13]. Rathore et al. established an interpretive structural modelling method with the help of the fuzzy Delphi method to identify obstacles to the use of disruptive technologies [14].

To reduce the high identification costs and the influence of experts’ subjectivity in subjective judgment, some scholars introduced mathematical models for analysis. Cheng et al. introduced the SIRS epidemic model to predict the application fields of disruptive technologies from patent data, and selected RFID technology to verify its outbreak probability and potential damage in different application fields [15]. Momeni and Rost used patent development paths, k-core analysis, and topic modelling of technology development trends to identify technologies that may become disruptive [16]. Rodriguez Bolivar and Alcaide Munoz conducted a cluster analysis of the role of emerging technologies in public service provision to determine the network of disruptive technologies [17].

In addition to introducing mathematical models, scholars have also used bibliometrics to identify and predict disruptive technologies. Dotsika and Watkins proposed a literature-driven method that uses keyword network analysis and visualization to reveal emerging themes, structures, and temporal development in publications and to predict potentially disruptive technology trends [18]. Li et al. used patent analysis and Twitter data mining to monitor the emergence of disruptive technologies [19]. Adamus and Thampi used time series data to predict disruptive technologies [20]. Brackin et al. further integrated the theory of technological progress, selected the technology readiness index and the technology maturity curve as two indicators of technological progress, and discussed the rationality of measuring disruptive technology with big data analysis [21].

Scholars currently tend to build evaluation systems, rather than mathematical models, to identify disruptive technologies. Liu et al. studied two new methods. One obtains a list of potentially disruptive technologies from experts and then uses a multidimensional indicator system to evaluate the disruptive potential of those technologies. The other generates a list of potentially disruptive technologies by mining multi-source data and then has experts evaluate their disruptive potential [22]. Jia et al. built a framework for identifying disruptive technologies in China’s electronic information and communication industry by analyzing keyword co-occurrence networks and keyword frequency changes in CNKI literature with the CiteSpace software [23]. Wang et al. used maximum likelihood fitting and a goodness-of-fit test to identify disruptive technologies through the degree distribution characteristics of evolving networks [24]. Jia et al. used the three attributes of technological novelty, breakthrough, and influence as measurement indicators to identify disruptive technologies [25]. Dotsika and Watkins, in the literature-driven method mentioned above, detected potentially disruptive technologies in keyword networks through eccentricity and remoteness indicators [18]. Chen and Han used Gartner Hype Cycle technologies as training data for machine learning models that identify disruptive technologies [26].

2.2. Our Contribution

This paper argues that the technological development context plays a guiding role in identifying disruptive technologies. Most methods for identifying disruptive technologies essentially depend on the technological development context, and this point has been discussed in the reviews of many scholars [27]. However, no previous study of disruptive technologies has built a technology development network from the technology development context by extracting the technical characteristics of technical literature and combining them with the time dimension. This paper explores an identification algorithm for disruptive technologies based on the technology development network from this perspective.

3. Materials and Methods

3.1. Theoretical Basis

A summary of the disruptive technologies identified in recent years shows that disruptive technologies are always at the root node of a certain classification in the depth classification technology development tree. Taking patent data as an example, Figure 1 shows a tree of patent citation relationships for an industry’s in-depth classification. S1, S2, S3, and S4 are subcategories of the industry. Each node Pi represents a patent, and a connection between nodes is a patent citation relationship. In addition to representing the citation relationship, the tree graph is arranged from left to right by patent application time and from bottom to top by the number of patents in each classification (see Figure 1).

Figure 1: Tree of patent citation relationships for an industry’s in-depth classification. The patent application time is arranged from left to right, and the number of classified patents is arranged from bottom to top.

P0, P3, and P9 in the figure are typical forms of disruptive technology. The P0 patent stands at the beginning of the industry and is the basic patent that lays the industry’s foundation. It conforms to the characteristics of disruptive technology and is the first disruptive technology patent in the industry. The P3 patent is classified differently from its parent node, and the objective reason for the difference is that the technology and application of the patent have changed. Compared with other patents, P3 makes great progress and changes the direction of technology development, so it meets the requirements of a disruptive technology. The P9 patent shows technological discontinuity and has a great impact on subsequent technological development, which also meets the characteristics of disruptive technologies. P0, P3, and P9 are therefore the objects of disruptive technology identification, whereas P14 is a newly applied patent whose technology and application differ significantly from those of previous patents in the industry, which makes it an object of disruptive technology prediction.

3.2. Research Framework

Based on the theory that disruptive technologies are always at the root node of a certain classification, this paper proposes to build a technology development network from the technical characteristics of each technology, such as technical problems, technical effects, and technology applications, combined with the time dimension, and to use the depth classification model to identify disruptive technologies in that network. Figure 2 is the research framework designed according to this algorithm. The method takes patent data and scientific and technological documents as identification objects. After the retrieved data are filtered, the abstract, claims, and technical efficacy sentences of the patent data are divided by technical type, scope of application, technical description, and functional description. TF-IDF is used to locate the text that corresponds to the technical characteristics, and CoreNLP is used to extract the SAO structure data of that text. The similarity matrix is constructed from the SAO structure data, and the technology development network is generated from the patent similarity and the time dimension. Finally, the technology development network is divided into different structures by the depth classification model to screen out disruptive technologies.

3.3. Technology Development Network Generation

TF-IDF (term frequency-inverse document frequency) [28] is a word-frequency weighting algorithm that combines two measures. TF is the frequency of a keyword in a text, and IDF is the inverse document frequency of the keyword in the corpus. As shown in formula (1), TF-IDF is the product of TF and IDF:

$$w_{x,y} = tf_{x,y} \times \log\left(\frac{N}{df_x}\right) \tag{1}$$

where $w_{x,y}$ is the importance of the word x in the article y, $tf_{x,y}$ is the frequency of the word x in the article y, $df_x$ is the number of articles in the corpus that contain the word x, and N is the number of articles in the corpus. TF-IDF highlights the words that are important in a text and can therefore be used to locate the key technical features of the patent text.
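As an illustration of the weighting described above, the following is a minimal sketch that assumes the common form of TF-IDF (raw term count multiplied by the logarithm of N over document frequency). The function name `tf_idf` and the sample token lists are hypothetical and are not taken from the paper's data set.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per tokenised document.

    Assumed weighting: w(x, y) = tf(x, y) * log(N / df(x)), where tf is the
    raw count of word x in document y, df is the number of documents that
    contain x, and N is the number of documents.
    """
    n_docs = len(docs)
    # Document frequency: count each word once per document.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return weights


# Hypothetical, pre-tokenised patent snippets (illustration only).
docs = [
    ["battery", "pack", "cooling", "battery"],
    ["battery", "charging", "pile"],
    ["motor", "drive", "control"],
]
weights = tf_idf(docs)
# The two highest-weighted words of the first snippet.
top_terms = sorted(weights[0], key=weights[0].get, reverse=True)[:2]
print(top_terms)
```

In this toy corpus, "battery" appears in two of the three snippets, so it is weighted below words that occur in only one snippet even though it is repeated. This is the property that lets TF-IDF point to distinguishing technical terms rather than common ones.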

The SAO structure [29] is a method proposed by Genrich Altshuller, based on patent text data, to solve the problem of system representation. The subject S (Subject) and the object O (Object) are usually nouns or noun phrases. The action A (Action) represents the behaviour of S towards O or the relationship between them, and these three elements are the most basic components of a sentence. The structure reflects the varied information in patent text data and clearly describes how the patented technology is organized. CoreNLP can extract the SAO structure from the structure of each sentence.

Cosine similarity [30] measures how similar two items are by the cosine of the angle between their two vectors in a vector space. When text similarity is calculated from word frequencies, the cosine similarity of two documents ranges from 0 to 1. Formula (2) is the cosine similarity formula used to calculate text similarity:

$$S_{AB} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}} \tag{2}$$

In the formula, $A_i$ and $B_i$ are the frequencies of the i-th word in the two texts, that is, the components of the word-frequency vectors of the two texts, and $S_{AB}$ is the similarity of A and B.
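The cosine similarity of two texts can be sketched in a few lines. The example assumes that each text has already been reduced to a list of tokens; the function name `cosine_similarity` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the word-frequency vectors of two texts."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    # Only words shared by both texts contribute to the dot product.
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty texts
    return dot / (norm_a * norm_b)


# Hypothetical flattened SAO triples for two patents (illustration only).
patent_a = ["battery", "cools", "pack"]
patent_b = ["battery", "heats", "pack"]
similarity = cosine_similarity(patent_a, patent_b)
print(round(similarity, 3))
```

Because word counts are never negative, the result stays between 0 (no shared words) and 1 (identical frequency profiles), which matches the range stated above.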

The purpose of creating a technology development network is to find the main context of technology development, that is, to build the network from the perspectives of the time dimension and the technology characteristics. This paper proposes a technology-similarity connection method, as shown in the following formula, which can quickly build a technology development network. In the formula, Lij is the technical distance between technology i and technology j, which is also their connection distance in the technology development network, Ti and Tj are the discovery times of the two technologies, and Sij is the textual similarity between technology i and technology j.
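To make the idea of combining time and similarity concrete, the following sketch links each technology to its most similar earlier technology. This linking rule and the distance 1 - Sij are assumptions made purely for illustration; they are a stand-in and not the paper's formula for Lij. The function name `build_network` and the sample values are hypothetical.

```python
def build_network(times, similarity):
    """Link every technology to its most similar earlier technology.

    times[i] is the discovery time of technology i and similarity[i][j] is
    the textual similarity S_ij. Returns {child: (parent, distance)}; the
    earliest technology has no parent. The distance 1 - S_ij is an assumed
    stand-in, not the paper's technical distance L_ij.
    """
    links = {}
    for i, t_i in enumerate(times):
        earlier = [j for j, t_j in enumerate(times) if t_j < t_i]
        if not earlier:
            continue  # nothing precedes this technology: it is a root
        parent = max(earlier, key=lambda j: similarity[i][j])
        links[i] = (parent, 1.0 - similarity[i][parent])
    return links


# Hypothetical discovery years and symmetric similarity matrix.
times = [2010, 2012, 2013, 2015]
similarity = [
    [1.0, 0.8, 0.2, 0.3],
    [0.8, 1.0, 0.1, 0.7],
    [0.2, 0.1, 1.0, 0.4],
    [0.3, 0.7, 0.4, 1.0],
]
links = build_network(times, similarity)
for child, (parent, distance) in sorted(links.items()):
    print(child, "->", parent, round(distance, 2))
```

Restricting candidate parents to earlier technologies keeps the network acyclic, so it can be read as a development trajectory in the same way as a citation tree.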

3.4. Technical Classification

To classify the technology development network according to different technical characteristics, this paper proposes a depth classification model, as shown in the following formula. In a technology development network whose patents are ordered from 0 to N, (m, n) is a classification group covering sequence positions m to n, and Pi, defined in the same way as the technical distance Lij in formula (3), is the technical distance between adjacent technologies in this group. The sequence index i is a new index assigned according to the technology network generated by formula (3). During classification, the smaller the value of (m, n) and the greater the difference between m and n, the greater the difference between this classification and the other classifications can be.
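The role of root nodes in a classified network can be illustrated with a simple threshold cut. The sketch below is not the depth classification model itself; it assumes, for illustration only, that a link is cut whenever its distance exceeds a chosen value, so that the child of a cut link becomes the root of a new group. The names `classify` and `cut_distance` and the sample links are hypothetical.

```python
def classify(links, n_nodes, cut_distance):
    """Split a parent-linked network into groups by cutting long links.

    links maps child -> (parent, distance). A link longer than cut_distance
    is removed, so its child becomes the root of a new group.
    Returns {root: list of members in ascending order}.
    """
    def root_of(node):
        # Walk upwards until a node without a kept parent link is reached.
        while node in links and links[node][1] <= cut_distance:
            node = links[node][0]
        return node

    groups = {}
    for node in range(n_nodes):
        groups.setdefault(root_of(node), []).append(node)
    return groups


# Hypothetical network: child -> (parent, distance).
links = {1: (0, 0.2), 2: (0, 0.8), 3: (1, 0.3), 4: (2, 0.1)}
coarse = classify(links, n_nodes=5, cut_distance=0.5)
fine = classify(links, n_nodes=5, cut_distance=0.25)
print(coarse)
print(fine)
```

In this sketch the roots of the resulting groups are the candidates for disruptive technologies. Lowering `cut_distance` plays the part of a deeper classification: it produces more groups and therefore more candidate roots, each separated from its parent by a smaller technical jump.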

4. Results and Discussion

4.1. Data Sources

The patent database used in this paper is the incoPat database. The search covers data up to July 2022, and its scope is invention and utility model patents related to new energy vehicles that were applied for in China. After the search results are obtained, the data are cleaned through manual indexing to remove irrelevant patents, which leaves a total of 9382 relevant patents. When the technical features of the patent text are extracted, it is found that they are mainly concentrated in the abstract, claims, and effect. Therefore, these fields are extracted from each patent together with its time information, as shown in Table 1.

4.2. Feature Extraction

To acquire the technical features of the text, the patent text must first be divided by technical type, application scope, technical description, and functional description. Because patent text follows strict format requirements, this experiment divides the text with a program, and the processing results are shown in Table 2.

Before the SAO structure of each type of divided text is extracted, the text is first segmented into words and the stop words are removed; word senses are then disambiguated and synonyms are merged. TF-IDF is used to locate the core vocabulary of the patent and to intercept the corresponding text, and finally the SAO structure of that text is extracted, which yields the technical description features of the patent in different dimensions. Table 3 illustrates the SAO structures extracted from the patent text.
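The preprocessing steps above can be sketched as follows. The stop-word list, the synonym map, and the function names `normalise` and `key_sentences` are hypothetical placeholders; word-sense disambiguation and the SAO extraction itself are outside the scope of this sketch.

```python
import re

# Hypothetical stop-word list and synonym map (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "is", "and", "to", "in"}
SYNONYMS = {"accumulator": "battery", "cell": "battery", "recharging": "charging"}


def normalise(text):
    """Lower-case, tokenise, drop stop words, and map synonyms to one term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [SYNONYMS.get(t, t) for t in tokens if t not in STOP_WORDS]


def key_sentences(text, core_terms):
    """Keep the sentences whose normalised tokens mention a core term."""
    sentences = [s.strip() for s in re.split(r"[.;]", text) if s.strip()]
    return [s for s in sentences if core_terms & set(normalise(s))]


# Hypothetical patent abstract; core_terms stands for the words that a
# TF-IDF ranking would have selected.
abstract = "The accumulator is cooled by a liquid loop. The housing is painted."
selected = key_sentences(abstract, core_terms={"battery"})
print(selected)
```

Only the sentences selected in this way would be passed on to SAO extraction, which keeps the extracted triples focused on the core technical vocabulary.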

Throughout the development of new energy vehicles, batteries and driving systems have been the two major constraints. Table 4 shows the subject clustering results of the SAO structure text extracted from the patents, which include 4468 battery-themed patents and 1456 charging-themed patents. This subject distribution is consistent with the development of new energy vehicles, which confirms the effectiveness of locating key patent text with TF-IDF and of representing it with SAO structure text.

4.3. Technology Development Network

A similarity matrix is built from the patent SAO structure text, as shown in Table 5, and contains the technical similarity between any two patents. Analysis of the similarity matrix shows that the patent similarity values are evenly distributed between 0 and 1 and that the patents are dispersed in partial aggregations, which conforms to the distribution law of technological development and agrees with the assumption of technological classification.

To test the reliability of the method, five sets of patent data with citation relationships are used to verify the effectiveness of the generated development network. Figure 3 shows the clustering process for one of these sets, generated according to the similarity matrix and formula (3). This set contains 125 patents; patents with similar technologies are interconnected, which results in 3 groups of patents.

The technology development network represents the main context of technology development, and the patent citation relationship also represents the development process of patent technology. Table 6 shows the comparison data between the technology development network and the patent citation relationship. It can be seen from the comparison that the constructed patent technology development network conforms to the law of patent technology development and can be used as an effective basis for the analysis of patent technology development.
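One way such a comparison between network links and citation links could be scored is sketched below. It assumes that links are compared as undirected patent pairs and that agreement is measured with precision, recall, and F1; the function name `edge_scores` and the sample edges are hypothetical.

```python
def edge_scores(predicted_edges, citation_edges):
    """Precision, recall, and F1 of predicted links against citation links.

    Edges are treated as undirected pairs, which is an assumption made here.
    """
    predicted = {frozenset(edge) for edge in predicted_edges}
    actual = {frozenset(edge) for edge in citation_edges}
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical links between patents 0..3 (illustration only).
network_edges = [(1, 0), (2, 0), (3, 1)]
citation_edges = [(0, 1), (3, 1), (3, 2), (2, 1)]
precision, recall, f1 = edge_scores(network_edges, citation_edges)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Precision shows how many generated links are confirmed by citations, and recall shows how many citations the network recovers. A generated link that has no matching citation is not necessarily wrong, since similar patents do not always cite each other.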

4.4. Patent Identification

Taking a subset of the patent data as an example, a technology development network is built from the patent information, and the patents are aggregated and classified by formula (4). The selected patents are shown in Table 7, and the results are shown in Figure 4. This group consists of 45 patents related to new energy vehicle batteries, and the classification results show three major branches. The blue part mainly concerns the battery series structure, and patent No. 39 is the starting point of this subject branch. However, patent No. 42 is significantly disruptive and better represents the subject of this group of patents, so it is identified as the disruptive patent of this group.

The experiment shows that the patent technology development network can effectively represent the technology development context, and that formula (4) is effective for classifying patent development networks. By repeatedly adjusting the number of patent classifications, it is found that the number of technology classifications directly affects the identification of disruptive technologies. The greater the depth of technology classification, the more disruptive technologies are identified and the lower their degree of disruption. The smaller the depth of technology classification, the fewer disruptive technologies are identified and the higher their degree of disruption.

5. Conclusions

After summarizing existing research on the concept of disruptive technology and on identification methods, and starting from the technology development context, this paper proposes the concept that “disruptive technology is always at a certain classification root node position in the depth classification technology development tree” and designs a technical route according to this concept. In the first step of the technical route, the technology development network extracted with NLP techniques is compared with the patent citation relationship network. Its average F1 value, accuracy rate, and recall rate are 73.24%, 69.89%, and 72.56%, respectively, which shows that the network reflects technology development and verifies its effectiveness. In the second step of the technical route, the research on classifying technology development networks finds that as the classification depth decreases, the number of identified disruptive technologies decreases and their degree of disruption increases. When the classification depth is moderate, some new technologies fall into a category alone or in a small group, and these have a high probability of becoming disruptive technologies in the future. The experimental results show that the concept and identification scheme proposed in this paper can effectively identify disruptive technologies and, to a certain extent, provide a valuable reference for relevant researchers and managers.

In this paper’s research on disruptive technology identification and prediction, the identification results are relatively accurate. However, measured as a proportion of all technologies, the number of disruptive technologies in the prediction results is high compared with the actual number. This may be because the predictions include new technologies that are innovative but not practical. The research also found that the number of citations, the scope of use, and the value of a technology are strongly related to whether it is disruptive. Future research will introduce more relevant factors into the model to improve the accuracy of disruptive technology prediction.

Data Availability

The data used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Special Project for Scientific and Technological Cooperation of Jiangxi Province (Grant No. 20212BDH80021) and the Jingdezhen Ceramic University 2022 Graduate Innovation Special Fund Project (No. JYC202206).