With the continuous expansion and deepening of international exchange and cooperation, language differences have become the biggest obstacle to such cooperation. Machine translation, which uses computers to process spoken or written language, is an important technical means of overcoming language barriers. This paper studies an innovative English translation software system, using the Internet of Things (IoT) and big data algorithms to identify loopholes and deficiencies in the system and to provide data support for system updates. Specifically, it proposes to analyze the translation software system on the basis of IoT big data, adopting the dynamic frame slot ALOHA algorithm and the lower limit estimation algorithm; applying these algorithms to system analysis is itself innovative. The system consists of a translation engine and a machine dictionary, which addresses the basic problem of the insufficient translation databases of current translation software and helps the innovative English translation system improve translation accuracy. The experimental results show that 61.3% of the system's translations reach grade A (accurate and fluent) and 22.7% reach grade B (conveying the meaning of the original text). English translation software therefore has significant research value.

1. Introduction

With the increasing globalization of the world economy and the rapid development of Internet applications, international exchange activities and scientific and technological cooperation are becoming more and more frequent, and the division of labor more refined. Language differences have become the biggest barrier to international information exchange, learning, and technological cooperation. Overcoming them is increasingly important in the global economic environment and in contemporary social life, and it is one of the main focuses of current research and discussion.

Entering the 21st century, with intensifying international cooperation, exchange, and competition and the arrival of the global Internet era, the level of social informatization inevitably keeps rising. To avoid being left behind by this era, people must keep up with the times and pay constant attention to the latest information at the frontier of their fields, and most such information first appears publicly on the Internet in foreign languages. Despite the rapid development of Internet language technology in recent decades, existing machine translation systems still cannot guarantee a satisfactory overall translation effect, especially for complex sentence sequences and long passages. Research on comparing machine translation quality therefore has both scientific and practical value. Most significantly, under the demands of development in the new century, further developing machine semantic translation comparison technology and improving the overall translation effect of existing machine translation systems is of real and far-reaching significance.

The novelty of this article is as follows: (1) it analyzes the English translation software system using IoT big data methods, which greatly improves analysis efficiency and the speed of system evaluation. (2) On the theoretical basis of the designed language framework, it designs and implements the practical functions of the machine translation system, including the grammar design function of the English-Chinese electronic dictionary, automatic alignment of English-Chinese sentence structures, and the semantic design function of English-Chinese sentence translation. (3) The translation thesaurus built on the IoT big data platform can easily be extended, for example by adding translation instances or expanding the translation vocabulary. It thus allows the system to move beyond purely rule-based machine translation methods and to address deeper linguistic problems that require reinterpretation.

2. Related Work

The Internet of Things (IoT) is the "Internet of Everything," an extension and expansion of the Internet. It is a huge network formed by combining various information-sensing devices with the network, realizing the interconnection of people, machines, and things at any time and in any place. Tallapragada et al. tracked customer sentiment and provided customer behavioral insights through an IoT-integrated data intelligence system running on an Apache Spark cluster [1]. Philemon Kibiwott proposed a more efficient scheme that borrows computing power from cloud servers to handle expensive computations while leaving simple operations to users [2]. Jamil et al. proposed a new approach in which IoT data are processed and analyzed in real time using big data tools [3]. Ahmad and Afsal argued that fog computing originates from cloud computing and is implemented on end users' devices to deal with network latency [4]. Shin et al. applied a system for processing informal data by collecting, storing, processing, and analyzing it [5]. Rajeswari et al. used IoT devices to sense agricultural data, stored the data in a cloud database, and analyzed it with cloud-based big data analytics [6]. Zhao et al. proposed an IoT data credibility detection method based on regional context, which can effectively detect point anomalies, behavior anomalies, and context anomalies [7]. Granat et al. proposed an event detection method for one-dimensional data streams that relies on an event intensity function and extends the typical "true or false" decision scheme [8]. Wei et al. provided a comprehensive review of IoT data flow management; they first analyzed the key challenges faced by IoT data flows and then gave a preliminary overview of related technologies in data flow management, spanning data flow perception, mining, control, security, and privacy protection [9]. Palaniswami et al. presented a useful set of visual assessment of clustering tendency (VAT) tools and techniques and highlighted how these technologies advance the IoT through large-scale implementations [10]. Meerja et al. studied IoT data algorithms that share their data with the online world to obtain global knowledge and information of high commercial value [11]. Almeida et al. carried out a key performance analysis of an IoT-aware ambient-assisted living (AAL) system for elderly monitoring, focusing on three main components: (i) a city-wide data acquisition layer, (ii) a centralized cloud-based data management repository, and (iii) a risk analysis and forecasting module [12]. Ko and Kim aimed to develop a healthcare platform that can remotely receive, store, analyze, process, and visualize diabetes information generated by various IoT devices [13]. However, these studies are of limited practicality and largely remain at a preliminary theoretical stage.

3. IoT Big Data

3.1. Dynamic Frame Slot ALOHA Algorithm

The dynamic frame slot ALOHA (DFSA) algorithm adjusts the frame length according to the number of tags present at the reader: the number of slots in a frame can be changed dynamically from round to round. It is one of the most widely used anticollision algorithms in the Internet of Things [8]. In the DFSA algorithm, assume that the frame length set by the reader is L and that the number of tags to be identified is m. Then the probability that exactly t tags select the same slot is

P(t) = C(m, t) (1/L)^t (1 - 1/L)^(m - t).  (1)

According to formula (1), the expected number of idle slots (t = 0) in a frame is

a0 = L * P(0) = L (1 - 1/L)^m.

When t = 1, the expected number of slots in which a tag is correctly identified is

a1 = L * P(1) = m (1 - 1/L)^(m - 1).

The expected number of collision slots (t >= 2) is

ak = L - a0 - a1.

After one round of the recognition process, the correct recognition rate (throughput) of the whole system is

R = a1 / L = (m/L)(1 - 1/L)^(m - 1).

Differentiating R with respect to the frame length L and setting the derivative to zero gives the optimal frame length:

dR/dL = 0  =>  L = m.

It follows that the system works best when the frame length approximately equals the number of tags (L ≈ m); substituting L = m into R gives the theoretical maximum throughput (1 - 1/m)^(m - 1), which tends to 1/e ≈ 36.8% as m grows large.

The above derivation shows that in each round of tag identification, the DFSA algorithm dynamically adjusts the frame length so that the frame length set by the reader approximately matches the number of tags not yet identified. Only then can the system achieve its maximum tag throughput.
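The frame-length analysis above can be checked numerically. The following Python sketch (an illustration under the binomial slot model, not part of the original system) computes the per-slot probabilities and confirms that throughput peaks when the frame length L is close to the tag count m:

```python
from math import comb

def slot_probability(t, m, L):
    """Probability that exactly t of m tags pick one given slot,
    each tag choosing one of L slots uniformly at random."""
    return comb(m, t) * (1 / L) ** t * (1 - 1 / L) ** (m - t)

def throughput(m, L):
    """Expected fraction of slots holding exactly one tag (a successful read)."""
    return slot_probability(1, m, L)

# Sweep the frame length for m = 100 tags: throughput peaks when L is
# close to m, approaching the theoretical maximum 1/e ~ 36.8%.
m = 100
best_L = max(range(1, 301), key=lambda L: throughput(m, L))
print(best_L, round(throughput(m, best_L), 4))
```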

3.1.1. Lower Limit Estimation Algorithm

The lower limit estimation algorithm assumes that, in each round of identification, every collision slot contains exactly two tags. Therefore, if ck collision slots are observed in the current round, the number of tags to be recognized in the next round is estimated as

m_est = 2 * ck.

The algorithm assumes that only two tags respond in each collided slot. In practice, more tags may select the same slot, so this estimate can differ considerably from the actual number of tags; it is a lower bound. In the system analysis, it can be applied in the instance part and in the structure construction part, while the dynamic frame slot algorithm is used in the evaluation stage to make the evaluation results more accurate.
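A minimal simulation of one DFSA round together with the lower limit estimate might look as follows (illustrative only; slot choices are uniformly random and all names are hypothetical):

```python
import random

def run_round(num_tags, frame_len, rng):
    """Simulate one DFSA round: each tag picks a slot uniformly;
    return the counts of idle, success, and collision slots."""
    slots = [0] * frame_len
    for _ in range(num_tags):
        slots[rng.randrange(frame_len)] += 1
    idle = slots.count(0)
    success = slots.count(1)
    collision = frame_len - idle - success
    return idle, success, collision

def lower_bound_estimate(collision_slots):
    """Lower limit estimator: a collision implies at least two tags,
    so the unresolved backlog is at least 2 * (collision slots)."""
    return 2 * collision_slots

rng = random.Random(42)
idle, success, collision = run_round(100, 100, rng)
backlog = lower_bound_estimate(collision)  # lower bound on tags left
```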

3.1.2. Three-Dimensional Estimation Algorithm

The three-dimensional estimation algorithm is also known as an effective image denoising algorithm: by matching adjacent image blocks, several similar blocks are stacked into a three-dimensional matrix and filtered in three-dimensional space, and the result is inverse-transformed and fused back to two dimensions to form the denoised image. Its denoising effect is remarkable and it can achieve the highest peak signal-to-noise ratio, but its time complexity is relatively high.

In the reader identification process, the three-dimensional estimation algorithm assumes that if t tags select slot j at the same time, the distribution of the slot follows the binomial distribution of formula (1).

Then, the average number of tags in slot j is

The three-dimensional estimation algorithm estimates the number of tags according to the number of tags et and collision probability Pc in the collision time slot in the system, and its expression is as follows:

The number of tags to be identified in the next round is

3.1.3. Schoute Estimation Algorithm

In the Schoute estimation algorithm, the number of tags selecting slot j is assumed to follow a Poisson distribution with mean 1, so the probability that slot j is simultaneously selected by t tags is

P(t) = e^(-1) / t!.

When a slot collides (t >= 2), the expected number of tags in the selected slot j is

et = (1 - e^(-1)) / (1 - 2e^(-1)) ≈ 2.39.

Then, with ck collision slots observed, the number of tags to be recognized in the next round is estimated as

m_est ≈ 2.39 * ck.
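Under the Poisson(1) assumption above, the expected number of tags per collided slot and the resulting backlog estimate can be computed directly (an illustrative sketch, not the authors' code):

```python
from math import exp, factorial

def poisson_prob(t, mean=1.0):
    """P(t tags choose a given slot) under the Poisson(1) assumption."""
    return exp(-mean) * mean ** t / factorial(t)

# Expected tags in a slot *given* that it collided (t >= 2):
# E[t | t >= 2] = (1 - e^-1) / (1 - 2 e^-1) ~ 2.39, Schoute's factor.
p_collision = 1 - poisson_prob(0) - poisson_prob(1)
tags_per_collision = (1 - exp(-1)) / p_collision

def schoute_estimate(collision_slots):
    """Backlog for the next round: ~2.39 tags per collided slot."""
    return round(tags_per_collision * collision_slots)
```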

3.2. Ant Colony Algorithm in Big Data

Assume that in a network there are m routing request points waiting to be processed, n routing access points available to perform the tasks, and x ants in each routing group. During foraging, each ant deposits a small amount of pheromone on the nodes of the path it travels, and the pheromone concentration gradually evaporates over time. The next ant re-selects its path probabilistically according to the remaining pheromone concentration [9]. The probability p_ij^k that the kth ant moves from node i to node j is

p_ij^k = (τij)^α (ηij)^β / Σ_{s ∈ allow_k} (τis)^α (ηis)^β,  for j ∈ allow_k (and 0 otherwise).

Among them, τij denotes the pheromone concentration on the edge from node i to node j, ηij denotes the heuristic information of that edge (related to the path length from i to j), allow_k denotes the set of nodes the kth ant is still allowed to visit (unvisited nodes), and α and β are control parameters weighting the two terms, with values in [0, 1].

When the ants search for the optimal path, the pheromone concentration is updated as

τij ← (1 - ρ) τij + Δτij,  Δτij = Σ_k Δτij^k,

where ρ ∈ (0, 1) is the evaporation rate and Δτij^k = Q / L_k if the kth ant traversed edge (i, j) in this iteration (L_k being its path length and Q a constant), and 0 otherwise.

In summary, the dynamic frame slot ALOHA algorithm, the lower bound estimation algorithm, and the three-dimensional estimation algorithm from IoT big data can be used to analyze the innovative system of English translation software, giving a clearer analysis of its structure, rules, and specific instances, while the ant colony algorithm is used to evaluate the system and make the evaluation results more accurate.
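The transition rule and pheromone update described above can be sketched as follows (an illustrative implementation with assumed parameter names; `eta` holds the heuristic value of each edge):

```python
def transition_probs(i, tau, eta, allowed, alpha=1.0, beta=1.0):
    """Probability of moving from node i to each allowed node j:
    p_ij = tau_ij^alpha * eta_ij^beta, normalized over allowed nodes."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def update_pheromone(tau, tours, rho=0.5, q=1.0):
    """Evaporate all edges, then deposit q / tour_length on every
    edge used by each ant's tour in this iteration."""
    for i in tau:
        for j in tau[i]:
            tau[i][j] *= (1 - rho)
    for tour, length in tours:
        for i, j in zip(tour, tour[1:]):
            tau[i][j] += q / length
```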

4. English Translation Software

4.1. English-Chinese Machine Translation System
4.1.1. Rule-Based Approach

The rule-based method assumes that the logical process of translation itself can be modeled by analyzing language rules [14, 15]. Whether translation is performed directly or through an intermediate transfer language, the difference lies mainly in the depth of analysis of the source language and in the degree of abstraction and generalization of the knowledge used to express its meaning, as shown in Figure 1.

Rule-based methods have long dominated the machine translation community and still play an important role today; most influential machine translation systems (MTSs) are rule-based. However, compared with traditional rule-based methods, current ones have changed considerably. In rule acquisition, traditional methods relied mainly on linguists to summarize and debug rules, whereas rules are now increasingly acquired automatically from corpora. Traditional methods also tended to describe coarse-grained, global linguistic knowledge with large-scale rules, whereas attention is now paid to fine-grained, local, small-scale linguistic knowledge, showing a trend of "small rule base, big dictionary."

To develop a machine translation system based on natural-language rules, one must first design a semantic knowledge representation scheme in which all the linguistic knowledge needed during translation is expressed in an operational form that can be implemented on a computer [16-18], as shown in Figure 2.

4.1.2. The Overall Frame Structure of the System

The system framework is roughly divided into the following three parts:

(1) Translation System Knowledge Base. The process of machine translation can be regarded as applying language knowledge to perform semantic reasoning operations, and the representation of that knowledge is the theoretical basis for understanding the translation process [17]. The various kinds of knowledge used in machine translation are often divided into two categories according to their form of representation: knowledge inside the machine and knowledge outside the machine.

(2) System Processing Part. The English vocabulary processing part includes automatic word segmentation and word-category processing, both of which rest on the theory of phrases and sentence-pattern combination. The automatic word segmentation module uses a maximum matching algorithm, and the system can apply both rule analysis and statistical analysis methods. The automatic disambiguation part of the English rule analysis is divided into two modules: phrase analysis and sentence pattern matching.

(3) User Interface. The user interface is divided into a client interface and a debugging interface managed by the system administrator. A visually clear, practical, and concise management page for the translation and dictionary compiler, together with a management page for the whole debugging system, allows the operator to compile and manage all language rules intuitively, simply, and accurately. This greatly improves task efficiency in the compilation and debugging system and raises the overall efficiency of the language knowledge base. The overall frame structure of the system is shown in Figure 3.

The knowledge management and debugging interface of the application system is responsible for two parts: establishing, maintaining, and operating the knowledge base, and managing the translation and debugging of natural language processing [18]. The system administrator can directly display the generation process of any syntactic component of the system and the corresponding syntactic feature nodes (attributes and values) as needed.
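The matching-based automatic word segmentation mentioned in the processing part above can be sketched as a greedy forward maximum-matching pass (an illustration only; the lexicon and window size are hypothetical):

```python
def max_match(text, lexicon, max_len=5):
    """Greedy maximum-matching segmentation: at each position take the
    longest lexicon entry that matches; fall back to a single character."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in lexicon:
                tokens.append(piece)
                i += length
                break
    return tokens
```

For example, with a toy lexicon {"机器", "翻译", "机器翻译", "系统"}, the input "机器翻译系统" segments into ["机器翻译", "系统"], preferring the longest match at each step.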

4.1.3. System Translation Algorithm and Workflow

When people learn a new language, they often learn it by analogy with what they already know; likewise, new grammatical problems are frequently solved based on experience with existing translations from the source language [19]. The main translation information screening workflow of this translation system is shown in Figure 4.

Among them, the rule method mainly refers to the translation design method based on transformation, which follows the translation design principle of generating rules while analyzing [20]. The specific translation algorithm is shown in Figure 5.

The original text can be input from text files, from the keyboard, or by scanning.

If a word has no applicable disambiguation rules, it defaults to its first part of speech. An improved chart parsing algorithm is adopted in the structural analysis stage. Most structural transformation steps use a local subtree transformation algorithm that combines top-down and bottom-up strategies [21]; in the structure construction process, bottom-up local subtree transformation is applied.

4.1.4. Part-of-Speech Tagging and Structural Tagging of Phrases

Part-of-speech classification is the process of analyzing and classifying the words in a sentence into a part-of-speech system and standardizing their use. Open parts of speech are those to which large numbers of new words can continually be added; the four main open parts of speech in contemporary English are shown in Table 1.

Secondary parts of speech are also called closed parts of speech. Generally, the number of words in these classes remains fixed, and new words are rarely added. Modern English has six main closed parts of speech, as shown in Table 2.

Words of different parts of speech can combine into English phrases. For example, in English syntactic analysis, one or several words can form the phrase structures listed in Table 3.

Different corpora usually use different tag sets for annotating English sentences. To make sentence structures more distinctive and recognizable, this system does not adopt the standard context-free-grammar-based English part-of-speech tagging rules and tag sets. The tag set used in this system is shown in Table 4.

Verb conjugation in English is relatively complicated. For example, although the infinitive is a verb form, it can also function as a noun while retaining some verb characteristics, so it is tagged separately as inf. The gerund, also a verb form, is likewise tagged separately as ger [22], and the past participle is tagged as ppl.

To facilitate the alignment of English and Chinese words and phrases, the Chinese tag set is roughly the same as the English one. However, Chinese verbs do not inflect, so at this stage the Chinese tag set of this system is a subset of the English tag set.

4.2. Key Technologies of English-Chinese Translation System
4.2.1. Instance Pattern Matching

The strategy of instance pattern matching is to match the input sentence, together with the results of its lexical analysis, part-of-speech tagging, and shallow syntactic analysis, against the instance patterns in the library, that is, to compute the similarity between the two [23]. The most similar instance pattern in the library is selected as the matching result; its target-language pattern and the phrase target-pattern generation algorithm are then used to construct the sentence and output the translation. The flow of the instance pattern matching algorithm is shown in Figure 6.
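As a simplified illustration of the matching step (not the system's actual similarity measure, which would also weight POS tags and structure), the following sketch scores candidate instance patterns by token overlap and selects the best one:

```python
from collections import Counter

def similarity(input_tokens, pattern_tokens):
    """Illustrative similarity: Dice coefficient over the two token
    multisets (2 * overlap / total length)."""
    a, b = Counter(input_tokens), Counter(pattern_tokens)
    overlap = sum((a & b).values())
    return 2 * overlap / (len(input_tokens) + len(pattern_tokens))

def best_match(input_tokens, pattern_library):
    """Pick the instance pattern most similar to the input sentence."""
    return max(pattern_library, key=lambda p: similarity(input_tokens, p))
```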

4.2.2. Multi-Engine Translation System

Due to the limited corpus size, it is difficult for instance-based machine translation to achieve a high matching rate; a relatively complete rule-based translation method can therefore be combined with the sample library and the two applied in sequence [24].

In view of this, it is natural to combine the two methods, which can improve the quality of machine translation. This led to the concept of the multi-engine system: a machine translation system that adopts multiple translation methods, each method being one engine, as shown in Figure 7.

It can be seen that we can combine mature computer software technology, the advanced research results of contemporary English grammar theory, and a newly designed English-Chinese bilingual corpus. Constructing a preliminary but complete contemporary English-Chinese machine translation system is the frontier topic explored and addressed in this paper [25].

4.2.3. The Overall Design of the Dictionary Base Class

The division principle between the dictionary base class and the dictionary subclasses in the dictionary class library is as follows: the various operations on machine translation dictionary records can be abstracted into four operations: query, add, delete, and modify.

From a dynamic point of view, the subclasses derived from the dictionary base class provide the external calling interface. Calls made through a subclass's interface functions are converted into calls to the corresponding base class functions [26], and all operations on the external physical files are implemented by the base class. Figure 8 depicts the dynamic model between the child and parent classes.

From a static point of view, the data relationship between a subclass and the base class maps several keywords to one keyword, or several data items to one data item. In this way, the base class can be applied more widely. This static relationship is shown in Figure 9.
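The base class/subclass division described above can be sketched in a few lines (class and method names are hypothetical, chosen only to illustrate the mapping of subclass keys onto the base class's storage):

```python
class DictionaryBase:
    """Base class: owns the record storage and the four abstract
    operations (query, add, delete, modify) shared by all dictionaries."""
    def __init__(self):
        self._records = {}
    def query(self, key):
        return self._records.get(key)
    def add(self, key, value):
        self._records[key] = value
    def delete(self, key):
        self._records.pop(key, None)
    def modify(self, key, value):
        if key in self._records:
            self._records[key] = value

class EnglishChineseDictionary(DictionaryBase):
    """Subclass: exposes a domain-specific interface and maps several
    keywords (word, POS) onto the base class's single key."""
    def lookup(self, word, pos):
        return self.query((word, pos))
    def add_entry(self, word, pos, translation):
        self.add((word, pos), translation)
```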

5. English Translation System Evaluation and Results

5.1. English Translation System Evaluation

Based on the machine translation method proposed above and the analysis, matching, transformation, and target generation methods, we design and implement an English-Chinese machine translation system. The system includes a knowledge base subsystem, a translation subsystem, a knowledge management interface, and a user interface. The knowledge base subsystem and the translation subsystem are the main parts of the system and are built on the language model and the translation model, respectively, as shown in Figure 10.

Language model theory refers to the systematic definition and description of the basic linguistic attributes of natural language, and a re-understanding of natural language grammar from the perspective of speech-information processing. In English, the main attribute types include major parts of speech, sub-parts of speech, morphological features, semantic classes, syntactic classes, and sentence features. It is the mathematical basis for the system's language knowledge description algorithms and translation algorithms.

5.1.1. System Evaluation

To test the performance of the machine translation system, we organized relevant experts into an evaluation team to evaluate the system's translation quality, translation speed, maintainability, and scalability. The dynamic frame slot algorithm, the three-dimensional estimation algorithm, and the ant colony algorithm from IoT big data are used to make the results more accurate.

(1) Evaluation Method. The evaluation team consists of one machine translation expert, one linguistics expert, and one business unit expert. Two evaluation methods are adopted: a closed test and an open test. In the closed test, each expert randomly selects sentences from the "English-Chinese Machine Translation Quality Test Outline" compiled by Peking University. In the open test, English experts test English sentences they design themselves (because the system's knowledge base is limited, the research group may supplement dictionary and rule knowledge on-site). After the system translates the sentences automatically, the evaluation experts assign an evaluation grade to the translation quality of each sentence according to the evaluation standards. The grades are then quantified and aggregated to obtain the system's translation quality score. Translation speed, maintainability, and scalability are scored directly by the experts. Finally, the overall evaluation value of the system is computed and the evaluation results are reported by the team.

(2) Evaluation Criteria. Following the machine translation evaluation method of the National 863 Plan, translation quality is divided into six grades: A, B, C, D, E, and F.
(A) The translation is accurate, fluent, and clear, and accurately conveys all the information of the original text; apart from a few small typos, no modification is needed.
(B) The translation conveys the information of the original; readers can understand its meaning without referring to the original text, but there are problems of grammar, word choice, expression, or idiomatic usage that require revision.
(C) The translation roughly expresses the meaning of the original; parts of the translation are close to the original text, while others deviate from it.
(D) Part of the translation matches part of the original meaning, but the sentence as a whole is not translated correctly; the individual words of the sentence are translated, which helps manual post-editing.
(E) The translation does not make sense or its meaning is completely wrong, although a few clauses or words may be well translated.
(F) The translation is entirely unusable.

(3) Level Quantification. To quantify the translation quality grades, the six grades are first mapped to scores: A = 100 points, B = 80, C = 60, D = 40, E = 20, and F = 0. The reference sentences are then input; after the system translates them, the expert team assigns each translation a grade according to its quality. Finally, the grades are converted into scores and a weighted average is computed. Translation speed, maintainability, and scalability are evaluated directly by the experts, and the total evaluation value of the system is obtained by aggregation [27].

(4) Evaluation Results. In the closed test, 58 simple sentences were randomly selected from the "Outline of English-Chinese Machine Translation Quality Test" compiled by Peking University. The open test used 22 simple sentences designed independently by the evaluation team. The overall test results of the three experts are shown in Figure 11.
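The grade quantification described in step (3) amounts to mapping grades to scores and taking a (weighted) average; a minimal sketch, with the weighting scheme assumed for illustration:

```python
# Grade-to-score mapping from the evaluation criteria.
GRADE_SCORES = {"A": 100, "B": 80, "C": 60, "D": 40, "E": 20, "F": 0}

def quality_score(grades, weights=None):
    """Convert per-sentence grades to scores and take a plain or
    weighted average to obtain the translation quality value."""
    scores = [GRADE_SCORES[g] for g in grades]
    if weights is None:
        return sum(scores) / len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)
```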

5.1.2. Distribution Analysis of Syntactic Equivalence Pairs

To investigate the distribution of syntactic equivalence pairs, we count the number and length of syntactic equivalence pairs in the experiments. In Figure 12, we present the statistics on the number of syntactically equivalent pairs.

We also count the length distribution of the syntactic structure nodes involved in the experiment. The length of a syntactic node is the number of words and punctuation marks in the segment it covers in a sentence. First, we count the distribution of the number of nodes in syntactic equivalence pairs whose length is between 1 and 50 in the syntactic structure of the reference translation. Then, for each length, we calculate the ratio of the number of nodes in syntactic equivalence pairs to all syntactic nodes of that length [28], as shown in Figure 13.

The above statistical results reveal two main findings. First, in absolute terms, small-scale syntactically equivalent nodes greatly outnumber large-scale ones. Second, in terms of ratio, the proportion of nodes belonging to syntactic equivalence pairs is higher among large-scale nodes than among small-scale ones. This indicates that, among the syntactic structures of sentences with the same semantics, consistent syntactic structures, and hence syntactic equivalence pairs, are more likely to appear in the high-level structures where large-scale nodes are located.

(3) Distance Distribution Analysis of Wrong Word Order Conversion. Next, the distribution of distance information in Bilingual Error Word Order Transformation (Order-iBT) is calculated. First, the distribution of the reference translation lexical distance (RDt) of all Order-iBTs in Chinese-English and English-Chinese translations is calculated, as shown in Figure 14.

Figure 14 gives two pieces of information. First, short-distance RDts are much more numerous than long-distance ones, with ordering errors at distances 2 and 3 being the most common. Second, compared with Chinese-English translation, the target-language ordering errors in English-Chinese translation span longer distances; in other words, compared with English, Chinese exhibits more long-distance ordering problems.
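Counting the RDt distance distribution can be sketched as follows, assuming each word-order error is given as a pair of token positions in the reference translation; this representation is an illustrative assumption.

```python
from collections import Counter

# Illustrative sketch: tally the reference-translation lexical distances
# (RDt) of word-order errors, where each error is a pair of mis-ordered
# token positions and the distance is their absolute difference.
def rdt_distribution(error_position_pairs):
    """Count |i - j| over all mis-ordered token position pairs."""
    return Counter(abs(i - j) for i, j in error_position_pairs)

errors = [(0, 2), (4, 7), (1, 3), (5, 8)]
dist = rdt_distribution(errors)  # distances 2 and 3 each occur twice
```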

5.2. English Translation System Results
5.2.1. Test Results

The machine translation system runs on a PC with a P4 processor and 256 MB of memory. When running, the system occupies about 6 MB of memory and can process about 8000 words per minute. Judging from the translation results of the above groups of test sentences, the system handles ambiguity in sentence segmentation, word segmentation, concurrent words, ambiguous structures, and translation selection relatively well. Although this is only a targeted partial test and it cannot be concluded that the English-Chinese machine translation system is a complete success, the results partially demonstrate the feasibility of the system's design structure and analysis algorithms.

5.2.2. Advantages and Disadvantages of the System

One advantage of the system design is that it divides the translation analysis process into six sequential analysis stages, with the clause as the basic unit of translation conversion. This simplifies the design and implementation of the translation conversion engine and improves both overall translation accuracy and analysis efficiency. The coding scheme used by the machine dictionary describes the syntactic, semantic, and contextual conditions of words in an intuitive and concise manner, and this rich knowledge source provides comprehensive support for disambiguation at every stage of the translation process.
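The staged, clause-based design described above can be sketched as a simple pipeline; the six stage names, the comma-based clause splitter, and the identity-like stage bodies are all placeholder assumptions, not the system's actual implementation.

```python
# Hypothetical sketch of a six-stage, clause-based translation pipeline.
# Stage names and bodies are illustrative placeholders only.
def segment_clauses(sentence):
    """Split a sentence into clauses on commas (placeholder heuristic)."""
    return [c.strip() for c in sentence.split(",") if c.strip()]

# Each stage takes and returns a per-clause analysis dict.
STAGES = [
    ("segmentation",   lambda a: {**a, "tokens": a["clause"].split()}),
    ("tagging",        lambda a: {**a, "tags": ["?"] * len(a["tokens"])}),
    ("parsing",        lambda a: a),
    ("disambiguation", lambda a: a),
    ("transfer",       lambda a: a),
    ("generation",     lambda a: {**a, "target": " ".join(a["tokens"])}),
]

def translate(sentence):
    """Run every clause through the six stages in order, then rejoin."""
    results = []
    for clause in segment_clauses(sentence):
        analysis = {"clause": clause}
        for _name, stage in STAGES:
            analysis = stage(analysis)
        results.append(analysis["target"])
    return ", ".join(results)
```

Keeping the clause as the unit means each stage only sees one small analysis record at a time, which is the simplification the paper credits for the engine's design efficiency.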

The disadvantage is that, to meet practical needs, the machine dictionary must contain at least tens of thousands of entries, so the acquisition and encoding of lexical knowledge becomes the bottleneck of the system. In addition, the system runs relatively slowly and does not yet achieve the speed and precision required by the market. Because of missing entries, the meaning of some translations is not fully rendered.

6. Conclusion

The study of linguistic theory is closely linked with the study of machine translation technology. Linguistic theory captures the regularities of human natural language, and machine translation needs exactly these regularities. Linguistic theories at different levels, such as the lexical, syntactic, semantic, and pragmatic levels, all play a guiding role in machine translation [29]. At present, however, this guiding role is not strongly manifested. For example, the midsection theory, thematic roles, and phrase movement in syntactic theory are well-established linguistic laws, but they have not been fully exploited in machine translation practice. Therefore, with the help of algorithms based on the Internet of Things and big data, this paper further analyzes the innovative system of English translation software, fills in its gaps, and offers an alternative solution for some defects that cannot yet be addressed. In future work, the linguistic theories most closely related to machine translation should be systematically studied, and corresponding computational models should be established to guide subsequent machine translation practice.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Acknowledgments

This work was supported by the Youth Project of Higher Education Research in 2021 of the 14th Five-Year Plan of the Guangdong Higher Education Society: Study on the Training Mode of Undergraduates Majoring in Translation at Application-Oriented Universities in Guangdong Province under the Background of New Liberal Arts (Project No.: 21GQN88).