Abstract

Language translation is a frequent task in work and study. Traditional language translation is based on lexical structure analysis. However, natural language is not fully standardized, so this translation method has fundamental defects no matter how much the algorithm is improved, and its results differ greatly from human translation. This paper studies a networked artificial intelligence English translation system and translation methods based on an intelligent knowledge base. Taking English-Chinese translation as an example, it explains in detail the principle of intelligent knowledge-based translation and the advantages of this method over the traditional method based on lexical structure analysis. In the experiments, variances of 2/N, 30/N, 100/N, and 2N are studied in depth. When the variance is 2/N, 30/N, or 100/N, the results are the same as when the variance is 0.5; the results when the variance is 2N also conform to the trend in the tables and are close to the effect of the smoothing algorithm, which verifies the effectiveness of the proposed system.

1. Introduction

Many aspects of artificial intelligence technology are related to intelligent robots: many of the problems that artificial intelligence must deal with are exactly those that intelligent robotics must solve, and many achievements of artificial intelligence are reflected in intelligent robots. English translation is also an application area of artificial intelligence. Translation is a task frequently performed in learning and scientific research. Human translation places high demands on translators and consumes considerable time and energy, so people tend to choose software translation. However, most machine outputs cannot be used directly, and although translation algorithms keep improving, translation quality has not improved a great deal.

Human society is a complex whole composed of multiple cultures, and the mutual influence and penetration of these cultures has promoted the progress and development of human society. In today’s information society, the main carrier of intercultural communication is language; therefore, translation between different languages becomes the key to cultural communication [1, 2]. Translation is thus one of the intellectual activities of humankind that has existed since ancient times and is an indispensable means of information exchange in modern society. High-quality translation services are undoubtedly of great significance for China’s social development, especially economic development [3]. This project takes English-Chinese translation as an example and combines ideas from the field of artificial intelligence to propose a new approach to English language translation based on an intelligent knowledge base.

Sangeetha presents a speech-to-speech translation (SST) system that focuses primarily on translating English into the Dravidian languages Tamil and Malayalam. The key technology in SST is continuous speech recognition, and since HMMs produce better results than SVMs and AANNs, an HMM-based English speech recognizer was introduced, together with a hybrid machine translation system (a combination of rules and statistics) to convert English text into Dravidian text. He also built syllable-based text-to-speech (TTS) systems for Tamil and Malayalam, with prosody predicted by an AANN for Tamil to improve naturalness and clarity. The system is limited to sentences such as announcements at train stations, bus stops, and airports. This work defines a new framework for translating English into Dravidian languages based on HMM continuous speech recognition, and improving the efficiency of the hybrid MT and TTS units improves the overall speech translation performance. It also helps to create parallel corpora for these Indian languages, and the proposed speech translation system can be extended from English to all Indian languages [4, 5]. Zheng addressed the problem of automatic dictionary translation. His approach translates dictionaries by exploiting the source- and target-language resources without directly linking the two. It consists of two steps: (1) translating a synonym dictionary with graph-based word comparison and (2) translating through similar semantics by comparing contexts. In his experiments, the Polish and English wikis were used as the text corpus. He analyzes the methods and procedures in detail, and the results allow the method to be applied in an intermediary system [6, 7]. Deyi Xiong believes that lexical translation is very important for machine translation. He studied three aspects of lexical translation in a statistical machine translation environment and proposed three coherence models: (a) a word disambiguation model for word translation; (b) a word translation matching model that favors terms translated consistently throughout a document; and (c) a lexical concordance model rewarding hypotheses whose source words are translated into the entire target string. He combined these three models with hierarchical phrase-based SMT [8, 9] and evaluated their effectiveness on the NIST Chinese-English translation task using large training data. His experimental results showed that all three models achieved significant improvements over the baseline, and his analysis showed that the proposed models could improve word translation [10, 11]. Ankit Srivastava attempts to minimize such errors by interfacing statistical machine translation (SMT) models with linked open data (LOD) resources such as DBpedia and BabelNet. He conducted multiple experiments based on the SMT system Moses, evaluated various strategies for leveraging knowledge from multilingual linked data in the automatic translation of named entities, and analyzed best practices for multilingual linked datasets to maximize their benefit for multilingual and cross-lingual applications [12, 13]. S. Ye notes that, in statistical machine translation, language models benefit from monolingual corpora, but neural machine translation cannot exploit monolingual data as effectively. To deal with this problem, he proposed a semisupervised neural machine translation model based on phrase-level bilingual evaluation understudy (BLEU) selection. First, translations of unlabeled data are generated by statistical machine translation and neural machine translation models, respectively. Candidate translations are then selected by phrase-level BLEU, and the selected candidates are added to the labeled dataset for joint semisupervised training [14]. His experimental results demonstrated the effectiveness of the proposed algorithm in using unlabeled data; on the NIST Chinese-English translation task, the method achieved a significant improvement over a baseline trained only on labeled data [15, 16].

The innovations of this paper are as follows: a learning framework is provided under different machine learning methods. In the supervised learning method, the original HMM is improved first, yielding a more accurate HMM gain model. Transformation functions are used to introduce various kinds of context information into the training model; while the descriptive power of the model is improved, the training and labeling process remains consistent with the original model. Experimental results show that the new model greatly improves on the original model for both Chinese and English chunk recognition. The implementation details and performance analysis of the SVM algorithm are then discussed. The SVM algorithm is currently recognized as one of the best text classification algorithms, and in this paper the SVM also obtained the best chunk recognition results among the single classifiers. Comparing polynomial kernel functions of each order shows that the SVM does exhibit good generalization ability in high-dimensional feature spaces, and comparison with the HMM gain model shows that the SVM has a unique advantage in small-sample statistical pattern recognition. Emphasis is also placed on transductive SVMs and multiclass recognition algorithms.

2. Proposed Method

2.1. Ideas and Practice of Language Translation Based on an Intelligent Knowledge Base

Language is merely a carrier for the human expression of ideas; it has no meaning in itself, but humans have given it meaning, so the basic unit of translation should be the text rather than the word structure. A text may be a sentence, a segment, a group of sentences, or even a single word. Translation is a form of linguistic communication whose essence is to replace a text in one language with a text of the same meaning in another language; as long as the meaning is the same, the replacement can be made without examining word correspondence or grammar [17, 18]. As shown in Figure 1, translation is usually semantic: it converts semantics from the source language to the target language rather than form. For example, if discourse 1 in Chinese and discourse 2 in English both represent semantics A, then they can replace each other regardless of differences in words, phrases, and grammatical structure [19, 20].

Knowledge-base-based translation means that translation always takes semantics as its basis and goal and searches the knowledge base for equivalent or similar texts [21]. This is closer to the human way of thinking: when a person translates “where are you from,” he first searches his memory, finds a stored text whose semantics are the same as those of “where are you from,” and then immediately returns the result.
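To make this idea concrete, the following is a minimal sketch of knowledge-base lookup, not the paper's implementation: a translation memory maps source texts to target texts, and a query is answered by the semantically closest stored entry. The toy `knowledge_base` entries, the bag-of-words "semantics," and the threshold are illustrative assumptions.

```python
# Minimal sketch of knowledge-base translation lookup (illustrative only).
# The knowledge base maps source-language texts to target-language texts;
# a query is answered by the semantically closest stored entry.

def bag_of_words(text):
    """Represent a text as a set of lowercase tokens (toy semantics)."""
    return set(text.lower().replace("?", "").split())

def similarity(a, b):
    """Jaccard overlap between two token sets, in [0, 1]."""
    sa, sb = bag_of_words(a), bag_of_words(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def translate(query, knowledge_base, threshold=0.5):
    """Return the stored translation of the most similar entry, if any."""
    best_src = max(knowledge_base, key=lambda src: similarity(query, src))
    if similarity(query, best_src) >= threshold:
        return knowledge_base[best_src]
    return None  # no sufficiently similar text in the knowledge base

knowledge_base = {
    "where are you from": "你来自哪里",   # illustrative entries
    "what is your name": "你叫什么名字",
}
print(translate("Where are you from?", knowledge_base))  # -> 你来自哪里
```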

2.2. Machine Translation Model
2.2.1. Rule-Based Machine Translation System

Rule-based translation has long been a main line of machine translation research. Although statistics-based methods have had some impact on rule-based methods, this has not hindered their development [22]. The rule-based method mainly relies on a linguistic knowledge base: the formulation of rules depends on an artificially constructed knowledge base, and linguists repeatedly debug and modify the rules with respect to syntax, semantics, and other related aspects. Traditional rule-based methods use little corpus data, which inevitably results in low rule coverage and rule conflicts [23]. Current work improves the rule-based methodology in several ways. Rule construction now draws more heavily on stored knowledge; that is, common rules come from compiling many cases and analyzing data collections, as in error-driven learning algorithms. Whereas the traditional rule-based method emphasized coarse-grained, global natural language knowledge and a large rule base, current algorithms move in the direction of a “big dictionary, small rule base” [24]. In representing knowledge, today’s rule making places greater emphasis on the relationship between the source language and its context than traditional approaches did. Most traditional rule-based methods use a deterministic, either-or principle, resulting in poor system robustness; current rule algorithms introduce a scoring or probability function (such as IBM’s BLEU), which improves the robustness of the system. When English sentences are translated directly into Chinese, the logical order of the English sentence differs from the Chinese word order, so rule-based translation usually adjusts word order through reordering rules, sometimes at multiple levels. Rule-based machine translation involves many kinds of rules, such as reduplication rules, segmentation rules, syntactic analysis rules, tagging rules, semantic analysis rules, word conversion rules, structural conversion rules, and word selection rules.
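As an illustration of the reordering rules just mentioned, the following sketch applies pattern-based word-order adjustments over part-of-speech-tagged tokens; the rule set, tag inventory, and example are assumptions for demonstration, not the paper's actual rules.

```python
# Illustrative sketch of rule-based reordering (not the paper's rule set).
# Each rule matches a tag pattern and emits a new order for the matched
# span, e.g., moving an English post-verbal prepositional phrase before
# the verb, as Chinese word order typically requires.

REORDER_RULES = [
    # (pattern of tags, permutation of matched positions)
    (("V", "P", "N"), (1, 2, 0)),   # "eat at home" -> "at home eat"
]

def apply_rules(tagged):
    """tagged: list of (word, tag). Returns the reordered token list."""
    tokens = list(tagged)
    for pattern, order in REORDER_RULES:
        n = len(pattern)
        i = 0
        while i + n <= len(tokens):
            window = tuple(tag for _, tag in tokens[i:i + n])
            if window == pattern:
                tokens[i:i + n] = [tokens[i + j] for j in order]
                i += n                      # skip past the rewritten span
            else:
                i += 1
    return tokens

sentence = [("I", "N"), ("eat", "V"), ("at", "P"), ("home", "N")]
print([w for w, _ in apply_rules(sentence)])  # ['I', 'at', 'home', 'eat']
```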

Rule-based translation methods can be divided into three types: direct translation systems, interlingua systems, and transfer systems. The main difference among the three lies in the depth of source language analysis; what they have in common is that they all rely on large-scale dictionaries, large-scale source language analysis rules, intermediate conversion rules, and target language generation rules. The design principle of the direct translation system is a one-to-one correspondence between the source language (SL) and the target language (TL), so the direct translation system is not very reliable [25]. To address this problem, some current direct translation systems also add semantic factors or additional rules to improve the accuracy and readability of the translation. Most interlingua systems fall into two types: knowledge-based machine translation systems (KBMT) and non-knowledge-based systems. Common knowledge-based systems include Wilks’s system (mainly built on a tree structure composed of semantic units) and the KBMT systems from Yale University and Colgate University. Non-knowledge-based systems do not rely on a specific knowledge base but instead build an expression tool; the more common ones are DLT, CETA, and Rosetta. The transfer system analyzes the source language (SL) and target language (TL) separately, with the analysis of SL and TL performed only at the syntactic level using large-scale bilingual dictionaries and corpora. Among rule-based machine translation systems, a well-known experiment is the weather forecast translation by Isabelle in 1987.

2.2.2. Example-Based Machine Translation System

The principle and structure of the system are shown in Figure 2. The main idea is to first build a large-scale bilingual corpus (translation memory), match the most similar example sentence in the constructed example base according to a matching principle, find the target-language side of that similar example, then correct and reshape the target sentence, and finally obtain the target sentence corresponding to the source sentence.

In example-based translation, the most critical step is matching, and the source language does not need to be analyzed during the search; only analogical matching is needed [26]. Similarity research in example-based machine translation and the calculation of sentence similarity are introduced in Federica’s article. Sentence similarity is widely used in natural language processing, for example, in information retrieval, search engines, and machine translation systems. The main idea is to extract the keywords of a sentence, remove its redundant content, and compute the similarity of two sentences by comparing the similarity of their keywords. Example-based machine translation systems have achieved very good results in scalability and reliability, and the matching process uses “fuzzy matching” rather than exact matching, which improves the accuracy of the model [27]. In general, a translation system consists of a translation model and translation knowledge; translation knowledge constitutes the main part of the system, and how to analyze and synthesize translation knowledge is a key issue in building the system.
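The keyword-based fuzzy matching described above can be sketched as follows. The stopword list, example base, and overlap measure are illustrative assumptions; the paper does not specify the actual matcher.

```python
# Sketch of example-based matching (illustrative; the paper's matcher,
# example base, and stopword list are not specified).

STOPWORDS = {"the", "a", "an", "is", "are", "to", "of"}  # assumed list

def keywords(sentence):
    """Extract keywords by dropping stopwords (toy redundancy removal)."""
    return {w for w in sentence.lower().split() if w not in STOPWORDS}

def fuzzy_match(source, example_base):
    """Return (best example, its translation, score) by keyword overlap."""
    q = keywords(source)
    def score(example):
        k = keywords(example)
        return len(q & k) / len(q | k) if q | k else 0.0
    best = max(example_base, key=score)
    return best, example_base[best], score(best)

example_base = {
    "the weather is fine today": "今天天气很好",     # illustrative pairs
    "the train leaves at noon": "火车中午出发",
}
print(fuzzy_match("weather fine today", example_base))
```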

2.3. Algorithm Design of the Intelligent Translation System
2.3.1. Analysis of Semantic Features of Translation

To implement a networked intelligent translation system based on human-computer interaction, the software design of the intelligent translation system is first combined with semantic association rule mining to perform intelligent semantic filtering control and improve the intelligence of language translation. Assuming a text distribution for the sentences to be translated, the similarity is calculated as

Using the word-translation semantic structure reorganization method, the ontology structure is analyzed and the ontology mapping definition D is obtained, and the similarity of the name information and data-type information of the translation system is described as

The structural information in the sentence translation process is defined, and the text semantic topic-word recombination method is used to analyze the graph model of the machine translation process, constructing the semantic similarity feature quantity of the entire ontology mapping process as

The context-word similarity is defined, and list mining is used to group the text structure of machine translation as

According to the mapping relationships between different models, a set of fuzzy association rules for similar information is obtained. Within a set of category relationships, a topic concept set is obtained:

An equivalent semantic mapping relationship is established, and the estimation result for the topic concept set is as follows:

A fuzzy comprehensive evaluation of the semantic features of the translation is then conducted. Assuming a language evaluation set on [0, T], the translation system must adhere to the normalized vector set S; the number of elements in the two ontology fragments labeled in the evaluation set S is given, and the semantic feature vector β in the set rel can be represented by the following Δ function:

Here, rel is the set of semantic mapping features between ontologies, represented as binary distinct vectors. The ontology semantic structure mapping method is used to filter and control the semantic data in the machine translation process, and an expert semantic database is created, including topic-word attributes for semantic map analysis.
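Since the formulas above are given only in outline, the following hedged sketch illustrates one plausible form of such a similarity feature: cosine similarity between term-frequency vectors of two concept descriptions. The concept texts and the choice of cosine similarity are assumptions, standing in for the general idea rather than the paper's exact equations.

```python
# Hedged sketch of one plausible similarity feature for ontology mapping:
# cosine similarity between bag-of-words frequency vectors of two concept
# descriptions (an assumed stand-in, not the paper's exact formula).

import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words frequency vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Compare the name/description information of two mapped concepts.
concept_a = "translation system name string data type"
concept_b = "translation system label string type"
print(round(cosine_similarity(concept_a, concept_b), 3))
```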

2.3.2. Construction of an Expert Semantic Database

To construct an expert semantic database on a given mapping set, the topic-word concept set in the semantic dictionary is calculated, where the given mapping set defines the semantic ontology model; then,

Let X and Y be the topological distribution terminologies of the ontology model; the utilization rate of vocabulary in the semantic word database at time t(n) is calculated as

The fuzzy rule matching method is used to design the human-computer interaction, and the intelligent translation system is deployed on the network E (E′ is the given relationship between concepts in n′). Let the semantic relationship attribute set and the semantic equivalence relationship category set be defined, with the human-computer interaction network translation feature set to be translated and the set C given accordingly; a topic-word list is then constructed and fuzzy construction is performed in intelligent translation. The association rule set is

A normalized-matching set of the semantic distribution ontology is created, and standard term translation technology is used to calculate the vector set of association rules for the two term sets to be translated.

The semantic mapping relationships between concepts A and B are analyzed, the eigenvalues of the word vectors are computed, and a networked intelligent translation system based on the Google AJAX Search API is designed to provide a comprehensive conceptual framework for intelligent web translation [28].
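The fuzzy rule matching used for the human-computer interaction can be illustrated as below. The toy rule set, the use of `difflib`'s similarity ratio, and the threshold are assumptions; the paper's actual rules and matching function are not given.

```python
# Illustrative sketch of fuzzy rule matching for human-computer
# interaction (the rule set and threshold are assumptions). A user query
# is matched against topic-word rules with difflib's similarity ratio.

from difflib import SequenceMatcher

ASSOCIATION_RULES = {
    # topic word -> target-language rendering (toy rule set)
    "machine translation": "机器翻译",
    "knowledge base": "知识库",
}

def fuzzy_lookup(query, rules, threshold=0.6):
    """Return the rule whose key is most similar to the query, if any."""
    def ratio(key):
        return SequenceMatcher(None, query.lower(), key).ratio()
    best = max(rules, key=ratio)
    return (best, rules[best]) if ratio(best) >= threshold else None

print(fuzzy_lookup("machine translaton", ASSOCIATION_RULES))  # typo tolerated
```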

2.4. Fusion of Machine Translations

The basic idea of sentence-level translation fusion is to pool the N-best translations produced by multiple translation systems for the same source sentence, rerank them within the pooled set, and output the best translation after reranking [29]. The ranking criteria can draw on various information, such as the Bayes risk of a translation, its language model score, or other statistical features. Phrase-level translation fusion generally requires going inside the machine translation systems, for example, capturing the phrase translations each system uses during decoding; based on this internal information, phrase-level fusion combines the systems’ capabilities through redecoding. The basic idea of lexical-level translation fusion is that, for the same source sentence, high-quality translation fragments are selected from the N-best translations of different machine translation systems and fused into a new, higher-quality translation [30]. The most popular lexical-level method is decoding based on confusion networks: multiple translations are aligned into a confusion network, and the best path through the network yields the optimal combination of fragments. The confusion-network-based lexical-level fusion method outperforms other fusion methods on specific tasks [31]. Among the three strategies, sentence-level and lexical-level fusion are the two more popular methods, each with its own advantages and disadvantages; robustness and effectiveness are the two most important indicators. Here, robustness means that the method achieves stable fusion quality under different experimental conditions (such as different corpora or different input translation systems), while effectiveness means that the method achieves consistently high fusion quality. In practice, robustness and effectiveness often show opposite trends: better robustness often sacrifices some effectiveness, and better effectiveness cannot directly guarantee better robustness.
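Sentence-level fusion by reranking can be sketched as follows. The scoring function here is a toy bigram-count proxy for the language-model or Bayes-risk features mentioned above, and the weight, hypotheses, and counts are all assumptions for illustration.

```python
# Sketch of sentence-level translation fusion (illustrative scoring).
# N-best hypotheses from several systems are pooled and reranked by a
# combined score; a toy bigram-count fluency proxy stands in for the
# Bayes-risk or language-model features the text mentions.

def toy_lm_score(sentence, bigram_counts):
    """Sum of bigram counts as a crude fluency proxy (assumption)."""
    words = sentence.split()
    return sum(bigram_counts.get((a, b), 0) for a, b in zip(words, words[1:]))

def fuse(nbest_lists, bigram_counts, weight=0.5):
    """Pool all hypotheses and return the best under a combined score."""
    pool = [(hyp, score) for nbest in nbest_lists for hyp, score in nbest]
    def combined(item):
        hyp, sys_score = item
        return weight * sys_score + (1 - weight) * toy_lm_score(hyp, bigram_counts)
    return max(pool, key=combined)[0]

system_a = [("he goes to school", 0.9), ("he go to school", 0.8)]
system_b = [("he goes school", 0.85)]
bigrams = {("he", "goes"): 3, ("goes", "to"): 2, ("to", "school"): 4}
print(fuse([system_a, system_b], bigrams))  # "he goes to school"
```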

2.5. Transformation-Based Learning Methods

Transformation-based learning was first used for part-of-speech tagging, with results comparable to statistical methods, and it has also been used to identify basic noun phrases (baseNP) in English. Its core idea is to change the current local structural state according to learned transformation rules and their trigger conditions. The initial state of the input is usually preset to some optimal default value. Training uses a greedy algorithm, selecting at each iteration the rule that most reduces the error rate. This algorithm has a serious shortcoming: training time is too long, which becomes especially apparent when the training corpus is large. An improved algorithm (nTBL) has been proposed that greatly shortens training time without hurting performance. Its core idea is that, after the best transformation rule is selected and applied to the corpus, not all rules need to be rescored, because only the rules that apply to the changed parts of the corpus can change their scores; the scores of most rules remain unchanged. The affected rules are determined as follows: first, the examples changed by the best transformation rule are located in the corpus; their contexts are known as neighbor examples. Because neighbor examples affect each other, a change in any example affects the surrounding examples, so only the rules covering these examples can further change the corpus, and only these rules need to be rescored, which shortens the training time.
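A minimal sketch of the greedy transformation-based learning loop is given below, on a toy tagging task. The rule space (retag a word from one tag to another), the default tag, and the data are assumptions chosen only to show the "pick the rule that most reduces error, apply, repeat" structure described above; the nTBL rescoring optimization is not included.

```python
# Minimal transformation-based learning sketch for tagging (toy rules).
# Start from default tags, then greedily pick the rule that most reduces
# error on the training corpus, as the text describes.

def apply_rule(tags, words, rule):
    """rule = (word, from_tag, to_tag): retag matching positions."""
    w, src, dst = rule
    return [dst if (words[i] == w and t == src) else t
            for i, t in enumerate(tags)]

def errors(tags, gold):
    return sum(t != g for t, g in zip(tags, gold))

def train_tbl(words, gold, default_tag="N", max_rules=5):
    tags = [default_tag] * len(words)             # initial state
    rules = []
    candidates = [(w, default_tag, g) for w, g in zip(words, gold)]
    for _ in range(max_rules):
        best = min(candidates,
                   key=lambda r: errors(apply_rule(tags, words, r), gold))
        if errors(apply_rule(tags, words, best), gold) >= errors(tags, gold):
            break                                  # no rule reduces error
        tags = apply_rule(tags, words, best)
        rules.append(best)
    return rules, tags

words = ["the", "dog", "runs", "the", "cat", "runs"]
gold  = ["D",  "N",   "V",    "D",   "N",   "V"]
print(train_tbl(words, gold))
```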

3. Experiments

3.1. Experiment Purpose

This article builds a system that translates English into Chinese. The main aim is to create a rule-based system in a short time by using existing natural language processing tools, so this article mainly introduces the use of those tools.

3.2. System Interface Design

This experiment is configured on the Windows operating system, and all commands and functions are executed in the Windows interface. The Chinese-English aligned corpus follows the instructions at http://www.nlp.org.cn. The alignment tool is GIZA++; although GIZA++ normally runs on the Linux platform, this article uses a batch-processing tool that integrates GIZA++ into Windows. English tokenization differs from Chinese word segmentation: English words are separated by spaces and punctuation marks, and some tokenizers also handle the contractions within sentences, such as expanding certain words (Let’s into Let us). The final database is shown in Figure 3.
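The tokenization step can be sketched as below, matching the "Let's into Let us" example; the contraction table used in the experiment is not given, so the map here is an assumption.

```python
# Illustrative English tokenization with contraction expansion (the full
# expansion table used in the experiment is not given; this map is an
# assumption).

import re

CONTRACTIONS = {"let's": "let us", "don't": "do not", "it's": "it is"}

def tokenize(sentence):
    """Expand contractions case-insensitively, then split on whitespace
    and punctuation."""
    def expand(match):
        return CONTRACTIONS[match.group(0).lower()]
    pattern = re.compile("|".join(re.escape(c) for c in CONTRACTIONS),
                         re.IGNORECASE)
    expanded = pattern.sub(expand, sentence)
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", expanded)

print(tokenize("Let's go, it's late."))
# ['let', 'us', 'go', ',', 'it', 'is', 'late', '.']
```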

3.3. Sample Selection Experiment Steps

In clause recognition, an important question is whether every word in each sentence needs to be a research object. In Xavier’s system, the boundary words of each basic phrase are used as the start and end points of clause research. However, if all words are used as samples in training, extra information may be introduced, so this article conducted a simple sample-selection comparison experiment for this problem. The experiment targets clause recognition and is divided into four small experiments (a construction sketch follows this list):
(1) The training sample set contains all words, but the test sample set contains only phrase boundary words, with other words labeled as not starting a clause.
(2) The training sample set contains all words, and the test sample set also contains every word in each sentence.
(3) The training sample set contains only phrase boundary words, but the test sample set contains every word.
(4) Both the training and test sample sets contain only phrase boundary words, with other words labeled as not starting a clause.
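The four configurations amount to filtering the training and test sets to phrase boundary words or not. The sketch below assumes a token layout of (word, is_phrase_boundary, clause_start_label), which is an illustrative data format, not the experiment's actual one.

```python
# Sketch of the four sample-selection configurations (data layout
# assumed: each token is (word, is_phrase_boundary, clause_start_label)).

def select(tokens, boundary_only):
    """Keep all tokens, or only phrase-boundary tokens; tokens filtered
    out are treated as 'not clause-initial' by construction."""
    if boundary_only:
        return [t for t in tokens if t[1]]
    return list(tokens)

def make_experiment(train_tokens, test_tokens, train_boundary, test_boundary):
    return (select(train_tokens, train_boundary),
            select(test_tokens, test_boundary))

corpus = [("he", True, "B"), ("often", False, "O"), ("says", True, "O")]
# Experiments (1)-(4) from the list above:
configs = [(False, True), (False, False), (True, False), (True, True)]
for i, (tr, te) in enumerate(configs, 1):
    train, test = make_experiment(corpus, corpus, tr, te)
    print(f"experiment {i}: {len(train)} train / {len(test)} test samples")
```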

4. Discussion

4.1. Analysis of the English Translation System Simulation Experiment
4.1.1. Accuracy Test Analysis

The center frequency of phrases taken from the English translation is 12 kHz, the maximum length of a phrase combination is set to 2,000 bits, the English semantic concept set contains 245 samples, and HowNet is used as the lexical entity set. Under this simulation environment and these parameter settings, the intelligent English translation simulation of phrase translation combination is carried out, with the accuracy of the translation output and the recall rate of English semantic information used as test indicators. As shown in Figure 4, the test results were unexpected: even as statistical machine translation models grow in popularity, the rule-based “Huajian” English-Chinese machine translation system developed by Chinese researchers surpasses in translation quality the Google English-Chinese machine translation system, which has won many international evaluations.

4.1.2. System Performance Analysis

As shown in Figure 5, the accuracy and recall of the proposed network-based intelligent translation system are higher, and its performance is better. In terms of quality, the monolingual method can independently give a quantitative evaluation on continuous domains for different linguistic categories and detection points. The bilingual conversion method diagnoses the system through binary right/wrong statistics of bilingual conversions, whose granularity is relatively coarse by comparison. This makes the monolingual method more suitable for quantitative analysis on continuous domains.

4.1.3. Comparative Analysis of Application Methods

As shown in Figure 6 and Table 1, better results are achieved when both the training and test samples consider only phrase boundary words. Comparing the four cases shows that when only phrase boundary words are the research object during training, the effect is very poor if the test samples contain every word. The likely reason is that the training model did not extract feature information for all words, especially non-boundary words, so errors occur when discriminating these words. Therefore, in the system implementation, this article adopts the fourth configuration.

4.1.4. Analysis of the Influence of Smooth Variance on the Recognition Effect

As shown in Table 2 and Figure 7, the variance in the experiment is also taken as 2/N, 30/N, 100/N, and 2N. The results at 2/N, 30/N, and 100/N are the same as those when the variance is 0.5, and the results when the variance is 2N also conform to the trend in Table 2, which is close to the effect of the smoothing algorithm. The experimental results show that with Gaussian prior smoothing, the two classes can hardly be distinguished when the variance is small; with the number of iterations held constant, the value gradually increases as the variance increases, and the recall rate also increases gradually, but compared with the model without smoothing, the effect remains relatively poor.
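The effect of the variance can be understood through the standard equivalence between a Gaussian prior over model weights and an L2 penalty: a prior with variance σ² adds ||w||²/(2σ²) to the training loss, so a small variance pulls all weights toward zero and the classes become hard to distinguish, consistent with the observation above. The sketch below uses a toy logistic model; the data, the weights, and the choice N = 100 are illustrative assumptions.

```python
# Hedged sketch of Gaussian prior smoothing: a Gaussian prior with
# variance sigma2 over the weights is equivalent to adding an L2 penalty
# ||w||^2 / (2 * sigma2) to the training loss. Model and data are toys.

import math

def penalized_loss(weights, data, sigma2):
    """Negative log-likelihood of a logistic model + Gaussian prior term."""
    nll = 0.0
    for features, label in data:                  # label in {0, 1}
        z = sum(w * x for w, x in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-z))
        nll -= math.log(p if label == 1 else 1.0 - p)
    prior = sum(w * w for w in weights) / (2.0 * sigma2)
    return nll + prior

data = [((1.0, 0.0), 1), ((0.0, 1.0), 0)]
for sigma2 in (2 / 100, 30 / 100, 100 / 100, 2 * 100):  # 2/N .. 2N, N=100
    print(sigma2, round(penalized_loss([1.5, -1.5], data, sigma2), 3))
```

Small σ² makes the prior term dominate, forcing the weights (and hence the class separation) toward zero, which mirrors the reported behavior as the variance shrinks.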

5. Conclusions

This article introduces the concept of knowledge-based intelligent English translation with artificial intelligence and explains several implementation details so that readers can better understand the idea. The translation method based on an intelligent knowledge base is closer to human understanding of language, and in theory its results are closer to human translation. However, because the knowledge base available to computers is still small and much of the visual and auditory information available to humans is difficult to obtain and analyze, there is still a long way to go before machine results match human translation.

This paper designs a networked intelligent artificial intelligence translation system consisting of two parts: the machine translation algorithm design and the system software development and design. The semantic ontology structure mapping method is used to filter and control semantic data in the machine translation process. The intelligence of translation is improved by applying the networked information interaction mode to design the human-computer interaction of the intelligent translation system, building a corpus of intelligently translated words, implementing intelligent sentence division over the semantic topic concept set, and employing topic feature matching to realize the algorithm design of networked intelligent translation. The hardware of the networked intelligent human-computer translation system, integrated on an industrial control computer, consists mainly of data processing units, A/D modules, DMA control modules, and output modules. Test results show that the system can effectively implement intelligent translation on the network, improves the reliability and intelligence of machine translation, and has good application value.

This paper examines the translation methods of a networked artificial intelligence translation system that uses intelligent knowledge to improve the intelligence of English translation. The system design includes the machine translation algorithm design and the software design. The combination of semantic feature analysis and phrase translation is used to optimize the automatic English translation algorithm, and the software of the automatic translation system is designed for an embedded environment. The research shows that designing the automatic English translation system in this way can improve the accuracy and intelligence of translation.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare no conflicts of interest.