#### Abstract

This work applies deep reinforcement learning and multi-objective optimization algorithms to the allocation of English distance teaching resources in order to increase allocation efficiency. Building on an analysis of existing regression correction methods, this paper examines partition regression correction in depth and proposes two neighborhood regression correction algorithms. The neighborhood concept extends the original concept of a partition and addresses several problems that arise in partition correction. To reduce the model complexity of the neighborhood regression algorithm, this paper applies structural risk minimization and principal component extraction. The simulation results suggest that the proposed approach, which is based on deep reinforcement learning and multi-objective optimization, may significantly improve the effect of English distance teaching resource allocation.

#### 1. Introduction

Education assessment, fundamental education theory, and education development are recognized as the three most important research subjects in education today. Among these, education assessment plays a crucial role in educational development and reform, as well as in education administration and decision-making, and it has therefore attracted a great deal of attention from the relevant authorities of several countries. Educational assessment is also known as teaching evaluation or educational evaluation. It refers to the process of determining the value of education by collecting, sorting, processing, and evaluating educational material methodically, scientifically, and exhaustively according to particular value standards and educational goals.

Linked Open Data (LOD), based on Semantic Web and ontology technology, has become one of the most important ways to publish high-quality linked semantic data, and it is widely used in intelligent services such as semantic search and personalized recommendation. Linked data connects resource objects described in RDF through URIs, so that unstructured documents are marked up as structured data with semantics that both machines and users can understand and work with. People can obtain digital resources directly through the HTTP URI mechanism. The resource objects released by linked data technology are sharable, reusable, structured, and standardized, which is conducive to integrating isolated teaching data, establishing links between course resources in the same and different fields, and realizing cross-platform and cross-system communication. Through ontology reasoning and semantic expansion, the semantics of query requests in a distributed environment are clarified, the knowledge resources needed by users are retrieved from linked data using a semantic index structure, and further linked data can be discovered. The rich and extensive basic research in semantic science lays a solid foundation for the advanced analysis of semantic structure in various sciences and enables the Semantic Web to provide feasible ideas for managing all kinds of knowledge data. Research on the semantic data of the World Wide Web, at home and abroad, has covered disciplines such as biological science, information science, philosophy, geographic information, and art. Differences between fields, levels, regions, and ways of thinking lead to the emergence of large-scale heterogeneous structured semantic data. How to perform efficient data integration, data cleaning and expansion, data storage and indexing, as well as data query, search, browsing, and visualization for complex and heterogeneous knowledge data is the focus of knowledge management research on the Semantic Web.

In this paper, deep reinforcement learning and multi-objective optimization algorithms are applied to the allocation of English distance teaching resources, and an English distance teaching resource management system is constructed to improve the effect of English distance teaching.

#### 2. Related Work

The core of the file management system is file storage. Researchers such as Patterson summarized several typical file storage systems, including direct storage, object storage, and disk array, compared the advantages and disadvantages of each storage method, and put forward the prospect of cloud storage [1]. The widespread use of file management systems has resulted in the accumulation of a large number of resources and materials on the Internet, and the rapidly increasing number of files also has higher requirements for storage technology. How to store massive data has become a hot topic [2]. Distributed storage technology with high reliability and scalability was born and promoted rapidly, pushing the development of file management technology to a new level [3]. There are two very popular distributed system frameworks, namely, HDFS (Hadoop Distributed File System) and FastDFS (Fast Distributed File System) [3]. HDFS can meet the system design requirements of high throughput and large amount of data. In comparison, FastDFS is a lightweight distributed file system [4]. These two distributed storage systems have their own merits and need to be used in combination with specific application scenarios. In recent years, cloud computing has become a research hotspot in the computer field [5]. As a new concept derived from cloud computing, cloud storage is also favored by researchers [6]. Cloud storage combines virtualization and distributed storage, which can organically integrate a large number of storage devices and provide file storage services to the outside world [7].

Data can be classified as structured or unstructured based on how it is stored. Database systems perform admirably in storing and retrieving structured data. The storage, administration, and retrieval of unstructured data, on the other hand, cannot be performed directly by a database system. In response to this need, many full-text retrieval methods have been developed, and full-text retrieval technology is one of the key research directions in network retrieval [8]. Some mature technical frameworks have also emerged. The most widely used is Lucene, an open source full-text search engine toolkit from the Apache Software Foundation that began as a sub-project of the Jakarta project. The purpose of Lucene is to provide software developers with a simple and easy-to-use toolkit for implementing full-text search functions in target systems, or for building a complete full-text search engine on top of it.

The online teaching system based on WebRTC and Node.js starts from the fundamental mission of teaching, which is to cultivate well-rounded talent [9]. The original intention of this online teaching system is therefore to cultivate students’ learning ability: the Internet is used as a means to guide students to recognize and understand the world, to learn knowledge in an all-round way, to stimulate their initiative and interest in learning, and to encourage them to learn through exploration. The system not only helps teachers impart knowledge to students but, more importantly, improves students’ communication and problem-solving skills [10]. WebRTC integrates an audio/video engine, and the main purpose of the WebRTC project is to allow web developers to easily develop real-time multimedia applications in the browser. Web developers only need to write simple JavaScript code for the media stream processing pipeline [11]. Compared with other teaching systems, WebRTC has good real-time audio and video performance, so it can improve the communication experience between teachers and students during class. Thanks to NPM, the package manager for Node.js, building a web program is reported to take less time than with Java. In addition, Node.js runs on an event loop mechanism and handles highly concurrent, I/O-intensive workloads well, which makes it suited to the excessive server overhead caused by the rapid growth of access volume in current teaching systems [12].

At present, there are usually two ways to transmit multimedia information such as video and audio over the network: downloading and streaming. With the download method, the user inevitably faces two problems: the limited storage capacity of the client and playback delay. Streaming, which plays content while it is being downloaded, overcomes the shortcomings of downloading first and then playing, saves time and storage space, and makes it possible to learn online through audio and video [13]. Streaming media technology is increasingly widely used and has become one of the most important technologies for network transmission of audio and video data, especially in network education systems that use courseware as the main teaching resource. It eases the problem of transmitting massive audio and video data over medium and low bandwidth, so it is used more and more widely not only in online education but also in traditional education. In light of these benefits, a digital interactive learning platform based on network and streaming media technology, which draws on the flexibility, openness, breadth, and timeliness of the network, can allow students and instructors to engage with one another. Interactive learning over the network can take different forms, including not only classic techniques such as e-mail and forums but also real-time learning exchanges through audio and video [14], and not only streaming media courseware but also online tests that assess learning effects. This not only overcomes the limitations of time and space but also mobilizes students’ interest and enthusiasm for learning, making personalized learning possible. Online education enables students to choose the teaching resources they need most according to their own needs, interests, and existing knowledge structure, and to learn independently without being bound by time and space [15].

#### 3. Image Color Correction based on Deep Reinforcement Learning

A new three-dimensional interpolation algorithm based on fuzzy logic is proposed, and a new interpolation range and new interpolation rules are defined. On the basis of information theory, the literature generalizes the tetrahedral interpolation algorithm and proposes a linear interpolation algorithm for color correction based on the maximization of probability entropy. That method is novel in design and achieves better results than the traditional tetrahedral interpolation algorithm. Starting from the neighborhood of the color point to be corrected, this section defines the k-nearest neighbor fuzzy entropy on the sample set. Based on an investigation of the physical properties of the device’s color gamut, an improved interpolation approach based on maximum fuzzy entropy estimation is then provided. The interpolation algorithm uses multiple sample points and imposes constraints based on fuzzy logic. It does not need to locate the enclosing geometry, which addresses a shortcoming of the tetrahedral interpolation algorithm.

We assume that a set of sample color points participates in the 3D interpolation, where each sample consists of a color point in the source color space and the color point it maps to in the destination color space. Given a color point of the source color space that is to be corrected, the generalized three-dimensional interpolation process is: (1) find the weights that satisfy formula (1); (2) substitute these weights into formula (2) to obtain the mapping estimate of the point to be corrected.
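The two-step process above can be sketched for the tetrahedral special case. This is a minimal illustration, assuming that formula (1) requires the weights to sum to 1 and to reproduce the point to be corrected as a weighted combination of the source samples, and that formula (2) applies the same weights to the mapped samples. The function names and the sample values are hypothetical.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def sub(p, q):
    """Component-wise difference p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def tetra_weights(vertices, x):
    """Step (1): weights w with sum(w) == 1 and sum(w_i * vertex_i) == x."""
    p0, p1, p2, p3 = vertices
    e1, e2, e3, d = sub(p1, p0), sub(p2, p0), sub(p3, p0), sub(x, p0)
    total = det3(e1, e2, e3)
    if abs(total) < 1e-12:
        raise ValueError("degenerate tetrahedron")
    # Cramer's rule for d = w1*e1 + w2*e2 + w3*e3
    w1 = det3(d, e2, e3) / total
    w2 = det3(e1, d, e3) / total
    w3 = det3(e1, e2, d) / total
    return (1.0 - w1 - w2 - w3, w1, w2, w3)


def interpolate(vertices, targets, x):
    """Step (2): apply the weights to the mapped (destination) points."""
    weights = tetra_weights(vertices, x)
    dims = len(targets[0])
    return tuple(sum(w * t[k] for w, t in zip(weights, targets))
                 for k in range(dims))


# Hypothetical sample points: four source colors and their mapped colors.
source_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target_points = [(1.0, -3.0, 0.0), (3.0, -3.0, 0.0), (1.0, -2.0, 0.0), (1.0, -3.0, 0.5)]
weights = tetra_weights(source_points, (0.2, 0.3, 0.1))
estimate = interpolate(source_points, target_points, (0.2, 0.3, 0.1))
```

With exactly four samples the weights are unique. With more than four neighbors the system is underdetermined, which is why an additional criterion such as entropy maximization is needed to pick one set of weights.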

Usually, the tetrahedra in the tetrahedral interpolation algorithm are obtained by dividing cubes. Simulation experiments in the literature show that the accuracy of tetrahedral interpolation and cube interpolation is not much different, but tetrahedral interpolation has advantages in execution cost and calculation speed.

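In information theory, entropy describes the average amount of information in an observation space. When only partial information about a distribution is available, the distribution that maximizes the entropy subject to the known constraints is selected; this criterion is called the principle of maximum entropy. We assume a discrete random variable *x* whose values occur with given probabilities. The principle of maximum entropy can then be expressed as the maximization of the entropy over these probabilities.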
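As a small numerical illustration of this principle, the sketch below evaluates the Shannon entropy, H(p) = -Σ p_i log p_i, for a few candidate distributions over four outcomes. The candidate distributions are made up; with no constraint other than the probabilities summing to 1, the uniform distribution attains the largest entropy.

```python
import math


def shannon_entropy(probs):
    """H(p) = -sum(p_i * log(p_i)), with 0 * log(0) taken as 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical candidate distributions over four outcomes.
candidates = {
    "uniform": [0.25, 0.25, 0.25, 0.25],
    "skewed": [0.7, 0.1, 0.1, 0.1],
    "certain": [1.0, 0.0, 0.0, 0.0],
}
entropies = {name: shannon_entropy(p) for name, p in candidates.items()}
best = max(entropies, key=entropies.get)
```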

The algorithm based on maximum probability entropy estimation treats the weights in formula (1) as probabilities and writes them in the form of probability entropy, as shown in formula (3). The weights that maximize this entropy while satisfying the constraints of formula (1) are then found, and the final estimate is obtained from them. Note that formula (1) already contains the constraints in formula (3).

There are many definitions of fuzzy entropy, and nearly 20 kinds of fuzzy entropy are introduced in the literature alone. The fuzzy entropy of the fuzzy set A consisting of *x* should satisfy the following five conditions:

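Among them, $A^{*}$ is called the sharpened set of $A$, and it satisfies the following two conditions: (1) if $\mu_{A}(x) \geq 0.5$, then $\mu_{A^{*}}(x) \geq \mu_{A}(x)$; (2) if $\mu_{A}(x) \leq 0.5$, then $\mu_{A^{*}}(x) \leq \mu_{A}(x)$.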

Although many fuzzy entropies have been defined, not all fuzzy entropies satisfy the above five conditions. The following fuzzy entropy is proposed, which satisfies the above five conditions.

Among them, C is a constant, which can usually be set to 1.

So far, the k-nearest neighbor fuzzy entropy has been defined, and formulas (1) and (4) can be combined to obtain a maximum fuzzy entropy algorithm, which is denoted as LIMFE-1 and is the initial stage of the LIIMFE algorithm.

Once the interpolation coefficients are regarded as degrees of membership, each coefficient should lie between 0 and 1, where it represents the similarity between an interpolation sample point and the color point to be corrected. The constraints of formula (3) additionally require the coefficients to sum to 1. This constraint can be interpreted as requiring the color point to be corrected to lie in the convex hull formed by its k-nearest-neighbor sample set.

The color gamut surface is usually not convex; for example, the color gamut surface of the English teaching resource database is uneven. Figure 1 shows a mesh diagram of the color gamut of an English teaching resource database in Lab space.

Suppose that a color point to be corrected lies on a concave part of the color gamut surface. The region formed by its k-nearest-neighbor interpolation candidate points is then clearly concave, which conflicts with the convex hull assumption above.

In summary, the restriction that the weights sum to 1 in formula (1) can be removed, that is, the constraints in formula (3) can be abandoned, and the new constraints are those given in formula (5).

Following the centroid defuzzification method in fuzzy logic, the estimate in formula (2) is changed to the form given in formula (6).

Among them, . Here, .

So far, the improved linear interpolation algorithm based on maximizing fuzzy entropy (LIIMFE) reduces to maximizing formula (4) subject to the constraints of formula (5). The final estimated form of LIIMFE is shown in formula (6).
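The centroid-style estimate can be illustrated with a short sketch. It assumes that formula (6) is a membership-weighted average normalized by the sum of the memberships, which is the usual centroid defuzzification form. The membership values and Lab targets below are made up for illustration and are not the result of the maximum fuzzy entropy optimization itself.

```python
def centroid_estimate(memberships, targets):
    """Normalized weighted average: sum(w_i * y_i) / sum(w_i)."""
    if len(memberships) != len(targets):
        raise ValueError("one membership value is needed per target point")
    if any(w < 0.0 or w > 1.0 for w in memberships):
        raise ValueError("memberships must lie in [0, 1]")
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("at least one membership must be positive")
    dims = len(targets[0])
    return tuple(sum(w * y[k] for w, y in zip(memberships, targets)) / total
                 for k in range(dims))


# Hypothetical memberships of three neighbors and their mapped Lab values.
memberships = [0.9, 0.6, 0.3]  # these do not need to sum to 1
lab_targets = [(50.0, 10.0, -10.0), (60.0, 20.0, 0.0), (40.0, 0.0, 10.0)]
estimate = centroid_estimate(memberships, lab_targets)
halved = centroid_estimate([w / 2 for w in memberships], lab_targets)
```

Because of the normalization, scaling all memberships by a common factor leaves the estimate unchanged, which is what makes it possible to drop the sum-to-one constraint.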

For the constant C in formula (4), *C* = 1 can generally be selected. To further enhance the final correction effect, structural risk minimization from statistical learning can be used. Structural risk minimization aims to minimize a risk functional that accounts for both the empirical risk and the confidence bound. The loss function in the least squares method is regarded as the empirical risk of regression, and the loss function in the maximum likelihood method is regarded as the empirical risk of density estimation. Here, formula (7) can be considered the empirical risk of the maximum fuzzy entropy algorithm.

The SRM model selection criterion in statistical learning shows that the estimation error is composed of empirical risk and penalty factor, and the product form in model selection is shown in formula (8). Therefore, the constant C can be considered as this penalty factor.

Among them, *h* is the model complexity (the VC dimension in statistical learning theory), and *k* is the number of samples. The literature gives the estimation formula of model complexity in k-nearest neighbor regression.

The error is computed from the Lab value of the correction point and the value estimated by each algorithm, as shown in formula (9).
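A minimal sketch of such an error measure follows. It assumes that formula (9) is the Euclidean distance between the two Lab values, which is the CIE76 color difference; the function names and sample values are hypothetical.

```python
import math


def delta_e_cie76(lab_measured, lab_estimated):
    """Euclidean distance between two (L, a, b) triples."""
    return math.sqrt(sum((m - e) ** 2
                         for m, e in zip(lab_measured, lab_estimated)))


def mean_delta_e(measured, estimated):
    """Average color difference over a set of correction points."""
    errors = [delta_e_cie76(m, e) for m, e in zip(measured, estimated)]
    return sum(errors) / len(errors)


# Hypothetical measured and estimated Lab values for two correction points.
measured = [(50.0, 0.0, 0.0), (70.0, 10.0, -5.0)]
estimated = [(53.0, 4.0, 0.0), (70.0, 10.0, -5.0)]
single_error = delta_e_cie76(measured[0], estimated[0])
average_error = mean_delta_e(measured, estimated)
```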

When multiple regression is applied directly to the neighborhood correction, the result is the same as for the partitioned regression correction with tiny partitions discussed above: the accuracy becomes worse rather than better. This is because the components of the mapped data are highly correlated. Taking the Lab sample set from a color correction of an English teaching resource database as an example, the correlation coefficient between the first component and the fourth component in formula (5) is 0.9996, and the correlation coefficient between the second component and the seventh component is 0.9945. When the variables in the design matrix $X$ are highly correlated, the determinant of $X^{T}X$ is close to zero, and the computed inverse of $X^{T}X$ contains serious rounding errors. The calculation of formula (6) is then unreliable, which is part of the reason why the correction accuracy of the partition regression correction is poor when the partition is small. In addition, when the partition is small, the number of samples is relatively small, and a more complex regression model will overfit, which reduces the prediction accuracy of the model on the test set.
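The effect of correlated components can be reproduced with a small sketch. The data are made up: the second column is the square of the first, as a stand-in for a polynomial term built from a lightness-like component. The sketch reports the correlation coefficient and the determinant of the 2×2 Gram matrix after scaling each column to unit length, which equals 1 for orthogonal columns and approaches 0 as the columns become collinear.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def gram_det_unit_columns(u, v):
    """det(X^T X) for X = [u v] after scaling both columns to unit length."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    return 1.0 - cosine * cosine


# Hypothetical component and a polynomial term derived from it.
lightness = [20.0, 35.0, 50.0, 65.0, 80.0]
lightness_squared = [value * value for value in lightness]

correlation = pearson(lightness, lightness_squared)
determinant = gram_det_unit_columns(lightness, lightness_squared)
orthogonal_determinant = gram_det_unit_columns([1.0, 0.0], [0.0, 1.0])
```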

To get a better approximation, we define a loss between the ideal response *y* given an input **X** and the response given by the learning machine, as shown in formula (10):

Formula (10) takes the mathematical expectation of the loss and is called the risk functional, where $L(y, f(\mathbf{X}))$ is the loss function and $F(\mathbf{X}, y)$ is the joint probability distribution function. The goal of learning is to find a suitable function $f$ that minimizes the risk functional. For the least squares method in multiple regression, the squared error $(y - f(\mathbf{X}))^{2}$ is used as the loss function. Because the joint distribution is unknown in practice, the expectation is replaced by the average loss over the training samples, and minimizing this average is called the empirical risk minimization (ERM) principle. The empirical risk of multiple regression is given in formula (11).

The algorithm in this section further adopts the principle of structural risk minimization in the neighborhood to obtain the regression coefficient, as shown in formula (12).

In formula (12), one term is the empirical risk and the other is the confidence range, *h* is the VC dimension, and *k* is the number of samples. Combined with formula (11), formula (12) shows that the confidence range constrains the empirical risk of the least squares method, so that the matrix to be inverted is no longer nearly singular. Thus, the problem of correlated data components in smaller partitions or neighborhoods is avoided, and the complexity of the regression model is limited.
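The stabilizing effect of a penalty on the least squares solution can be sketched with a ridge-style term. This is an illustration under assumptions: the quadratic penalty `lam` stands in for the confidence term, whose exact form in formula (12) is not reproduced here, and the two nearly collinear input columns are made up.

```python
import math


def penalized_least_squares_2d(x1, x2, y, lam):
    """Solve (X^T X + lam * I) b = X^T y for a two-column X.

    Returns the coefficients and the determinant of the penalized matrix.
    """
    a11 = sum(a * a for a in x1) + lam
    a22 = sum(b * b for b in x2) + lam
    a12 = sum(a * b for a, b in zip(x1, x2))
    g1 = sum(a * t for a, t in zip(x1, y))
    g2 = sum(b * t for b, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    b1 = (a22 * g1 - a12 * g2) / det
    b2 = (a11 * g2 - a12 * g1) / det
    return (b1, b2), det


# Hypothetical, nearly collinear inputs and a response.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.01, 1.99, 3.02, 3.98]
y = [2.0, 4.0, 6.0, 8.0]

coef_plain, det_plain = penalized_least_squares_2d(x1, x2, y, lam=0.0)
coef_penalized, det_penalized = penalized_least_squares_2d(x1, x2, y, lam=1.0)
norm_plain = math.hypot(*coef_plain)
norm_penalized = math.hypot(*coef_penalized)
```

Without the penalty the determinant is tiny compared with the size of the matrix entries, so the solution is sensitive to rounding. With the penalty the determinant moves away from zero and the coefficient vector shrinks.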

Another view of least squares in multiple regression considers the overdetermined system $X\beta \approx y$, whose solution $\beta$ makes $X\beta$ as close as possible to $y$. That is, the least squares method assumes that the error $\Delta y$ lies only in $y$, and its solution minimizes the sum of squared errors $\|\Delta y\|^{2}$ subject to $X\beta = y + \Delta y$.

The least squares method only considers the inaccuracy of $y$; however, $X$ also inevitably contains noise in practical problems. In color calibration, for example, both the spectrophotometer measurement and the printing paper introduce noise, so the input $X$ is often not accurate either. To handle this problem reasonably, the total least squares method is proposed.

The solution of the total least squares method is given by formula (14).

Using the Lagrange multiplier method, the optimization problem in formula (14) can be transformed into formula (15).

From formula (15), the empirical risk of the total least squares method can be taken as formula (16).

Figure 2 shows the difference between least squares and total least squares in the one-dimensional case. Compared with the least squares solution, the total least squares solution has a shorter projection distance: its residual is perpendicular to the regression line and is composed of the errors of both *X* and *y*. Therefore, when the error is measured as the distance from each sample to the regression line, the regression error of the total least squares method is smaller than that of the least squares method.
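The one-dimensional comparison can be reproduced numerically. The sketch below fits a line by ordinary least squares and by total least squares (orthogonal regression, using its closed-form slope for equal error variances in both variables) and compares the sum of squared perpendicular distances. The data points are made up, and the closed form assumes a nonzero sample covariance.

```python
import math


def centered_moments(xs, ys):
    """Means and centered second moments of the sample."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def ols_line(xs, ys):
    """Least squares: minimizes vertical residuals (error in y only)."""
    mx, my, sxx, _, sxy = centered_moments(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


def tls_line(xs, ys):
    """Total least squares: minimizes perpendicular residuals."""
    mx, my, sxx, syy, sxy = centered_moments(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def sum_sq_vertical(xs, ys, slope, intercept):
    return sum((y - slope * x - intercept) ** 2 for x, y in zip(xs, ys))


def sum_sq_perpendicular(xs, ys, slope, intercept):
    return sum_sq_vertical(xs, ys, slope, intercept) / (1 + slope ** 2)


# Hypothetical noisy samples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.2]
ols = ols_line(xs, ys)
tls = tls_line(xs, ys)
perp_ols = sum_sq_perpendicular(xs, ys, *ols)
perp_tls = sum_sq_perpendicular(xs, ys, *tls)
vert_ols = sum_sq_vertical(xs, ys, *ols)
vert_tls = sum_sq_vertical(xs, ys, *tls)
```

Each method is optimal under its own error measure: the total least squares line has the smaller perpendicular error, while the least squares line has the smaller vertical error.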

Following the explanation above, it is recommended that, for color correction, the residual of the total least squares method be used as the empirical risk in structural risk minimization. On the one hand, the structural risk minimization principle decreases the real risk by suppressing the effect of the correlation between the data components in the neighborhood. On the other hand, the total least squares method accounts for errors in both the input and output data, making the correction more accurate. That is, the algorithm in this section can be described by formula (17).

For the confidence term in structural risk minimization, we choose a penalty term that shrinks the regression coefficients.

Then, formula (17) becomes formula (19).

The output data of formula (19) (the target color space data in color correction) is one-dimensional, but the target color space data is often multi-dimensional. In actual calibration, a simple method is to apply the algorithm separately to each dimension of the target color space and obtain a regression correction coefficient for each.

The concept of feature space originates from classification algorithms. In order to generalize a linear classification algorithm to a nonlinear one, the points in the Euclidean space can be mapped into a high-dimensional Hilbert space *H* (the feature space), that is, a complete normed linear space equipped with an inner product. The mapping is denoted $\phi: \mathbb{R}^{n} \rightarrow H$.

An algorithm defined in the original space can then be applied to the mapped sample set in the new Hilbert space, as shown in formula (21).

It is generally difficult to estimate the specific form of the nonlinear mapping, so a kernel function is introduced for this purpose. The kernel function transforms the nonlinear problem of the original space into a linear operation involving only inner products in the feature space. The process can be written as formula (22).

The kernel trick is that the specific form of the mapping does not need to be known: the inner product of two mapped points in the feature space is computed directly by evaluating the kernel function on the original points.

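Some commonly used kernel functions include (where $c$, $d$, and $\sigma$ are all real constants): (1) The homogeneous polynomial kernel function is $K(x, z) = (x \cdot z)^{d}$. The inhomogeneous polynomial kernel function is $K(x, z) = (x \cdot z + c)^{d}$. (2) The Gaussian kernel function is $K(x, z) = \exp\left(-\|x - z\|^{2} / (2\sigma^{2})\right)$.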

The kernel function has a simple property: a linear combination of kernel functions with non-negative coefficients is still a kernel function.
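The kernel trick can be checked on a small example. The sketch below uses the standard textbook forms of the polynomial and Gaussian kernels; the parameter names `degree`, `c`, and `sigma` are illustrative. For the homogeneous polynomial kernel of degree 2 in two dimensions, an explicit feature map is known, so the kernel value can be compared with the inner product in the feature space.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def poly_kernel(x, z, degree=2, c=0.0):
    """Polynomial kernel; c = 0 gives the homogeneous form."""
    return (dot(x, z) + c) ** degree


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def phi_quadratic(x):
    """Explicit feature map of the homogeneous degree-2 kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly_kernel(x, z)                        # computed in the input space
feature_value = dot(phi_quadratic(x), phi_quadratic(z))  # computed in the feature space

# A non-negative combination of kernels, which is again a kernel.
combined_value = 0.5 * poly_kernel(x, z, c=1.0) + 2.0 * gaussian_kernel(x, z)
```

The two values agree, although `poly_kernel` never constructs the feature vectors. For the Gaussian kernel the feature space is infinite-dimensional, so only the kernel evaluation is practical.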

In the process of color correction, all color spaces are Euclidean spaces. If the data of the source color space is passed through a nonlinear mapping, the mapping can not only convert the nonlinearity of the source color space data into linearity but also provide additional correction information for color correction. Taking Lab data as an example, we use the kernel property above to select a kernel function, and the corresponding nonlinear mapping is given in formula (23).

Formula (23) is very similar to the expansion term of the polynomial in the multiple regression color correction method. The terms of the polynomial in the multiple regression are shown in formula (24).

In work on scanner calibration, it has been pointed out that the average error and standard error of the calibration decrease as the degree and number of terms of the regression polynomial increase. This shows that the data after mapping can provide more correction information than the data before mapping. In addition, from the point of view of the basis functions, it is still an open question whether a global polynomial is the best choice for describing the nonlinear correction, and kernel functions offer a wider choice that includes both local and global kernels.

Because partial least squares regression is performed in the feature space after the kernel function mapping, it can be computed directly by kernel partial least squares regression (Kernel Partial Least Squares Regression, KPLS regression). The steps of KPLS regression can be expressed as: (1) the algorithm initializes the vector $u$; (2) $t = \Phi\Phi^{T}u$, $t \leftarrow t/\|t\|$, where $\Phi$ is the matrix that maps the training data to the feature space; (3) $c = Y^{T}t$, where $Y$ is the output data matrix; (4) $u = Yc$, $u \leftarrow u/\|u\|$; (5) the algorithm repeats steps (2)–(4) until convergence; (6) $\Phi\Phi^{T} \leftarrow (\Phi - tt^{T}\Phi)(\Phi - tt^{T}\Phi)^{T}$, $Y \leftarrow Y - tt^{T}Y$.

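Among them, $u$ and $t$ are the principal components. According to formula (22), step (2) can be changed to $t = Ku$, and step (6) can be changed to $K \leftarrow (I - tt^{T})K(I - tt^{T})$, where $K = \Phi\Phi^{T}$ is the Gram matrix expanded by formula (22). We collect the vectors $u$ and $t$ as the columns of the principal component matrices $U$ and $T$, respectively, and the KPLS regression coefficient can be obtained as $B = \Phi^{T}U(T^{T}KU)^{-1}T^{T}Y$. If $\Phi_{t}$ is the matrix that maps the test data to the feature space and $K_{t} = \Phi_{t}\Phi^{T}$, then the estimated form of the KPLS regression is $\hat{Y}_{t} = \Phi_{t}B = K_{t}U(T^{T}KU)^{-1}T^{T}Y$.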
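A heavily simplified sketch of kernel PLS is given below. It follows the standard formulation of the method from the literature and restricts it to one component and one output column, so the inner iteration converges in a single pass and no deflation is needed. It also assumes centered data and a linear kernel; the helper names and the data are hypothetical, and the sketch is not the implementation used in this paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def normalize(vector):
    norm = math.sqrt(dot(vector, vector))
    return [a / norm for a in vector]


def linear_gram(rows_a, rows_b):
    """Gram matrix of inner products between two sets of samples."""
    return [[dot(a, b) for b in rows_b] for a in rows_a]


def kpls_one_component(gram, y):
    """One KPLS component for a single output column y.

    With one output, u is proportional to y, and t = K u scaled to unit length.
    """
    u = normalize(y)
    t = normalize(matvec(gram, u))
    return t, u


def kpls_predict(gram_new, gram, y, t, u):
    """Prediction K_new u (t^T K u)^{-1} t^T y for one component."""
    scale = dot(t, y) / dot(t, matvec(gram, u))
    return [scale * value for value in matvec(gram_new, u)]


# Hypothetical centered training data with y = 2 * x.
x_train = [[-2.0], [-1.0], [0.0], [1.0], [2.0]]
y_train = [-4.0, -2.0, 0.0, 2.0, 4.0]
x_test = [[3.0]]

gram_train = linear_gram(x_train, x_train)
gram_test = linear_gram(x_test, x_train)
t, u = kpls_one_component(gram_train, y_train)
fitted = kpls_predict(gram_train, gram_train, y_train, t, u)
predicted = kpls_predict(gram_test, gram_train, y_train, t, u)
```

For multi-dimensional outputs or several components, the iteration over $t$, $c$, and $u$ and the deflation of the Gram matrix are required, and a nonlinear kernel would replace `linear_gram` after centering in the feature space.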

#### 4. Allocation of English Distance Teaching Resources based on Deep Reinforcement Learning and Multi-Objective Optimization

In the design of the system platform, unit management and subject management refer to classified management: resources are organized by unit and by subject, and the access of teachers and logged-in users is tracked and guided. For example, depending on the type of user, the information on the platform is either fully or partially open. From the managers’ perspective, classified management means classifying the contents of the English distance teaching resource database of the college English distance teaching resource management platform. It defines the directory structure of the platform and then assigns files to the appropriate categories. On the one hand, when instructors act as managers, it is important to keep monitoring records for English distance teaching resources. On the other hand, as indicated in Figures 3 and 4, it is vital to assess the demand for English distance teaching resources and then compile statistics to better understand how the resources are used.

##### 4.1. Management for Instructors

Figure 5 is a schematic diagram of the directory structure. Only teachers or managers have the right to change it, and the designer can change the directory tree structure at any time, so its nodes are dynamic. English distance teaching resource management in colleges and universities also has some characteristics that are specific to its own field.

According to the characteristics of the remote teaching resource management platform for teaching English in colleges and universities, the storage and management of text-based English remote teaching resources and file-based English remote teaching resources are realized with the help of the database and file system. The storage structure of English remote teaching resources is shown in Figure 6.

The main function of the input design of the system database is to define and input information, that is, to decide what kinds of English distance teaching resources are entered into the system and how. The input design must also pay attention to the quality of system data. In this context, setting the correct search term is very important. If the entered search term is not related to the original data, then no matter how well the system performs, how advanced its technology is, or how appropriate the search tool is, the final search results are likely to be inconsistent with the user’s requirements, which directly affects the functioning of the system. Figure 7 shows the flowchart of the query module of English distance teaching resources.

When the system is used, the first screen shown is the user login module, which requires the user to enter a user name and password. The goal is to verify the user’s identity, increase system security, and prevent unauthorized users from accessing the system. The module is therefore presented as a login interface: a user can log in only after correctly entering both required fields, the user name and the password, and the two identities of administrator and ordinary user are used to distinguish usage rights. The design interface of this module is shown in Figure 8.
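The login check and the two-role distinction can be sketched as follows. This is only an illustration of the idea and not the platform’s actual implementation: the role names, the in-memory user table, and the function names are hypothetical, and salted PBKDF2 hashing is one common choice for storing passwords.

```python
import hashlib
import hmac
import os

ROLES = ("administrator", "ordinary_user")


def hash_password(password, salt):
    """Derive a salted hash so that plain-text passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(name, password, role):
    if role not in ROLES:
        raise ValueError("unknown role: " + role)
    salt = os.urandom(16)
    return {"name": name, "salt": salt,
            "hash": hash_password(password, salt), "role": role}


def login(users, name, password):
    """Return the user's role on success, or None if the credentials are wrong."""
    user = users.get(name)
    if user is None:
        return None
    candidate = hash_password(password, user["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if hmac.compare_digest(candidate, user["hash"]):
        return user["role"]
    return None


def can_manage_resources(role):
    """Only administrators may change resources in this sketch."""
    return role == "administrator"


# Hypothetical accounts.
users = {
    "admin": make_user("admin", "correct horse", "administrator"),
    "student": make_user("student", "battery staple", "ordinary_user"),
}
admin_role = login(users, "admin", "correct horse")
student_role = login(users, "student", "battery staple")
rejected = login(users, "student", "wrong password")
```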

Figure 9 is a schematic diagram of the resource allocation of English distance teaching based on deep reinforcement learning and multi-objective optimization, which is expressed in the form of simulation.

The English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper is simulated and evaluated in MATLAB, and the results are shown in Table 1.

From the above research, we can see that the English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper can effectively improve the effect of English distance teaching resource allocation.

#### 5. Conclusion

At present, the main form of online teaching is the sharing and use of teaching resources. However, the sharing and use of teaching resources depends on a unified management platform that integrates, classifies, and manages teaching resources on the network. The research significance of a teaching resource file management system is therefore that it provides basic support for network teaching and contributes to the construction of educational informatization. In recent years, as the Internet has become more widespread, network teaching resources have accumulated, their use has become increasingly diversified, and research on the storage, classification, and utilization of teaching resource files has increased. At the moment, China is still in the early stages of developing educational resource pools. Many valuable teaching materials have not been successfully integrated and utilized, and the dissemination of teaching resources is excessively dispersed and fragmented. To some degree, this obstructs the promotion and growth of online education. In this paper, deep reinforcement learning and multi-objective optimization algorithms are applied to the allocation of English distance teaching resources, and an English distance teaching resource management system is constructed. The research shows that the English distance teaching resource allocation method based on deep reinforcement learning and multi-objective optimization proposed in this paper can effectively improve the resource allocation effect of English distance teaching.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the Research Project of Teaching Reform in Colleges and Universities in Hunan Province: Research and Practice of Online Teaching Reform of Advanced English Course based on Flexible Teaching and Active Learning (AS2021), and by the Research Project of Science in Colleges and Universities in Hunan Province: An Empirical Study on the Formative Assessment of Students’ English Writing based on the Chinese English Proficiency Scale (21C0797).