Abstract
To allocate teaching resources rationally in English teaching, a teaching resource optimization and allocation management method is proposed based on a convolutional neural network (CNN) and Arduino devices. A 9-layer CNN classification and recognition model and an English education resource library for Arduino devices are constructed and applied to the design of an Arduino device recognition program, which realizes the rational optimization and allocation of teaching resources. Simulation results show that the recognition accuracy of the proposed method is over 90% for Arduino devices and over 80% in real English teaching scenarios, which indicates that the proposed method has practical application value. Moreover, the method introduces a new mode of interaction between English learners and English teaching resources, which contributes to the optimization and allocation of these resources and offers a new approach to integrating them.
1. Related Work
With economic globalization, communication and connections between countries have become increasingly close. As an international language, English is widely used around the world and has become one of the main languages for communication between countries. To promote this communication, China has launched a nationwide English education campaign. However, due to regional differences, English education resources in China are unevenly allocated. In recent years, online education has alleviated the uneven allocation of English teaching resources to a certain extent. Ronkowitz Kenneth et al. argued that integrating online courses into higher education helps balance the allocation of educational resources, which plays a positive role in promoting the education system [1]. SerranoSolano Beatriz et al. took Galaxy as an e-learning platform and realized accessible online education by providing a library of high-quality, community-maintained training materials [2]. On this basis, data, tools, and other teaching resources are easy to obtain, and learners' teaching resources can be shared. Çoban Atakan applied the Algodoo program to the teaching and evaluation of physics courses and designed an application that draws students' attention to the course, which improves learners' motivation and facilitates the sharing of educational resources [3]. Zhang Rongbo and Afrouz Rojan et al. constructed a paradigm for online education evaluation models by analyzing the application of the current scientific paradigm. The new education concept has promoted the development of a new paradigm, which builds a series of educational evaluation models at the macro, meso, and micro levels and plays a positive role in research on related fields [4, 5]. Ren T and Kim Jihyun et al. improved the ease of use of online networks and further optimized teaching resources with the aid of artificial intelligence (AI) technology [6, 7]. Arnab Kundu reviewed the role of self-efficacy in online education and put forward an overall framework for strengthening participants' self-efficacy. With this framework, online education becomes more effective, participants' self-efficacy is easier to improve, and the rational allocation of teaching resources is realized. The results show that online education plays a positive role in the optimization and allocation management of teaching resources [8].
However, the above studies focus only on the preliminary allocation and management of educational resources and do not discuss the problem in depth. To address this, a teaching resource optimization and allocation management method for English teaching is proposed based on the CNN and Arduino devices, so as to study the optimization and allocation of teaching resources in greater depth.
2. Introduction to CNN
CNN is a classification and recognition algorithm inspired by biological vision, which is often used for the classification and recognition of images, texts, audio signals, and other data. Its basic structure is shown in Figure 1 [9].

The input layer is used to preprocess input data. The convolution layer is responsible for learning and extracting sample features from the input data and for filtering out invalid features; its mathematical description is shown in formula (1). The pooling layer includes average pooling and maximum pooling operations, as shown in formulas (2) and (3), respectively, which reduce the amount of data to avoid overfitting of the model. The full connection layer is the output layer of the CNN, which acts as a classifier and outputs the final classification results. Its mathematical description is shown in formula (4).
In formula (1), x and y are the input and output image features; w stands for the two-dimensional convolution kernel; b represents the offset term; and f(•) is the activation function.
In formula (2), t represents the rank threshold of the activation values. In addition, the pooling domain Rj belongs to the jth feature map, and the index i of each activation value lies in Rj; ri and ai represent the rank and the activation value of element i, respectively. In formula (3), x and y represent the input feature from the convolution layer and the pooled output feature, respectively; β and b represent the multiplicative and additive bias terms, respectively; and down(•) stands for the pooling function. In formula (4), y represents the fully connected output; f(•) represents the activation function; x represents the input of the full connection layer; w stands for the weight; and b is the offset term.
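For reference, under the symbol definitions above, formulas (1) to (4) are commonly written in the following form. This is a sketch under assumptions rather than a definitive statement of the original notation: the symbols $w$, $s_j$, $r_i$, and $a_i$ are assumed names, and formula (2) is written as a rank-based average over the $t$ highest-ranked activations in the pooling domain, which is one reading consistent with the description.

```latex
\begin{align}
y   &= f(w * x + b) \tag{1} \\
s_j &= \frac{1}{t} \sum_{i \in R_j,\; r_i \le t} a_i \tag{2} \\
y   &= \beta \, \mathrm{down}(x) + b \tag{3} \\
y   &= f(w x + b) \tag{4}
\end{align}
```

Here $*$ in (1) denotes two-dimensional convolution, while $w x$ in (4) is an ordinary weighted sum.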
CNN usually adopts the cross entropy function as the loss function, and its calculation formula is as follows [10, 11]:
$$H(p, q) = -\sum_{x} p(x) \log q(x)$$
where p and q are the actual and predicted distributions, respectively. The lower the value, the closer the predicted value is to the actual value and the better the model performance.
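CNN is characterized by parameter sharing and translation invariance, and its nonlinearity is mainly introduced by the activation function [12, 13]. The activation functions mainly take the following forms [14, 15]:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad \mathrm{ReLU}(x) = \max(0, x)$$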
Among them, the sigmoid and tanh functions are prone to the vanishing gradient ("gradient dispersion") problem and are therefore more suitable for shallow networks. The ReLU function, in contrast, passes the gradient to each layer of the network almost without loss, so that the weights of every layer can be trained [16]. Therefore, the ReLU function is adopted as the activation function of the CNN model. For image recognition tasks, the parameter sharing and translation invariance of the CNN effectively reduce the nonessential parameters of the network and retain the important ones, so that the network achieves a better learning effect.
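The difference between these activation functions can be illustrated with a small numerical sketch. The code below is an illustration only, not the model used in this paper: it computes the local derivative of each activation and, as a deliberately crude proxy that ignores the weights, raises it to the power of the network depth. The helper name `chained_gradient` is hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x: float) -> float:
    """Derivative of tanh; its maximum is 1.0 at x = 0."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, x: float, depth: int) -> float:
    """Hypothetical helper: product of `depth` identical local derivatives.

    This ignores the weight matrices, so it only hints at how the
    gradient scales with depth for a given activation.
    """
    return grad_fn(x) ** depth


if __name__ == "__main__":
    for name, fn in [("sigmoid", sigmoid_grad), ("tanh", tanh_grad), ("relu", relu_grad)]:
        print(name, chained_gradient(fn, 1.0, 10))
```

Under this simplification the sigmoid factor shrinks geometrically with depth, while the ReLU factor stays at 1 for positive inputs.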
3. Optimization and Allocation of English Teaching Resources
3.1. Overall Design
Based on the above analysis, the method of optimizing and allocating English teaching resources is divided into two parts: the first is to design a recognition program using the CNN; the second is to collect and organize existing English education resources to build a resource library for Arduino devices. Figure 2 shows the process.

3.2. Design of Arduino Recognition Program Based on CNN
3.2.1. Design of Recognition Program
The basic idea of the CNN-based Arduino recognition program is to collect a data set composed of images of Arduino devices and use it to train the CNN, so as to obtain an optimal CNN recognition model whose output value is closest to the actual value. Then, the model is used to classify and recognize the samples to be tested and to output their categories. The specific idea is shown in Figure 3.

According to the characteristics of Arduino devices, the preliminary CNN model is set up with two convolution layers and three full connection layers. Among them, the last full connection layer adopts the Softmax function, as shown in the following formula:
$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
where $z_i$ represents the output value of node $i$, and $C$ represents the number of output nodes. When the exponential overflows, the Softmax function can be stabilized by subtracting the maximum value D from each output value.
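The overflow handling mentioned above can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the outputs of the last layer are available as a list of floats; the function name `softmax` is chosen for the example and is not taken from the recognition program.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Numerically stable Softmax over a list of output values.

    Subtracting the maximum value d leaves the result unchanged
    (the factor exp(-d) cancels in the ratio) but keeps every
    exponent at or below zero, so math.exp cannot overflow.
    """
    d = max(z)
    exps = [math.exp(v - d) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # math.exp(1000.0) alone would raise OverflowError.
    print(softmax([1000.0, 1001.0, 1002.0]))
```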
3.2.2. Optimization of Recognition Program
To explore a more reasonable CNN structure and improve the classification and recognition performance of the model, the network structure of the preliminary CNN model is optimized: one convolution layer and one pooling layer are added, and one full connection layer is removed. The steps of constructing the optimized CNN are as follows:
(1) Convert the input images to meet the input requirements of the CNN. Assume that the initial threshold of the image gray value is T, the part greater than T is G1, and the part less than T is G2. Calculate the normalized gray histogram ht and the normalized cumulative histogram of G1 and G2 from the number of pixels at each gray level, where M and N are the numbers of pixel rows and columns, respectively. Then calculate the gray means μ1 and μ2 of G1 and G2, and use them to update the threshold T.
(2) Set the depths of the convolution kernels of the three convolution layers to 64, 128, and 256, respectively. In addition, both the height and the width of the kernels are set to 7, and the stride is 1.
(3) Adopt maximum pooling and obtain the final classification results with the Softmax regression model.
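Step (1) can be sketched as follows. The code is an illustration under stated assumptions, not the preprocessing code of this paper: it assumes the standard mean-based iterative rule, in which the threshold is repeatedly moved to the midpoint of the two class means, and the function name, the `tol` and `max_iter` parameters, and the choice of the global mean as the initial threshold are all hypothetical.

```python
from typing import List


def iterative_threshold(pixels: List[int], levels: int = 256,
                        tol: float = 0.5, max_iter: int = 100) -> float:
    """Mean-based iterative threshold selection on a flat list of gray values.

    Assumption: the threshold is updated to the midpoint of the mean
    gray values of the two parts it separates. Raw pixel counts are
    used because the 1/(M*N) normalization cancels in the means.
    """
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)

    # Assumed initial threshold: the global mean gray value.
    t = sum(k * c for k, c in enumerate(counts)) / total

    for _ in range(max_iter):
        n_low = sum(c for k, c in enumerate(counts) if k <= t)
        n_high = total - n_low
        if n_low == 0 or n_high == 0:
            break  # all pixels fall on one side; nothing to separate
        mu_low = sum(k * c for k, c in enumerate(counts) if k <= t) / n_low
        mu_high = sum(k * c for k, c in enumerate(counts) if k > t) / n_high
        new_t = (mu_low + mu_high) / 2.0
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50  # toy two-level image, flattened
    print(iterative_threshold(image))
```

Pixels above the returned threshold can then be mapped to one value and the rest to another before the image is resized to the CNN input shape.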
With these steps, the optimized CNN model is constructed. The data set is then input into the constructed CNN model for iterative training to obtain the optimal classification and recognition model.
During model training, the loss function is used to describe the performance of the model. When the value of the loss function is large, there is an obvious gap between the output value of the model and the actual value. At this time, the back propagation algorithm should be used to adjust the model parameters and continue training the model. When the loss function value is small and the iteration termination condition is met, it indicates that the model has reached the best state. At this point, the model is the best classification recognition model, and the model is saved and output. Figure 4 shows the training process of the optimized CNN model.
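The control flow described above (compute the loss, back-propagate while the loss is large, stop when the loss is small or the iteration limit is reached) can be sketched on a toy problem. The example below is not the CNN of this paper: a one-weight logistic model stands in for the network so that the loop stays self-contained, and the names `train`, `lr`, and `loss_target` as well as their default values are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def train(samples: List[Tuple[float, int]], lr: float = 0.5,
          max_iters: int = 6000, loss_target: float = 0.05):
    """Gradient descent with a loss-based termination condition.

    Returns (w, b, loss, iterations_used). The loss is the mean binary
    cross entropy between the labels y and the predictions q.
    """
    w, b = 0.0, 0.0
    loss = float("inf")
    n = len(samples)
    for step in range(1, max_iters + 1):
        loss, grad_w, grad_b = 0.0, 0.0, 0.0
        for x, y in samples:
            q = 1.0 / (1.0 + math.exp(-(w * x + b)))
            q = min(max(q, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss += -(y * math.log(q) + (1 - y) * math.log(1.0 - q))
            grad_w += (q - y) * x
            grad_b += q - y
        loss /= n
        if loss < loss_target:
            # Termination condition met: keep the current parameters.
            return w, b, loss, step
        # Loss still large: adjust the parameters and continue training.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b, loss, max_iters


if __name__ == "__main__":
    data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
    print(train(data))
```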

3.2.3. Construction of English Education Resource Library for Arduino Devices
Arduino is easy to learn, and Arduino products can be adapted to different environments by connecting different sensors, which helps immerse learners in their learning [17]. At present, Arduino learning resources are abundant. To achieve targeted learning and training for learners, English education video resources are collected from the Internet and sorted, and an English education resource library for Arduino devices is built from them. When the CNN successfully identifies an Arduino device, the Arduino recognition program automatically pushes links to the relevant English education resources in the resource library to learners. Learners can click the links to open and view the corresponding English learning resources in the browser.
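The link-pushing step can be sketched as a simple lookup from the recognized device label to a list of resource links. Everything concrete in this sketch is hypothetical: the label strings, the `example.com` URLs, the `recommend` function, and the `min_confidence` cut-off are placeholders and do not describe the actual resource library.

```python
from typing import Dict, List

# Hypothetical resource library: device label -> English learning links.
RESOURCE_LIBRARY: Dict[str, List[str]] = {
    "arduino_uno": ["https://example.com/english/arduino-uno-intro"],
    "ultrasonic_module": ["https://example.com/english/ultrasonic-module"],
}


def recommend(label: str, confidence: float,
              min_confidence: float = 0.8) -> List[str]:
    """Return the links for a recognized device.

    An empty list is returned when the classifier is not confident
    enough (assumed cut-off) or the label has no entry in the library.
    """
    if confidence < min_confidence:
        return []
    return RESOURCE_LIBRARY.get(label, [])


if __name__ == "__main__":
    for link in recommend("arduino_uno", confidence=0.93):
        print(link)
```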
4. Simulation Experiment
4.1. Construction of Experimental Environment
This experiment is carried out on the Windows 7 operating system. The TensorFlow deep learning framework is used to construct the CNN classification and recognition model, and the model is implemented in Python.
4.2. Data Sources and Preprocessing
In this experiment, images of 10 types of Arduino devices in the English education resource library, collected on-site, are selected as experimental data, including the Arduino UNO board, L298N drive board, Hall sensor, active buzzer, rocker (joystick), serial wireless transparent transmission module, PIR human body sensor, potentiometer, ultrasonic module, and LCD [18, 19]. For each device, 500 images are collected, giving 5000 images in total.
The CNN learns and extracts features directly from the data, but the number of images collected in this experiment is still small. To enlarge the sample and improve the applicability of the model, the experimental data set is expanded. Firstly, more sample images are obtained by searching for similar images on the Internet; then, the sample images are further expanded by changing the viewing angle and rotating the images. After this processing, 8200 sample images are obtained.
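The rotation-based expansion can be sketched on a toy image stored as a nested list. This is an assumption-laden illustration rather than the augmentation code used in the experiment: the exact angles and transformations are not specified above, so the sketch uses the three right-angle rotations plus a horizontal flip, and all function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of gray values


def rotate90(img: Image) -> Image:
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def augment(img: Image) -> List[Image]:
    """Return the 90, 180, and 270 degree rotations and a horizontal flip."""
    variants = []
    current = img
    for _ in range(3):
        current = rotate90(current)
        variants.append(current)
    variants.append(flip_horizontal(img))
    return variants


if __name__ == "__main__":
    sample = [[1, 2], [3, 4]]
    for variant in augment(sample):
        print(variant)
```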
4.3. Evaluation Indicators
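Generally, accuracy, precision, recall, and F1 value are selected as evaluation indicators of the performance of classification models. In this experiment, only accuracy is selected as the indicator to evaluate the classification performance of the model. The calculation methods are as follows [20]:
$$\mathrm{Accuracy} = \frac{TP + TN}{P + N}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
Here, TP and TN represent true positives and true negatives; P and N represent all positive and all negative samples; and FP and FN represent false positives and false negatives.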
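As a worked example of these indicators, the following sketch computes them from the four confusion counts. The function names are chosen for the example, and the counts in the demo are made-up numbers for illustration only, not results from this experiment.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """(TP + TN) divided by all samples; P + N equals TP + FN + TN + FP."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that are recognized."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Illustrative counts only.
    print(accuracy(45, 40, 5, 10), precision(45, 5), recall(45, 10), f1_score(45, 5, 10))
```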
4.4. Parameter Settings
CNN parameters include the number of network layers, the number of iterations, the batch size, and the learning rate, whose settings have a great influence on the recognition accuracy of the CNN model [21, 22]. To ensure the recognition accuracy of the model as far as possible, the values of these parameters are determined experimentally through debugging.
4.4.1. Number of Network Layers
The more network layers a CNN has, the more complex the model structure and the more feature information it extracts. However, too many network layers greatly increase the model training time and may cause overfitting [23]. The fewer the network layers, the simpler the model structure and the shorter the training time. However, with too few network layers the number of extracted features is limited, and the error function is prone to nonconvergence or to falling into a local optimum, resulting in low final recognition accuracy [24, 25]. To determine the number of network layers, models with 5, 6, 7, 8, 9, and 10 layers are trained, and the results are shown in Table 1. When the CNN has 9 layers, the recognition accuracy is the highest. When the number of network layers is less than or greater than 9, the model cannot achieve the ideal effect. Therefore, the CNN model in this experiment is set to 9 layers.
4.4.2. Number of Iterations
Table 2 shows the accuracy of the model with different numbers of iterations. When the number of iterations is 6000, the accuracy of the model is the highest, reaching 86.32%. With more or fewer than 6000 iterations, the accuracy is lower. Thus, as the number of iterations increases, the recognition accuracy of the model first increases to a critical value and then decreases, and the critical value in this experiment is 6000. Therefore, the number of iterations of the CNN model in this experiment is set to 6000.
4.4.3. Batch Size
Table 3 shows the recognition accuracy of the model with different batch sizes. As the batch size increases, the recognition accuracy of the model increases gradually, while the size of the accuracy gain first grows and then shrinks. Once the batch size reaches 100, the accuracy of the model no longer increases significantly and tends to be stable. Therefore, the batch size of the CNN model is set to 100 in this experiment.
4.4.4. Learning Rate
If the learning rate is too high, the model may fail to converge. However, if the learning rate is too low, the optimization speed is reduced. The accuracy of the model is observed under different learning rates, and the results are shown in Table 4. When the learning rate is 0.01, the accuracy of the CNN model is the highest, which is 72.36%. When the learning rate is less than or greater than 0.01, the accuracy drops considerably. Therefore, the learning rate of the CNN model is set to 0.01 in this experiment.
Through the above experiments, the parameters of the optimized CNN model are finally set as follows: 9-layer network structure, 6000 iterations, batch size of 100, and learning rate of 0.01.
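The tuning procedure in this section, in which one parameter is varied at a time while the others are held fixed, can be sketched as follows. This is a minimal illustration, not the tuning script of this experiment: `sweep_one_at_a_time` is a hypothetical helper, and `evaluate` is assumed to be a caller-supplied function that trains a model with the given configuration and returns its accuracy.

```python
from typing import Callable, Dict, List

Config = Dict[str, float]


def sweep_one_at_a_time(base: Config, grid: Dict[str, List[float]],
                        evaluate: Callable[[Config], float]) -> Config:
    """Tune parameters sequentially, keeping the best value of each.

    For every parameter in `grid` (in insertion order), each candidate
    value is scored with the other parameters fixed at their current
    best values; the highest-scoring candidate is then kept.
    """
    best = dict(base)
    for name, candidates in grid.items():
        scores = {}
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            scores[value] = evaluate(trial)
        best[name] = max(scores, key=scores.get)
    return best
```

A sequential sweep of this kind is cheaper than a full grid search, but it can miss interactions between parameters, since each value is chosen with the others held at their current setting.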
4.5. Experimental Results
4.5.1. Model Verification
To verify the effect of the proposed optimization of the CNN structure, the recognition accuracy of the model on the English education resource library of Arduino devices is compared before and after optimization. Table 5 shows the recognition results of the model before CNN structure optimization, and Table 6 shows the recognition results after optimization. As can be seen from Table 5, the recognition accuracy of the CNN before optimization is already high across device images: the accuracy for larger devices, such as the Arduino UNO board and the L298N driver board, exceeds 90%, and the accuracy for smaller devices, such as the active buzzer, exceeds 80%. Therefore, the CNN-based recognition method for Arduino devices is effective and can accurately identify Arduino device images in different scenarios, which is conducive to the reasonable optimization and management of English education resources. As can be seen from Table 6, the overall recognition accuracy of the model after CNN structure optimization is 90% or more for Arduino device images, and the recognition accuracy for the Arduino UNO board, LCD, and PIR human body sensor is more than 95%. Compared with the model before optimization, the optimized CNN model has a better recognition effect. Therefore, the optimization of the CNN model is effective and improves the recognition accuracy of the model.
4.5.2. Method Verification
To verify the application effect of the proposed method in an actual English teaching environment, a head-down ("bow") alarm based on the proposed method is constructed for English teaching activities. The alarm sounds when a learner's head is lowered or close to the desk. At the beginning of the experiment, 30 students scan Arduino devices through a mobile phone mini program, and the mini program recommends English learning resources to the students according to the scanning results for self-study. Table 7 shows the usage of the recognition program. As can be seen, the recognition program of the proposed method also achieves a high recognition accuracy in real English teaching scenarios, at more than 80%. Compared with the laboratory test environment, the recognition success rate is lower, but it still meets the design requirements and has practical value.
5. Conclusion
In conclusion, the CNN classification model has 9 network layers: 1 input layer, 3 convolution layers, 3 pooling layers, and 2 full connection layers. On this basis, the proposed optimization and allocation method of English teaching resources realizes the recognition of Arduino devices and the recommendation of English teaching resources. Moreover, it introduces a new way of interaction between English learners and English teaching resources, which promotes the optimization and management of these resources. The recognition accuracy of the proposed method is more than 90% for Arduino devices and more than 80% in real English teaching scenarios, which shows that the method has practical application value and is helpful for the optimization and allocation management of English teaching resources. It thus offers a new approach to integrating English teaching resources. This paper takes deep learning as the main line and the optimal allocation of teaching resources as the carrier and applies them to the field of teaching, so as to provide a new reference for the informatization of teaching points.
Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding this work.