Video production has recently become a key challenge for researchers in online education. Because earlier production methods yield poor video quality, this study combines the microclass with the MOOC video production method to improve the production effect. Following the MOOC concept, the video development team prepares the script and course type during the preliminary design and shooting preparation stages and formulates the recording process. In the recording stage, all classroom teaching activities are recorded from multiple camera positions to enhance the sense of presence of the picture. In postproduction, film and television editing techniques and special-effects processing are applied to improve the result. A shot assembly method based on the hidden Markov model ensures the continuity of adjacent shots, after which the online open video is completed and released to the cloud. The experimental results show that the method is effective in application, with an average satisfaction of 91.36%.

1. Introduction

At present, driven by the concept of global sharing of open education resources, massive open online courses (MOOCs) are developing rapidly. As a new online teaching mode, the MOOC has attracted great attention from scholars at home and abroad. It emphasizes the careful design and development of teaching resources. In particular, the MOOC video course, developed with a variety of design, thinking, and expression techniques, breaks with the traditional classroom teaching video, constructs a teaching form of face-to-face conversation between teachers and online learners, and shortens the spatial distance between teachers and distance learners [1]. From traditional distance education to today’s online open education, knowledge dissemination has been inseparable from its carrier media. Compared with monotonous documents and audio, video, which has both image and sound, can interpret the teaching content more intuitively and vividly, and it has driven a wave of teaching video construction at home and abroad. In recent years, with the rapid rise of video open courses at well-known foreign universities, TED, Khan Academy, and other institutions have launched their own video lectures and teaching videos, which are popular with many online learners. At the same time, online education platforms such as OpenCourseWare Alliance, Udacity, Coursera, and edX have also developed a large number of distinctive video courses. These platforms redesign the course itself: a complete course is divided by knowledge point into several short videos of 5 to 15 minutes, designed jointly by teachers and professional teams, so that the traditional classroom becomes an online classroom for online learners. This has set off a new upsurge in the construction of online video courses abroad.
Under this influence, China has also launched large-scale construction of online video courses, with the emergence of online video courses based on excellent resource sharing courses and Video Open Courses of Chinese University [2, 3]. However, a look at existing video course resources at home and abroad shows that, compared with foreign MOOCs, which follow a learner-centered concept of video course resource development, online video courses in China lag considerably in presentation and design thinking. The development of video courses still remains at the level of recording face-to-face classroom teaching. In terms of video quality, the picture and sound quality of online video courses is uneven, and the courses generally last about 30–50 minutes, which makes it hard for online learners to sustain visual attention and retain knowledge and is unsuitable for fragmented learning on mobile terminals. In terms of video expression, the “big head picture” of the lecturer is a relatively serious problem [4], the guidance of on-screen visual attention is insufficient, the information design is inadequate, the course content is repetitive and lengthy, and the lecturer’s on-camera performance is weak. The main reason is the lack of careful design of course content, teacher image, and visual packaging of screen information in the development of online video courses.

A microclass is a video and audio segment of no more than five minutes that gives a targeted explanation of one knowledge point, produced with multimedia technology according to instructional design principles. The knowledge point can be an explanation of teaching materials or question types, an introduction of materials, or a summary of test points; it can also be an explanation and demonstration of teaching experience and methods. A microclass is thus a video course designed for a specific subject, knowledge point, or teaching link. Because it is targeted and single-topic, its video production method varies widely. Only after the video producers and teachers have communicated fully and understood the teaching needs can they choose an appropriate shooting method and produce effective teaching videos (if teachers make the videos themselves, this problem does not arise). Generally speaking, the teaching video of a microclass is no longer than 10 minutes, and there may be demand for mass production on different topics, so the “electronic green board” and “EFP multicamera shooting” are well suited to microclass teaching videos. The “electronic green board” is simple to operate, and teachers can work independently. Teachers know their students best and know how to present in front of the camera so that students absorb the material effectively, which is where the “electronic green board” shows its advantage. “EFP multicamera shooting” suits real-scene content that cannot be replaced by virtual objects, such as experiments and sports courses; shooting from multiple cameras and angles misses no detail and presents the specific knowledge completely, in line with the targeted, single-topic character of the microclass. Although the microclass is simple, it need not be a crude video. To produce excellent microclass courses, a “virtual studio” is also an option, but more time must be spent on scripts and scene breakdowns.

In producing teaching videos, the first point to recognize is that a school is not a TV station: it is far inferior to a professional TV station in shooting expertise, budget, production manpower, and other respects. If video producers demand support beyond the school’s hardware or the teachers’ expertise, or if the school tries to match the professional equipment of a TV station, the results will not be good. Therefore, recognizing each other’s needs and understanding each other’s differences is key for film and television producers and schools making teaching videos, and it is the basis on which the professional film and television community and the education community adapt to each other. With this understanding, the two sides will not simply pursue better hardware and equipment while failing to produce teaching videos that meet the school’s teaching needs. The traditional video class depends mainly on the teacher’s explanation, is difficult to modify, and offers fixed, closed resources. By contrast, microclasses and MOOCs are rich in resources, easy to search, easy to disseminate, and widely used.

Video cutting is essential in video production. It is a video editing method that crops the video material to simulate multiview effects, with the aim of generating multiview video from a limited number of views. Images from a fixed-position camera usually cannot provide enough visual interest or guidance for the viewer. Kong et al. applied virtual photography to lecture scenes shot with a fixed lens: following photography rules, they used computer vision and signal processing methods to select appropriate shots and used image synthesis to generate new images from the original material, producing a variety of shots and visual effects that guide the audience’s attention and maintain visual interest. Later, scholars extended this method to live broadcasts and other activities [5]. Ren et al. studied automatic editing methods that generate multiview and multishot video from single-view video. Their main idea is to capture the whole field of view of the event with a single fixed camera and to simulate the camera’s pan, tilt, and zoom movements by cropping and scaling the original video material. In this method, L1-regularized optimization is used to calculate the composition of each clip, and editing practice is encoded as a single cost function, so that the editing problem is transformed into an optimization problem. In addition, there is cropping based on eye movement [6]. Cabo et al. studied a method that finds the cropping position and calculates the cropping size from the viewer’s eye-tracking data, which can likewise simulate camera motions such as pan, tilt, and zoom [7]. Zhu and Zhou further optimized the algorithm for the moving path of the cropping window to obtain better results. This kind of automatic video editing places strict requirements on the material, which must be recorded by static wide-angle cameras. In a sense, cropping-based editing trades the completeness of the material for simulated camera motion and shot switching [8].

Therefore, to improve the effect of video production, this study combines the microclass with the MOOC video production method. The aim is to shift teachers’ course design thinking from the traditional classroom to the online classroom and to adapt classroom video shooting techniques to the fragmented learning needs of the mobile Internet era, which is of practical significance for developing high-quality video courses and solving their existing problems. The proposed study can help researchers devise new solutions in this area.

The study is organized as follows: Section 2 presents the MOOC-based video production method of the microclass combined with MOOC. Section 3 reports the results of the study. Section 4 concludes the study.

2. MOOC-Based Video Production Method of Microclass Combined with MOOC

The following subsections show the details of this section.

2.1. MOOC-Based Video Development of the Microclass Combined with MOOC
2.1.1. General Process of Video Development

In recent years, with the development of multimedia technology, the methods and means of developing video teaching resources have diversified, and the video production process has been simplified. People pay more attention to shooting technology for video courses and neglect video production. As a result, most videos on current network platforms have many shortcomings, and the video course development process is increasingly disordered. Because the network video course evolved from traditional educational TV, its development process can build on the production mode of educational TV and form its own development mode [9]. The development of MOOC-based microclass video combined with MOOC focuses on learner-centered design, so video design is particularly important. The early-stage design fits the teaching content to the learners, and the later-stage design fits the video form to the teaching content, while diversified recording schemes and production means make the video form richer. Therefore, this study constructs the development process of MOOC-based microclass video combined with MOOC, shown in Figure 1 as a generic flowchart of the steps of video development.

MOOC-based microclass combined with MOOC video development includes the preliminary design stage, shooting preparation stage, recording stage, and postproduction stage to complete the process of video production and release to the cloud. Among them, the video postproduction stage is crucial.

2.1.2. Recording Process of Video Development

(1) Shooting. Formulating the video recording process is an important task before shooting. A relatively complete professional recording process for video courses improves both the recording quality and work efficiency [10]. The video development team should prepare the script and course type according to the preliminary design stage and formulate the recording process. In the MOOC-based microclass video production method, as in the production of excellent resource sharing courses, production relies mainly on shooting. A typical professional-level video recording setup consists of cameras, a switcher, an audio mixing system, the teacher’s computer, a display screen, and a nonlinear editing system, as shown in Figure 2. The cameras, the audio mixing system, and the teacher’s computer are all connected to the switcher. The video recorded through the switcher is then imported into the nonlinear editing system for postediting. At the same time, the teacher’s computer is connected to the display screen to present the content of the computer screen.

In the process of video recording, the arrangement of the cameras is the key link. Depending on the type of video course, one, two, or three cameras can be used. Single-camera shooting: the camera should face the teacher from the front and record the whole lecture [11]. A medium or close shot can be used to highlight the teacher. This method gives viewers the feeling that the teacher is lecturing to them directly, rather than that they are watching the teacher lecture to others. Its disadvantages are that the single shot scale easily causes visual fatigue over long viewing, the atmosphere of classroom interaction between teachers and students is lost, and later editing is more difficult because joining shots of the same scale produces jump cuts. Two-camera shooting: camera No. 1 takes a panoramic view from the front, and camera No. 2 takes close-ups of the teacher lecturing or of students listening and answering questions, as well as the blackboard and display screen; the shots can be switched among panoramic, medium, and close-up views according to the teaching content [12]. This method records all classroom teaching activities completely. The picture has a strong sense of presence, the shot scales are rich, and the shots join smoothly, which better matches human visual thinking. However, care must be taken when joining shots from the two camera positions to avoid crossing the axis. Three-camera shooting: generally, one camera takes the panoramic position in the middle of the classroom, and one camera is placed on each side. This method records every aspect of the teaching process, with rich shot scales, a large amount of information, and high flexibility, but it requires a large investment. When the development conditions of an online video course are limited and no switcher is available, one camera can be used to capture the content of the display screen, and the content of the corresponding period can be inserted in later editing according to the teaching rhythm. The teacher’s presentation can also be exported directly as video or images by software and edited in at the corresponding time points to form a complete video course that follows the learning logic.

(2) Screen Recording. Producing a video by screen recording is simple and requires few personnel. Generally, a computer with screen recording software, an external drawing tablet, and a camera (which may be the computer’s integrated camera) are sufficient to produce screen-recorded video courses. These courses fall into two types: slides only and slides plus teacher image. Slides-only video courses contain the slide pictures and the teacher’s voice but no image of the teacher or students [13]. Teachers can use the drawing tablet to write drawings or detailed derivations on the PPT courseware and guide students’ attention and thinking, as in the video courses of Khan Academy. Slides-plus-teacher-image videos generally require an external or integrated camera connected to the computer: teachers record the slides with screen recording software while the camera captures their own image, and the teacher image and slides are composited in postproduction as needed. Slides-only recordings easily bore students, whereas an occasional teacher image draws their attention back to learning and improves learning efficiency.

2.1.3. Video Editing

The postediting of a video course imports the video materials into nonlinear editing software and applies film and television editing techniques and special-effects processing to improve the effect of video production. The video editing process is shown in Figure 3.

As the figure shows, the video and media materials are prepared first, the chosen nonlinear video editing software is opened, the project file is created, and the project parameters are set according to the technical standards for video course production. Second, the materials are imported into the video and audio tracks for editing and special-effects processing [14], and relevant media materials are inserted during editing as the content requires. After the first draft is edited, the instructor and director preview, analyze, and evaluate the video course, and the unsatisfactory parts are edited again until the desired effect is achieved.

2.2. Video Production Method of Shot Assembly Based on the Hidden Markov Model

Shot assembly refers to the logical connection of video images to describe an event. In existing work, automatic shot assembly is mostly organized along the timeline of the event, which suits stories and documentaries but cannot handle video material without an obvious temporal relationship [15]. In contrast to traditional shot assembly based on the natural timeline, this study proposes an artificial timeline based on the logic of information arrangement, which connects the subshots logically for the purpose of display. At the same time, a global similarity constraint is established to improve the richness of video information. In combining shots, the editing rules for coherence and industry experience are encoded, and a hidden Markov model is established to ensure the coherence of adjacent shots.

2.2.1. Introduction of the Hidden Markov Model

The hidden Markov model is a probabilistic graphical model and the simplest dynamic Bayesian network. Its standard form includes a group of state variables and a group of observation variables [16]. The state variables $\{q_1, q_2, \ldots, q_T\}$ are usually assumed to be hidden and unobservable, where $q_t$ represents the state at time $t$; the observed variables are $\{o_1, o_2, \ldots, o_T\}$, where $o_t$ represents the observed value at time $t$. The state at time $t$ is determined only by the state at time $t-1$ and is independent of the states at earlier times. The observation at time $t$ depends only on the state at that time and is independent of the states and observations at other times. Each observation takes a value in the observation space $V = \{v_1, v_2, \ldots, v_M\}$, and each hidden state takes a value in the hidden space $S = \{s_1, s_2, \ldots, s_N\}$. The hidden Markov model is described by the $N$-dimensional initial probability matrix $\pi$, the $N \times N$-dimensional state transition matrix $A$, and the $N \times M$-dimensional conditional probability defined by the output observation matrix $B$. The three probability matrices are defined as follows:

State transition probability: the transition probability of the model among the states, usually recorded as the matrix $A = [a_{ij}]_{N \times N}$, where

$$a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i), \quad 1 \le i, j \le N.$$

Output observation probability: the probability of each observation value given the current state, usually recorded as the matrix $B = [b_i(k)]_{N \times M}$, where

$$b_i(k) = P(o_t = v_k \mid q_t = s_i), \quad 1 \le i \le N, \ 1 \le k \le M.$$

Initial probability: the probability of each state of the model at the initial time, usually denoted as $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$, where

$$\pi_i = P(q_1 = s_i), \quad 1 \le i \le N.$$
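As a concrete illustration of the three probability matrices, the following minimal Python sketch builds a toy model with two hidden states and three observation symbols and evaluates the joint probability of a state sequence and an observation sequence under the two independence assumptions stated above. All numeric values and the helper names `is_stochastic` and `joint_probability` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
# Toy HMM with N = 2 hidden states and M = 3 observation symbols.
# All numbers are made up for illustration only.
pi = [0.6, 0.4]            # pi[i]   = P(q_1 = s_i)
A = [[0.7, 0.3],           # A[i][j] = P(q_{t+1} = s_j | q_t = s_i)
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],      # B[i][k] = P(o_t = v_k | q_t = s_i)
     [0.1, 0.3, 0.6]]


def is_stochastic(rows, tol=1e-9):
    """Return True if every row is a valid probability distribution."""
    return all(min(r) >= 0 and abs(sum(r) - 1.0) < tol for r in rows)


def joint_probability(states, observations):
    """P(states, observations) under the first-order Markov assumptions.

    `states` and `observations` are equal-length lists of indices.
    """
    p = pi[states[0]] * B[states[0]][observations[0]]
    for t in range(1, len(states)):
        # Transition from the previous state, then emit the observation.
        p *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
    return p
```

Each row of `A` and `B`, and the vector `pi`, must sum to 1; `is_stochastic` is a quick sanity check for that property before the model is used for decoding.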

2.2.2. Editing Rule Coding

In this study, the shot assembly problem is modeled as a hidden Markov model. To compute the model parameters, this study adopts a coding method similar to those used in dialogue-driven automatic editing and automatic editing of dance scenes and encodes the editing rules directly [17]. Here, $q_t = i$ means that the state at time $t$ is $i$. Each rule defines its effect on the initial probability, observation probability, and transition probability.

(1) Avoid skipping: avoid a short-term interruption or abrupt change between the front and back shots during assembly. In this study, it is coded as a similarity problem between adjacent shots. When the similarity between two adjacent shots is higher than a given threshold, the transition is considered a skip, and the corresponding state transition probability is set to $10^{-5}$; otherwise, the probability is 1. This rule does not affect the observation and initial probabilities.

(2) Picture motion coherence: the picture motion intensity at the end of the front shot and at the beginning of the next shot should be as close as possible. In this study, the matching of the main body’s motion intensity is coded as the difference of the motion intensity values. The larger the difference, the lower the corresponding state transition probability, and vice versa. This rule does not affect the observation and initial probabilities.

(3) Main body position coherence: the main body positions at the end of the front shot and at the beginning of the next shot should be as close as possible. In this study, the matching of the main body position is coded as the difference of the main body’s x-direction position values. The larger the difference, the lower the corresponding state transition probability, and vice versa. This rule does not affect the observation and initial probabilities.

(4) Picture tone coherence: the picture color temperature at the end of the front shot and at the beginning of the next shot should be as close as possible. In this study, the matching of the picture color temperature is coded as the difference of the color temperature values. The larger the difference, the lower the corresponding state transition probability, and vice versa. This rule does not affect the observation and initial probabilities.
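The four rules can be turned into a transition matrix in many ways; the following Python sketch shows one possible encoding and should be read as an assumption-laden illustration rather than the formulas used in the study. Only the skip probability of $10^{-5}$ comes from the text. The decay `exp(-|difference|)` for rules (2) to (4), the multiplication of the four rule scores, the zero self-transition, the row normalization, the threshold value 0.9, and the dictionary keys are all hypothetical choices.

```python
import math

SKIP_PROB = 1e-5  # value given in the text for a detected skip


def transition_score(prev, nxt, similarity, sim_threshold=0.9):
    """Unnormalized score for joining shot `prev` to shot `nxt`.

    `prev` and `nxt` are dicts with hypothetical keys, all scaled to [0, 1]:
      end_motion / start_motion : motion intensity at shot end / start
      end_x / start_x           : main body x-position at shot end / start
      end_temp / start_temp     : color temperature at shot end / start
    """
    # Rule 1: avoid skipping (too-similar adjacent shots).
    skip = SKIP_PROB if similarity > sim_threshold else 1.0
    # Rules 2-4: larger difference -> lower score (assumed exponential decay).
    motion = math.exp(-abs(prev["end_motion"] - nxt["start_motion"]))
    position = math.exp(-abs(prev["end_x"] - nxt["start_x"]))
    tone = math.exp(-abs(prev["end_temp"] - nxt["start_temp"]))
    return skip * motion * position * tone


def transition_matrix(shots, similarity):
    """Row-normalize pairwise scores into a transition matrix A.

    `similarity[i][j]` is the similarity of shots i and j.
    Requires at least two shots; self-transitions are set to zero.
    """
    n = len(shots)
    A = []
    for i in range(n):
        row = [0.0 if i == j
               else transition_score(shots[i], shots[j], similarity[i][j])
               for j in range(n)]
        total = sum(row)
        A.append([v / total for v in row])
    return A
```

Because none of the four rules touches the observation or initial probabilities, this sketch only produces the matrix $A$; the other two matrices would be built from the remaining constraints.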

2.2.3. Layout Constraints of Lens Information

Because there is no obvious temporal relationship between the video materials, there is no natural timeline, and the timeline of the video arrangement must be set manually. To maximize the information richness of the video, constraints must be placed on the logic of information arrangement, that is, on the timeline arrangement of the subshots [18]. In this study, following the conventions of film and television shot assembly, the information layout logic is divided into four types: forward, backward, jump, and ring. By varying the shot distance, the audience’s line of sight is switched between the whole and the part, which shapes the audience’s visual experience accordingly.

(1) Forward: the displayed scope of the work goes from large to small, and the lens moves from far to near.

(2) Backward: the displayed scope of the work goes from small to large, and the lens moves from near to far.

(3) Jump: the displayed scope of the work goes directly from the largest to the smallest and then gradually increases; the lens jumps from the farthest to the nearest and then gradually pulls away.

(4) Ring: the displayed scope of the work goes from large to small and then from small to large, and the lens moves from far to near and back to far.
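The four layout types can be made concrete as target sequences of shot distances, which later play the role of the observation sequence. The Python sketch below is one hypothetical encoding: shot distance is quantized into `levels` integer classes, with 0 for the farthest shot and `levels - 1` for the nearest. The function names, the integer encoding, and the way intermediate steps are spaced are assumptions made for illustration and are not specified in the study.

```python
def _far_to_near(length, levels):
    """Monotone sequence from the farthest (0) to the nearest (levels - 1)."""
    if length == 1:
        return [0]
    return [k * (levels - 1) // (length - 1) for k in range(length)]


def layout_sequence(pattern, length, levels=3):
    """Return `length` shot-distance classes for one layout pattern.

    0 is the farthest shot (largest scope); levels - 1 is the nearest.
    """
    if length < 1 or levels < 1:
        raise ValueError("length and levels must be positive")
    if pattern == "forward":    # far -> near
        return _far_to_near(length, levels)
    if pattern == "backward":   # near -> far
        return [levels - 1 - v for v in _far_to_near(length, levels)]
    if pattern == "jump":       # farthest, jump to nearest, then pull away
        if length == 1:
            return [0]
        return [0] + [levels - 1 - v for v in _far_to_near(length - 1, levels)]
    if pattern == "ring":       # far -> near -> far
        half = (length + 1) // 2
        first = _far_to_near(half, levels)
        tail = first[::-1]
        if length % 2 == 1:
            tail = tail[1:]     # do not repeat the middle shot
        return first + tail
    raise ValueError("unknown pattern: " + pattern)
```

For example, `layout_sequence("ring", 5)` yields a sequence that starts and ends with the farthest shot and reaches the nearest shot in the middle.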

2.2.4. Sublens Assembly Modeling

The ultimate goal of video editing is to assemble a subshot sequence of a certain length from a group of candidate subshots; the final length of the sequence can be specified by the user. Given the total video time $T$ and the average subshot time $t$, the video sequence consists of about $T/t$ subshots. If the raw material contains $N$ effective subshots, the candidate space consists of all sequences of that length drawn from the $N$ subshots. The task is to select the best sequence to meet the needs of effective information expression [19]. In this study, the shot assembly problem is modeled as a hidden Markov model, and the sequence with the highest probability is obtained by the Viterbi algorithm.
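To give a sense of the size of the candidate space, the short Python sketch below computes the sequence length from the total time and the average subshot time and counts the candidate sequences both with and without repeated subshots. The function name and the rounding by floor division are assumptions for illustration; the study does not state how a non-integer ratio is handled.

```python
import math


def sequence_space(total_seconds, mean_subshot_seconds, num_subshots):
    """Return (sequence length, count with repeats, count without repeats)."""
    # Assumed rounding: keep only whole subshots that fit in the total time.
    n = int(total_seconds // mean_subshot_seconds)
    with_repeats = num_subshots ** n
    # math.perm returns 0 when n exceeds num_subshots.
    without_repeats = math.perm(num_subshots, n)
    return n, with_repeats, without_repeats
```

Even for a one-minute video with a few dozen candidate subshots, both counts are far too large to enumerate, which motivates the dynamic programming approach.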

(1) Assembly Model. In this study, video assembly is transformed into the problem of selecting the best subshot sequence from the subshot set, which is modeled as a hidden Markov model. The subshot set represents the hidden space, the shot at the $i$th node of the sequence represents the hidden state at time $i$, and the information type of the shot at the $i$th node represents the observation at time $i$. The initial probability matrix $\pi$ controls the selection of the initial shot, the transition probability matrix $A$ controls the selection of the next shot once the current shot has been determined, and the output observation matrix $B$ controls the selection of the information type.

The transition probability matrix $A$ is determined by the coherence rules and the “avoid skipping” rule. The initial probability matrix $\pi$ is determined by facial emotion constraints, and the observation matrix $B$ is determined by the information layout constraint. The video assembly problem is therefore transformed into one of the three basic problems of the hidden Markov model, the decoding problem: given the model $\lambda = (A, B, \pi)$ and the observation sequence $O = (o_1, o_2, \ldots, o_T)$, find the state sequence that best matches the observation sequence. In other words, the task is to infer the hidden shot sequence that best explains the observed data, not to train the model.

(2) Viterbi Algorithm Based on Global Constraints. In this study, the connection relationship of shots in a video clip is modeled as a hidden Markov model: the information sequence is regarded as the observation sequence $O$, which constrains the information arrangement of the shots, and the final shot sequence is regarded as the hidden sequence. Given the transition probability matrix $A$, the output observation matrix $B$, and the initial probability matrix $\pi$, the shot assembly problem is transformed into the problem of solving for the optimal hidden sequence. Because the nodes must be nonrepeating and sufficiently different from each other [20], the classical Viterbi algorithm must be improved so that it solves the optimization problem under these global constraints.

The classical Viterbi algorithm uses dynamic programming to find the optimal path. During the solution, the probability of passing through each node is recorded, and the maximum value is selected for the next node. Finally, the optimal path from the starting point to the end point is found through backtracking. $\delta_t(i)$ is defined as the probability of the best path to the node whose hidden state is $i$ at time $t$, that is, the maximum probability over all possible transition paths from the starting node to that node. $\psi_t(i)$ is defined as the hidden state at time $t-1$ on that best path, that is, the hidden state of the node preceding the current node, and $q_t^*$ is defined as the hidden state of the final sequence at time $t$. Below, $\pi_i$ is the initial probability of state $i$, $a_{ji}$ is the transition probability from state $j$ to state $i$, and $b_i(o_t)$ is the probability of observing $o_t$ in state $i$.

The initial state of the algorithm is as follows:

$$\delta_1(i) = \pi_i \, b_i(o_1), \quad \psi_1(i) = 0, \quad 1 \le i \le N.$$

The dynamic programming recursion is carried out for $t = 2, 3, \ldots, T$:

$$\delta_t(i) = \max_{1 \le j \le N} \left[ \delta_{t-1}(j) \, a_{ji} \right] b_i(o_t), \quad \psi_t(i) = \arg\max_{1 \le j \le N} \left[ \delta_{t-1}(j) \, a_{ji} \right].$$

The maximum at time $T$ is calculated, which gives the most likely hidden state at time $T$:

$$P^* = \max_{1 \le i \le N} \delta_T(i), \quad q_T^* = \arg\max_{1 \le i \le N} \delta_T(i).$$

Backtracking uses the locally stored states:

$$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1.$$

The maximum of $\delta_T(i)$ over the nodes at the last time $T$ identifies the hidden state $i$ at time $T$ whose transition path is the best among all paths, which determines the node at the end of the sequence. According to $\psi_t(i)$, the preceding node of the optimal path can be obtained at each step, and thus a complete path can be obtained.

In this process, $\psi_t(i)$ represents the hidden state of the node preceding node $i$ at time $t$ on its best path. This preceding hidden state is determined by traversing, over all nodes $j$ at time $t-1$, the product of the maximum path probability $\delta_{t-1}(j)$ and the transition probability $a_{ji}$ to the current node $i$, and selecting the state with the highest product, namely, $\psi_t(i) = \arg\max_j \left[ \delta_{t-1}(j) \, a_{ji} \right]$. This selection considers only the immediately preceding node, not all the nodes before it, so a constraint must be added that the nodes are not repeated and differ sufficiently from each other. When the preceding hidden state $j$ is selected, all hidden states before time $t-1$ on the transition path must be traversed: none of them may be $j$, and none may be too similar in picture to state $j$. If this condition is not satisfied, $j$ is discarded, the suboptimal node $k$ is selected, and the path of node $k$ is traversed in the same way, until a node is found whose path contains no repeated or overly similar nodes. This ensures the richness of video information.
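The constrained decoding can be sketched in Python as follows. This is a minimal sketch of one reading of the procedure, not the implementation used in the study: for each node it tries predecessors in order of decreasing score and keeps the first one whose stored path neither contains the current node nor contains a node judged too similar to it. The function name, the `too_similar` predicate, and the `unique` flag are hypothetical. Because the constraint depends on the whole path, keeping only one path per node is a heuristic and does not guarantee the globally best admissible sequence.

```python
def constrained_viterbi(pi, A, B, obs, too_similar=None, unique=True):
    """Viterbi decoding with a no-repeat / no-near-duplicate path constraint.

    pi[i]   : initial probability of state i
    A[j][i] : transition probability from state j to state i
    B[i][k] : probability of observation k in state i
    obs     : list of observation indices
    too_similar(i, s) -> bool : hypothetical similarity predicate
    unique=False disables the constraint (classical Viterbi).
    """
    n_states = len(pi)

    def admissible(path, i):
        if not unique:
            return True
        if i in path:
            return False
        return too_similar is None or not any(too_similar(i, s) for s in path)

    # Initialization: one path per state.
    delta = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    paths = [[i] for i in range(n_states)]

    for o in obs[1:]:
        new_delta, new_paths = [], []
        for i in range(n_states):
            best_p, best_path = 0.0, None
            for j in range(n_states):
                # Skip predecessors whose path violates the constraint.
                if paths[j] is None or not admissible(paths[j], i):
                    continue
                p = delta[j] * A[j][i]
                if best_path is None or p > best_p:
                    best_p, best_path = p, paths[j]
            if best_path is None:
                new_delta.append(0.0)
                new_paths.append(None)   # no admissible path ends in i
            else:
                new_delta.append(best_p * B[i][o])
                new_paths.append(best_path + [i])
        delta, paths = new_delta, new_paths

    candidates = [i for i in range(n_states) if paths[i] is not None]
    if not candidates:
        raise ValueError("no admissible sequence; too few distinct shots")
    end = max(candidates, key=lambda i: delta[i])
    return paths[end], delta[end]
```

With `unique=False` the function reduces to the classical recursion given above; storing the full path per node replaces the backtracking pointers and makes the constraint check straightforward for the short sequences typical of a microclass video.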

3. Results

To test the effect of the method in this study, a high school teaching practice class is selected as the experimental object. There are 80 students in the class, including 35 girls. The class’s academic performance is average, and its learning attitudes and classroom discipline are similar to those of other urban classes, so it is reasonably representative. The mechanics experiment, part of the high school physics curriculum, was covered in an earlier lesson, in which the teacher also gave a classroom demonstration. Therefore, before the video produced by this method is played, the students are reminded of that lesson so that the two teaching methods can be contrasted. A questionnaire survey then investigates and analyzes two aspects: the students’ attitude toward videos made with this method and the learning effect of those videos.

80 questionnaires are sent out, and 75 valid questionnaires are collected. First, the survey data are collected, and then, the statistical data are processed by Excel. The survey results are analyzed as follows:

The students’ attitude towards this method of making a video is set as follows.Question 1: among the following teaching resources, you are most interested in the following situations: pictures, videos, and classroom demonstrationsQuestion 2: when you watch the classroom demonstration experiment and the experimental video, you think that the teacher’s demonstration experiment, experimental video, and uncertainty are more conducive to learningQuestion 3: are you willing to learn through the teaching video in the future? It can be divided into willing, unwilling, and uncertain. The experimental results are described in Figures 4(a)4(c).

The data show that among the three teaching resources of pictures, videos, and demonstration experiments, the demonstration experiment is the most popular among students, accounting for 67.1%, far ahead of the second video, while the pictures are the least popular. It shows that the students are eager to experience the experimental operation and witness the experimental phenomenon. However, this does not mean that the demonstration experiment is the most beneficial way to learn, which can be seen from the survey results of question 2; the number of students who choose the experimental video is almost the same as the number of students who choose the demonstration experiment. This is because although the demonstration experiment can increase the sense of on-site experience, it is time-consuming and inconvenient to observe the phenomenon. The students in the back row can hardly observe the phenomenon, while the experimental video is clear and time-saving. Although the experimental video itself has some shortcomings, most students are still full of expectations for it, and 63.2% of them are willing to learn through the video in the future.

To investigate the learning effect of videos made with this method, the following two questions are set.

Question 4: does watching the video help you understand the mechanics: helpful, not helpful, or uncertain?

Question 5: compared with listening to the teacher’s explanation in the classroom, can you get the same learning effect by watching the “Mechanics” video after class: yes, no, or uncertain?

The results are described in Figures 5(a) and 5(b).

The data show that as many as 85.2% of the students think that watching the video is helpful for understanding the content asked about in question 4, while only 5% think it is not helpful. For question 5, 49.8% of the students think that the effect of self-study by watching the video is the same as that of listening in the classroom, 24.1% think that listening in the classroom is more effective, and the remaining students are uncertain. This result is understandable. Faced with a new way of learning, some students will have difficulty adapting and will prefer the way they are used to. It is encouraging that nearly 50% of the students agree with it.

The questionnaire also investigates the application effect in terms of whether the sound is accurate and fluent, whether the video is clear and coherent, and whether the video content is short and vivid. The statistical results are given in Table 1.

Table 1 shows that learners are generally very satisfied with the video produced this time. The average satisfaction is 91.36% across the options of accurate and fluent sound, clear and coherent video, short and vivid video content, and the other options, which indicates that the proposed method has a good application effect.

4. Conclusion

At present, driven by the concept of global open education resource sharing, massive open online courses (MOOC) are developing rapidly. As the large-scale open online course movement continues to be promoted, many tasks and problems remain in the digitization and automation of educational resources. The development of network video is only the tip of the iceberg in the digitization of educational resources. However, it is an effective and efficient medium for spreading teaching content in the new information technology education environment. The development of an online video course is complex, professional, high-investment work that requires large resources. From the early classroom video to the excellent course video open class and now to the MOOC video and microclass video, people have never stopped exploring how to develop high-quality video courses in the network learning environment. Traditional video production methods are outdated and no longer meet the requirements of today’s era. This study focuses on a MOOC-based video production approach that integrates the microclass with MOOC to improve the effect of video production. The experimental results show that the video produced by this method has a good application effect. The study is intended to help researchers devise novel solutions in this area.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest.