Abstract

With the rise of piano teaching in recent years, many people have taken up learning to play the piano. However, expensive tuition fees and the one-to-one teaching model have made piano education resources scarce, so learning piano performance has become a luxury. Many factors affect music performance, and it is evaluated on many dimensions, such as rhythm, expressiveness, musicality, and grasp of style. Using a computer to simulate this evaluation process essentially means identifying the mathematical relationship between the factors affecting music performance and the evaluation indicators. Using computer multimedia software for piano teaching has become a feasible way to ease this shortage. This paper discusses how piano teaching software can be implemented and the problems of computer-based piano teaching, namely that it transmits knowledge one way and lacks interaction. A neural network (NN) model is used to evaluate piano performance and to simulate a teacher guiding students through their practice. The proposed system is tested on the piano piece “Ode to Joy,” using performances that are separate from the NN training samples; the piece is played ten times by another piano teacher, student A (piano level 6), and student B (piano level 5).

1. Introduction

The scope of this research spans computer technology, music theory, and education, each of which has a rigorous theoretical basis and established research results. The work requires first understanding what each field contributes to piano teaching software and then applying that knowledge in combination. The aim is to develop an NN-based model using electronic music technology, drawing on music theory (such as rhythm and tonality) and on education (including learning and curriculum design).

Multimedia technology is not a new concept. Since the MIDI technical specification was published in 1983, MIDI-based multimedia music education has been widely used. Recently, most music education organizations have adopted digital piano teaching. A digital piano teaching system is a solution for group lessons in piano teaching. It brings together ten students, each with a piano, and a teacher, and is equipped with auxiliary teaching software, a multimedia projector, and a monitoring system. In a digital piano lesson, the teacher first introduces the relevant knowledge, and then the students play. The teacher can listen to each student’s playing during practice and guide them individually. The digital piano teaching system has its own advantages: group lessons improve teaching efficiency, and students encourage one another, which benefits the teaching. The digital piano teaching system and the learning system discussed here are similar in that both use electronic technology for piano teaching, and in both the key part is the implementation of the teaching software. The difference is that in digital piano teaching a teacher evaluates and guides the students during practice, whereas in this system the evaluation is performed by a computer.

In addition, scholars have proposed music visualization and conducted related research. The idea is to present the characteristics of the music to the audience or the players. The simplest application is in music player software, which merely uses simple graphics to display the sound level in different frequency bands. Music visualization is also used in live performance, where video responds to features of the performance; this can be implemented by extracting features from the sound wave signal or by responding directly to the MIDI signal. Music visualization has a wide range of applications. “We can not only listen to music, we can also watch it”; this idea is very attractive.

In the functional design of NN-based piano teaching software, music visualization is a major contribution to education software. Students can see their performance while playing the piano, which gives them a comprehensive grasp of the music they play. In information processing tasks, machine learning algorithms and NN representation learning are used extensively. NNs do not impose strict requirements on data distribution, process data nonlinearly, and are robust and dynamic; they also have a strong theoretical basis, which makes them well suited to evaluation. An NN is a mathematical model that computes a prediction (output) through interconnected layers (input, hidden, and output). Each layer is linked to the next by a weight matrix. Each layer consists of a number of nodes, and each node receives a number of inputs and calculates an output. Each node in the hidden and output layers computes a weighted sum of the values received from the nodes of the preceding layer. The weighted sum is then passed through a nonlinear activation function, such as the Rectified Linear Unit (ReLU), Leaky ReLU, Sigmoid, Hyperbolic Tangent (Tanh), or Softmax, to compute the node’s output.

According to the literature, research on teaching quality evaluation outside China started relatively early. The first student questionnaire appeared in 1915 [1], so Western research on teaching quality evaluation has a history of about a hundred years. At the beginning of the 20th century, many European and American countries began to explore teaching evaluation systems, but at that stage only a very small number of schools had a relatively formal process for evaluating teaching fairly and effectively. A formal teacher evaluation procedure was not formulated until the 1950s, and teaching quality evaluation systems were not widely used in universities until the 1970s. Since then, research on teaching evaluation has developed rapidly [2]. At the end of the last century there was a qualitative leap in this field, and Western countries proposed their own evaluation models and methods. The development of teaching evaluation research can be divided into the following stages [3]. In the first stage (1927–1950), no formal procedure for teacher evaluation had yet been formed, and student evaluation took place in only a very small number of universities. In the second stage (1950–1960), researchers in the late 1950s developed a formal teaching evaluation system through in-depth research on teachers’ teaching behavior, and student evaluation of teaching gradually spread in universities. Teachers’ qualities and personality characteristics began to receive attention in the evaluation process. However, because evaluation results at that time were determined mainly by the subjective impressions of managers, teaching evaluation was not yet standardized or quantified.

In the third stage (1960–1980), many colleges and universities established relatively systematic teaching evaluation systems, and a new wave of related theoretical research appeared. The biggest improvement was the integration of students’ learning results, which were introduced into the evaluation of teaching effectiveness and proved to play an irreplaceable role in the evaluation system. This stage is therefore considered the period of most rapid development in teaching evaluation [4–6].

In the fourth stage (from the 1980s to the present), teaching evaluation has become routine work in most Western universities, and the focus of research has shifted to the technical methods of evaluation and the factors that influence it. The main feature of evaluation systems at this stage is the use of performance indicators (education evaluation criteria) to specify the basic content of the evaluation, state clearly what is to be evaluated, and assign grades such as excellent, good, passing, or failing according to the actual situation [7–10].

After nearly a hundred years of development, universities in Western countries have relatively complete teaching evaluation systems, and their evaluation activities are very stable. However, because China’s education system differs substantially from Western education systems, domestic colleges and universities must find evaluation models and methods that suit their own characteristics and the context of the times [11–13].

In China, research on teaching evaluation started relatively late. The earliest evaluation of teachers’ teaching quality at a Chinese university was organized by Beijing Normal University in 1984 [14]; its purpose was to provide a reference for measuring teachers’ teaching. In May 1985, China explicitly proposed the evaluation of education for the first time when it promulgated the “Decision of the Central Committee of the Communist Party of China on Education System Reform.” In June 1985, the first national education evaluation seminar, the “Special Symposium on Evaluation of Higher Engineering Education,” was held in Heilongjiang. Since then, many domestic educators have devoted themselves to research on the theory and practice of teaching evaluation in colleges and universities, and their continuous efforts have produced many useful results. In October 1990, to provide a guarantee for the teaching evaluation system, the former National Education Commission promulgated the “Interim Regulations on Education Evaluation for Regular Higher Education Institutions” [15], marking the formalization of teaching evaluation in China.

A music-assisted learning system based on the Prolog language and the ARM and SA algorithms is proposed in [16]. The authors studied in depth the principle and operation of automatic music recording technology using artificial intelligence in order to identify the implementation principles of an automatic piano recording system. Similarly, [17] makes extensive use of Internet technology, intelligent interaction, and artificial intelligence, treating them as key factors in the development of art systems such as piano playing.

Inspired by [18–22], the contributions of this paper are as follows:
(i) This work introduces a foreign product and its shortcomings in the theoretical teaching of piano and proposes the steps and framework required for piano software.
(ii) This paper proposes a music evaluation system based on the neural network model for coaching students’ playing practice. This system is the core of the article.
(iii) We design a neural network model based on the evaluation indices of music performance with the help of piano teachers and students.
(iv) We obtain piano performance sample data, complete the training of the neural network, and evaluate the piano teaching process.
(v) We avoid using audio waveform files and use MIDI technology to extract music characteristics.

3. Material and Method

3.1. Artificial Neural Network

Artificial neurons are mathematical models designed to simulate the function of biological neurons; they must be simple yet retain the basic characteristics of biological neurons. Figure 1 shows the input and output model of a neuron.

An artificial neuron accepts a set of input signals from other neurons in the system. Each input corresponds to one weight, which is equivalent to the “connection strength” of a biological neuron, and the weighted sum of the inputs determines the activation state of the neuron. The $n$ inputs are represented by the feature vector $X = (x_1, x_2, \ldots, x_n)$, and the corresponding connection weights by the vector $W = (w_1, w_2, \ldots, w_n)$. The cumulative effect of the input signals on the neuron, that is, its network input $net$, is

$$net = \sum_{i=1}^{n} x_i w_i = XW^{T}.$$

After obtaining the network input, the neuron should produce an appropriate output. Like a biological neuron, each neuron has a threshold: it is in an excited state when the cumulative effect of its input signals exceeds the threshold and in an inhibited state otherwise. For an artificial neuron, this behavior is captured by a transfer function $f$:

$$o = f(net),$$

where $o$ is the output of the neuron. There are four typical transfer functions: linear functions, nonlinear ramp functions, step functions, and S-type (sigmoid) functions. With constants $k$, $c$, $\gamma > 0$, $\beta$, and threshold $\theta$, they can be written as follows.

Linear function:

$$f(net) = k \cdot net + c$$

Nonlinear ramp function:

$$f(net) = \begin{cases} \gamma, & net \ge \theta \\ k \cdot net, & |net| < \theta \\ -\gamma, & net \le -\theta \end{cases}$$

Step function:

$$f(net) = \begin{cases} \beta, & net > \theta \\ -\gamma, & net \le \theta \end{cases}$$

S-type function:

$$f(net) = \frac{1}{1 + e^{-net}}$$

The choice of transfer function depends on the application of the NN model. A linear function linearly amplifies the network input obtained by the neuron, and the nonlinear ramp function is used to prevent the linear function from degrading network performance. The S-type function is the most widely used; in general, the hidden layer adopts the S-type function and the output layer adopts a linear function.
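The four typical transfer functions can be illustrated with a minimal Python sketch. The parameter names (`k`, `c`, `theta`, `high`, `low`) and their default values are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def linear(net, k=1.0, c=0.0):
    """Linear transfer function: amplifies the network input by k and shifts it by c."""
    return k * net + c


def ramp(net, k=1.0, theta=1.0):
    """Nonlinear ramp: linear inside (-theta, theta), saturated outside.

    The saturation level gamma = k * theta is an assumption that keeps
    the function continuous at the breakpoints.
    """
    gamma = k * theta
    if net >= theta:
        return gamma
    if net <= -theta:
        return -gamma
    return k * net


def step(net, theta=0.0, high=1.0, low=0.0):
    """Step function: 'high' when the input exceeds the threshold, else 'low'."""
    return high if net > theta else low


def sigmoid(net):
    """S-type function 1 / (1 + exp(-net)), written to avoid overflow for large |net|."""
    if net >= 0:
        return 1.0 / (1.0 + math.exp(-net))
    e = math.exp(net)
    return e / (1.0 + e)
```

The sigmoid is split into two branches only so that very negative inputs do not overflow `math.exp`; both branches compute the same function.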

3.2. Forward Propagation of Neural Networks

Suppose the input layer of the neural network has $n$ nodes, the hidden layer has $q$ nodes, and the output layer has $m$ nodes. Let the weight between input node $i$ and hidden node $k$ be $v_{ki}$ and the weight between hidden node $k$ and output node $j$ be $w_{jk}$. Let the transfer function of the hidden layer be $f_1(\cdot)$ and that of the output layer be $f_2(\cdot)$. The output of hidden-layer node $k$ is then

$$z_k = f_1\left(\sum_{i=1}^{n} v_{ki} x_i\right), \quad k = 1, 2, \ldots, q,$$

where $x_i$ is the $i$-th input.

The output of output-layer node $j$ is

$$y_j = f_2\left(\sum_{k=1}^{q} w_{jk} z_k\right), \quad j = 1, 2, \ldots, m.$$
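As a rough illustration of forward propagation through one hidden layer, the following Python sketch computes the hidden outputs and then the network outputs. It assumes an S-type hidden layer and a linear output layer, as described for the general case in Section 3.1, and omits threshold terms; the function and variable names are hypothetical.

```python
import math


def sigmoid(net):
    """S-type transfer function used for the hidden layer in this sketch."""
    return 1.0 / (1.0 + math.exp(-net))


def forward(x, v, w):
    """Forward pass of a network with one hidden layer.

    x: list of n input values
    v: q rows of n weights (input layer to hidden layer)
    w: m rows of q weights (hidden layer to output layer)
    Returns (z, y): hidden-layer outputs and output-layer outputs.
    """
    # Hidden layer: weighted sum of the inputs passed through the sigmoid.
    z = [sigmoid(sum(v_ki * x_i for v_ki, x_i in zip(v_k, x))) for v_k in v]
    # Output layer: weighted sum of the hidden outputs, linear transfer function.
    y = [sum(w_jk * z_k for w_jk, z_k in zip(w_j, z)) for w_j in w]
    return z, y


# Two inputs, two hidden nodes, one output node.
z, y = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0]])
```

With all input-to-hidden weights set to zero, every hidden node receives a network input of 0, so each hidden output is 0.5 and the single output is their sum.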

3.3. Rhythm Feature Extraction

The rhythm feature is extracted for each bar of the score. Because the score divides the music into bars, each bar of the performance can be located accurately, and once its position in performance time is known, the rhythm and beat of the current bar can be extracted. Rhythm is the relationship between the lengths of the notes, that is, the onset time and duration of each note. The rhythm of the current bar is extracted, and the player’s grasp of the rhythm of that bar is assessed as a whole. As shown in the figure, the example score has four quarter notes in one measure.

The extraction of rhythm features can be divided into the following parts:

Align with the first note at the beginning of the bar, and record the times at which the player presses and releases each of the four notes. By the nature of rhythm, each note depends to some extent on the length of the previous note; if one note is not handled properly, the errors carry over to the notes that follow. Evaluating the mastery of rhythm therefore requires giving different weights to different notes.

Calculate the difference between the onset time of each note and the standard value, and multiply it by the weight of the note to obtain the degree of grasp of the rhythm of this bar.

The quantities involved are the sequence number of each note in the measure, the time at which the player presses the key, the standard press time, the time at which the player releases the key, the standard release time, and the weight assigned to each note.
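The weighted comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the deviation of a note is taken as the sum of the absolute differences of its press and release times from the standard values, and the note weights are hypothetical values that emphasize the first note of the bar.

```python
def rhythm_deviation(played, standard, weights):
    """Weighted timing deviation of one bar.

    played, standard: lists of (press_time, release_time) in seconds, one per note
    weights: one weight per note (hypothetical values, larger for earlier notes)
    """
    total = 0.0
    for (p_on, p_off), (s_on, s_off), w in zip(played, standard, weights):
        # Assumption: press and release deviations are both measured as absolute differences.
        total += w * (abs(p_on - s_on) + abs(p_off - s_off))
    return total


# A bar of four quarter notes lasting two seconds in total.
standard = [(0.0, 0.5), (0.5, 1.0), (1.0, 1.5), (1.5, 2.0)]
weights = [0.4, 0.2, 0.2, 0.2]
late_first_note = [(0.1, 0.5), (0.5, 1.0), (1.0, 1.5), (1.5, 2.0)]
deviation = rhythm_deviation(late_first_note, standard, weights)
```

A performance that matches the standard times exactly gives a deviation of zero, and a delay on the first note is penalized more heavily than the same delay on a later note.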

3.4. Beat Feature Extraction

The beat feature quantifies the volume of the notes in each measure. As with the rhythm feature, the strength of the first note of each bar is particularly important. In the same way, different weights are applied to different notes to extract the beat feature of the bar.

The quantities involved are the sequence number of each note in the bar, the volume of the player’s key press, the standard key volume, and the weight of each note; the volume of a chord is converted to a single value before comparison.

4. Experiments and Discussion

4.1. Determination of the Parameters

First, consider how music performance is evaluated manually. The elements involved in the evaluation include the performance setting, the player, the instrument, the listeners, and the listeners’ feeling for the music. The setting affects the audience mainly by processing the sound emitted by the instrument, for example through reflection and reverberation, and the instrument affects the audience through the sound it produces. The way the player operates the instrument makes it produce different sounds, and this is the process that affects the audience most. The evaluation system studied here aims to evaluate the player’s performance, so the player’s operation of the instrument is the input we need to obtain.

Music is composed of the sounds emitted by the instrument, and the player determines those sounds. This work uses a MIDI instrument, and the MIDI messages issued while playing are the input we need. After the MIDI signal of each sound is quantified, it could in principle be used as input to the NN. However, a piece of music contains a large number of sounds. Since each input-layer neuron corresponds to one input parameter of the system, the NN would become very large, which would directly reduce its efficiency and could even make the NN design fail. Using the features of each individual sound as input parameters is therefore not feasible. Instead, the individual sounds are summarized into the factors that affect performance. According to music theory, these factors include rhythm, beat, chords, melody, and tonality, so they are used as the input parameters of the NN evaluation model.

4.1.1. Parameter Determination of Pitch Characteristics

The characteristics of a sound are its pitch, intensity, and duration (“high,” “strong,” and “long”), in addition to its tone color. Tone color is ignored, and the other features are parsed from the messages issued by the MIDI instrument. As noted previously, the features of individual sounds cannot be used directly as input parameters of the system; following music theory, we instead aggregate the features of all tones in a bar into input parameters. The aggregated intensity and duration features correspond to beat and rhythm, respectively, so only the pitch feature needs to be aggregated here.

The pitch value obtained from a MIDI message ranges up to 127. A difference of one in this value corresponds to two adjacent keys on the piano keyboard (black and white keys are not distinguished), that is, a semitone. If the pitch obtained during playing differs from the standard, the wrong key was pressed; this is a key error. Key errors have the most serious impact on the music, so any difference, however small, counts as an error. The aggregated pitch factor of each bar is quantified only by the proportion of wrong notes, that is, the number of key errors in the measure divided by the total number of notes in the measure.

The first bar of “Ode to Joy” has four tones, B, B, C, and D, whose pitch numbers in the MIDI messages are 71, 71, 72, and 74, respectively. If all four notes are played wrong, the input parameter is 4/4 = 1; if three are wrong, the parameter is 3/4 = 0.75; if two are wrong, it is 2/4 = 0.5; and if only one is wrong, it is 1/4 = 0.25. The entire score has 16 bars, corresponding to 16 input parameters; that is, the pitch feature requires 16 input-layer neurons.
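The pitch parameter of a bar, defined as the number of key errors divided by the number of notes in the bar, can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the played notes have already been aligned one-to-one with the notes of the score.

```python
def pitch_error_ratio(played, standard):
    """Fraction of notes in a bar whose MIDI pitch differs from the score.

    played, standard: lists of MIDI pitch numbers, aligned note by note (assumption).
    """
    errors = sum(1 for p, s in zip(played, standard) if p != s)
    return errors / len(standard)


# First bar of "Ode to Joy" as given in the text: B, B, C, D.
first_bar = [71, 71, 72, 74]
one_wrong = pitch_error_ratio([71, 70, 72, 74], first_bar)
```

Applying the function to each of the 16 bars would give the 16 pitch inputs of the network.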

4.1.2. Parameter Determination of Rhythm Characteristics

Rhythm is characterized by lengths of time. Time in the score is relative and must be converted to absolute time before differences in note length can be measured. A player does not perform at exactly the same speed every time, but this difference does not affect the quality of the performance. The standard time is therefore not absolute and must be adjusted to the absolute time of the player’s performance. For example, “Ode to Joy” has 16 bars, and the standard absolute time of each bar is two seconds, so the standard length of the piece is 32 seconds. If a performance lasts 40 seconds, the standard time of each bar must be adjusted to $40/16 = 2.5$ seconds. The standard time of each note within its bar must be adjusted in the same proportion.
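The adjustment of the standard times to the player’s tempo amounts to multiplying every standard time by the ratio of the played length to the standard length. A minimal Python sketch, with hypothetical names, is:

```python
def scale_standard_times(standard_times, standard_total, played_total):
    """Rescale standard times (in seconds) to the overall length of a performance."""
    factor = played_total / standard_total
    return [t * factor for t in standard_times]


# Standard piece length 32 s, performance length 40 s: a 2 s bar becomes 2.5 s.
bar_length = scale_standard_times([2.0], 32.0, 40.0)
# Note onsets inside a bar are rescaled by the same factor.
onsets = scale_standard_times([0.0, 0.75, 1.0], 32.0, 40.0)
```

The same factor of 1.25 applies to bar lengths, note onsets, and note durations, so the relative rhythm of the score is preserved.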

Take the fourth bar of “Ode to Joy” as an example. It contains three notes: the first is a dotted quarter note lasting 1.5 beats, the second is an eighth note lasting 0.5 beats, and the third is a half note lasting two beats. The standard onset times of the notes are 0 seconds, 0.75 seconds, and 1 second, and the standard durations are 0.75 seconds, 0.25 seconds, and 1 second, respectively. If the played onset times are 0.1 seconds, 0.8 seconds, and 1.2 seconds and the played durations are 0.7 seconds, 0.3 seconds, and 0.8 seconds, the differences in onset time and in duration are

$$|0.1 - 0| + |0.8 - 0.75| + |1.2 - 1| = 0.35,$$

$$|0.7 - 0.75| + |0.3 - 0.25| + |0.8 - 1| = 0.3.$$

The impact of duration on performance is much smaller, so the duration difference of 0.3 is multiplied by a weight of 0.3 before it contributes to the input parameter that describes the grasp of the rhythm of this bar. The rhythm parameters likewise correspond to 16 input-layer neurons.

4.1.3. Determination of the Beat Feature Parameters

The beat describes the pattern of strong and weak notes. This pattern is only loosely specified, and strength is a relative value. The MIDI signal quantizes strength into 127 discrete values, so strong and weak notes can be assigned absolute values. For each bar of the music, we take the average over input samples from players of a very high level (such as piano teachers) and use it as the standard value.

In the first bar of “Ode to Joy,” the standard values of the four notes are 100, 70, 90, and 70, corresponding to strong, weak, strong, and weak. The values obtained from the player’s performance are 98, 75, 80, and 70, respectively. The beat parameter of this bar is computed from the differences between the played values and the standard values. The same is done for all 16 bars, so 16 input-layer neurons are required.
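One possible way to turn the key-press volumes of a bar into a single beat parameter is sketched below in Python. The paper’s exact formula is not reproduced here; the sketch assumes a weighted sum of absolute differences, normalized by the maximum MIDI value of 127 and by the sum of the weights so that the result lies between 0 and 1. The function name and the default equal weights are hypothetical.

```python
def beat_deviation(played, standard, weights=None, max_velocity=127):
    """Normalized, weighted difference between played and standard key volumes.

    played, standard: lists of MIDI volume values for the notes of one bar
    weights: optional per-note weights (equal weights if omitted; assumption)
    """
    if weights is None:
        weights = [1.0] * len(standard)
    weighted = sum(w * abs(p - s) for p, s, w in zip(played, standard, weights))
    # Assumption: dividing by 127 and by the total weight maps the result into [0, 1].
    return weighted / (max_velocity * sum(weights))


# Values from the first bar of "Ode to Joy" given in the text.
deviation = beat_deviation([98, 75, 80, 70], [100, 70, 90, 70])
```

For the values in the text, the absolute differences are 2, 5, 10, and 0, so with equal weights the sketch returns 17 divided by 508.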

4.1.4. Parameter Determined by Chord

A chord is a set of tones sounded at the same point in time. From the score and the standard MIDI file, the time point of each chord is known in advance, and the arrival of each chord time point is detected during playing. For each chord, the differences in pitch, intensity, and time from the standard values are calculated by the methods described above. For pitch, the tones are first aligned: if the root tone is wrong, the chord counts as 1 error; if another tone is wrong, it counts as 0.5. The total error count over all chords is then divided by the total number of chords to give the chord pitch input parameter, which corresponds to one input neuron. The intensity and timing of chords are determined in the same manner as for single tones, and the sums of their differences are used as the input parameters for chord intensity and chord timing, corresponding to two more input neurons.
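The chord pitch rule (1 for a wrong root, 0.5 for another wrong tone, divided by the number of chords) can be sketched in Python as follows. The representation of a chord as a list of MIDI pitches with the root first is an assumption made for this illustration, as is the function name.

```python
def chord_pitch_error(played_chords, standard_chords):
    """Average chord pitch error over a piece.

    Each chord is a list of MIDI pitches with the root tone first (assumption).
    A wrong or missing root counts as 1 error; otherwise any other wrong tone counts as 0.5.
    """
    total = 0.0
    for played, standard in zip(played_chords, standard_chords):
        if not played or played[0] != standard[0]:
            total += 1.0
        elif set(played[1:]) != set(standard[1:]):
            total += 0.5
    return total / len(standard_chords)


standard = [[60, 64, 67], [62, 65, 69]]
played = [[60, 64, 67], [62, 65, 68]]   # second chord has a wrong upper tone
error = chord_pitch_error(played, standard)
```

In this example only the second chord contains an error, and it is not in the root, so the parameter is 0.5 divided by 2 chords.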

4.2. Backward Propagation Determination of Neural Network Parameters

The primary problem in applying an NN to the music performance evaluation system is designing the network. Overall network design is a comprehensive problem that must meet several requirements, such as ease of implementation, fast training, and, most importantly, good generalization capability. Generalization capability is the ability to respond correctly to samples that do not appear in the training set but follow the same law. It depends on the sample data, the network structure, and the network algorithm. Therefore, fully exploiting the information in the input parameters, improving the network structure, and improving the network algorithm can each improve generalization to some extent.

The choice of MSE (mean square error) is relatively reasonable during NN training.

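In the standard BP algorithm, the error for the $p$-th sample is defined by

$$E_p = \frac{1}{2}\sum_{j=1}^{m}\left(t_j^{p} - y_j^{p}\right)^2,$$

where $m$ is the number of output nodes, $t_j^{p}$ is the expected output of node $j$ for sample $p$, and $y_j^{p}$ is its actual output.

The weight matrix is modified after each sample. Since each modification does not consider whether the output error for the other samples also decreases after the weights are changed, the number of iterations increases.

The global error of the cumulative-error BP algorithm is defined by

$$E = \frac{1}{2}\sum_{p=1}^{P}\sum_{j=1}^{m}\left(t_j^{p} - y_j^{p}\right)^2 = \sum_{p=1}^{P} E_p,$$

where $P$ is the number of training samples.

This algorithm reduces the global error over the entire training set rather than the error of a particular sample, so a modification that reduces the global error does not necessarily reduce the error of every individual sample. The global error cannot be used to compare the performance of networks with different $P$ and $m$, because for the same network, the larger $P$ is, the larger $E$ is, and for the same $P$, the larger $m$ is, the larger $E$ is.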
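The relationship between the per-sample error, the global error, and the mean square error can be illustrated with a short Python sketch. The MSE is taken here in its usual form, the sum of squared errors divided by the number of samples times the number of outputs; this normalization and the function names are assumptions for illustration.

```python
def sample_error(target, output):
    """Error of one sample: half the sum of squared output differences."""
    return 0.5 * sum((t - y) ** 2 for t, y in zip(target, output))


def global_error(targets, outputs):
    """Cumulative error over all samples: the sum of the per-sample errors."""
    return sum(sample_error(t, y) for t, y in zip(targets, outputs))


def mean_square_error(targets, outputs):
    """Squared error averaged over all samples and all output nodes."""
    n_samples, n_outputs = len(targets), len(targets[0])
    total = sum((t - y) ** 2
                for tgt, out in zip(targets, outputs)
                for t, y in zip(tgt, out))
    return total / (n_samples * n_outputs)


targets, outputs = [[1.0, 0.0]], [[0.5, 0.5]]
# Duplicating the sample doubles the global error but leaves the MSE unchanged.
e_one, e_two = global_error(targets, outputs), global_error(targets * 2, outputs * 2)
mse_one, mse_two = mean_square_error(targets, outputs), mean_square_error(targets * 2, outputs * 2)
```

The duplicated-sample comparison shows why the global error grows with the number of samples while a normalized measure such as the MSE stays comparable across data sets of different sizes.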

4.3. Training the Neural Network Model

After determining the input parameters, structure, and parameters of the NN model, we can train it. First, the standard values of the features are obtained from the MIDI file and the piano teacher, and then input features are extracted from piano performances at different levels. For “Ode to Joy,” two piano teachers and three students played the piece several times to provide the training samples. After each performance, the input data of the NN model are extracted, and the overall quality, rhythm, and expressiveness of the performance are evaluated manually; all data lie in the range 0 to 1. In total, 10 training samples were collected. The learning rate η was set to 0.5 and the target error E to 0.001. After 8,000 training iterations, the error fell below 0.001 and the network converged.
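The training procedure can be sketched in plain Python as a small BP network trained sample by sample with a learning rate of 0.5, a target error of 0.001, and at most 8,000 epochs. Everything else in the sketch is an assumption made for illustration: the four hidden nodes, the sigmoid output layer (chosen because the targets lie between 0 and 1), the bias terms, the random initialization, and the four toy samples, which are not the paper’s data.

```python
import math
import random


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def forward(x, v, w):
    """Forward pass; index 0 of each weight row is a bias weight (assumption)."""
    x_ext = [1.0] + list(x)
    z = [sigmoid(sum(a * b for a, b in zip(v_k, x_ext))) for v_k in v]
    z_ext = [1.0] + z
    y = [sigmoid(sum(a * b for a, b in zip(w_j, z_ext))) for w_j in w]
    return x_ext, z, z_ext, y


def mse(samples, v, w):
    """Mean square error over all samples and outputs."""
    total, count = 0.0, 0
    for x, t in samples:
        y = forward(x, v, w)[3]
        total += sum((t_j - y_j) ** 2 for t_j, y_j in zip(t, y))
        count += len(t)
    return total / count


def train(samples, hidden=4, eta=0.5, target_error=0.001, max_epochs=8000, seed=0):
    rng = random.Random(seed)
    n, m = len(samples[0][0]), len(samples[0][1])
    v = [[rng.uniform(-0.5, 0.5) for _ in range(n + 1)] for _ in range(hidden)]
    w = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)] for _ in range(m)]
    initial = mse(samples, v, w)
    for epoch in range(1, max_epochs + 1):
        for x, t in samples:
            x_ext, z, z_ext, y = forward(x, v, w)
            # Output deltas for a sigmoid output layer.
            d_out = [(t_j - y_j) * y_j * (1 - y_j) for t_j, y_j in zip(t, y)]
            # Hidden deltas, computed before the output weights are changed.
            d_hid = [z_k * (1 - z_k) * sum(d_out[j] * w[j][k + 1] for j in range(m))
                     for k, z_k in enumerate(z)]
            for j in range(m):
                for k in range(hidden + 1):
                    w[j][k] += eta * d_out[j] * z_ext[k]
            for k in range(hidden):
                for i in range(n + 1):
                    v[k][i] += eta * d_hid[k] * x_ext[i]
        if mse(samples, v, w) < target_error:
            break
    return v, w, initial, mse(samples, v, w), epoch


# Hypothetical toy samples: three deviation features and one overall score in [0, 1].
SAMPLES = [([0.0, 0.0, 0.0], [0.95]), ([0.25, 0.1, 0.1], [0.75]),
           ([0.5, 0.3, 0.2], [0.6]), ([1.0, 0.6, 0.5], [0.3])]
v, w, initial_error, final_error, epochs = train(SAMPLES)
```

In the system described in this paper, the input vector would hold the pitch, rhythm, beat, and chord parameters of a performance, and the targets would be the manually assigned scores; the toy data above only stand in for them.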

4.4. Realization and Performance Analysis of Teaching Evaluation Function

The performance of the system was tested on the piano piece “Ode to Joy,” using performances separate from the NN training samples; the piece was played ten times by another piano teacher, student A (piano level 6), and student B (piano level 5). Table 1 describes the final statistical evaluation results of the piano teacher’s performance. Table 2 describes the final statistical evaluation results of student A’s performance. Table 3 describes the final statistical evaluation results of student B’s performance.

From the statistics in the above tables, the average value of the comprehensive evaluation of teachers is 0.9088, the average expressiveness is 0.8346, and the sense of rhythm is 0.8332. It can be seen that the system evaluation value is consistent with the expected value of the piano teacher training sample. The average comprehensive evaluation of student A is 0.7169, the average expressiveness is 0.6516, the average rhythm is 0.6694, and the system evaluation value is in line with the student’s playing level. The average comprehensive evaluation of student B is 0.606, the average expressiveness is 0.5852, the sense of rhythm is 0.598, and the system evaluation value is in line with the student’s playing level. It can be seen that student B’s comprehensive evaluation value is lower than that of student A, which is also in line with the actual situation of the students.

Figure 2 illustrates a comparison of the accuracy and comprehensive evaluation of piano playing by three persons. In Figure 3, the performance comparison of piano playing by three persons is represented. In Figure 4, the comparison of the rhythm of piano playing by three persons is represented.

According to the comparison charts, it can be seen that, except that the scores of individual performances do not meet the actual level of the performers, the output of the evaluation system can meet the requirements, which also shows that the NN model is feasible for the music evaluation system.

5. Conclusions

Artificial intelligence is used to simulate people’s activities, behaviors, and ideas so that machines can take over more of the production activities that previously only people could complete, reducing people’s burden and increasing production efficiency. Through in-depth study of NN models, piano teaching methods, music theory, and MIDI technology, this paper proposes a framework for a music evaluation system and piano teaching software that uses an artificial NN model to simulate a piano teacher.

The main aims of this work can be summarized in the following points.

The current state of domestic piano teaching is analyzed: costs are high, piano teachers are scarce, and demand for learning the piano is growing. A method of teaching piano with computer music is proposed. Its core is the piano teaching software, which simulates the teaching process of a piano teacher. Piano teaching must cover both theoretical knowledge and the coaching of students’ playing practice. This paper introduces a foreign product and its shortcomings in the theoretical teaching of the piano and proposes the steps and framework required for piano software. It also proposes a music evaluation system based on the NN model for coaching students’ playing practice; this system is the core of the article.

We avoid using audio waveform files and use MIDI technology to extract music characteristics. Based on music theory, the features that determine the quality of piano playing are identified and converted from the MIDI signal into the input parameters of the NN.

In terms of piano teaching software, the characteristics of two foreign software applications have been described in detail, the characteristics and functions of piano teaching software in China have been analyzed, and the framework design of the software has been completed. The system implementation of piano playing practice includes the extraction of the MIDI signal, the construction of the NN model, the playing interface, the staff (five-line score) animation, and the evaluation interface. All coding work was completed on the Win32 platform.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.