Psychological music experiments and models are plentiful, but psychological rhythm experiments and models are few. Physiological music experiments are likewise plentiful, yet physiological music models are scarce; physiological rhythm experiments are rare, and no physiological rhythm model exists. We propose a physiological rhythm model to fill this gap. Twenty-two participants, 4 drum loops as stimuli, and the electrocardiogram (ECG) were employed in this work. We designed an algorithm that maps tempo, complexity, and energy onto two heart rate variability (HRV) measures, the standard deviation of normal-to-normal heartbeat intervals (SDNN) and the ratio of low- to high-frequency power (LF/HF); these two measures form a physiological valence/arousal plane. There were four major findings. First, simple and loud rhythms enhanced arousal. Second, the removal of fast and loud rhythms decreased arousal. Third, fast rhythms increased valence. Finally, the removal of fast and quiet rhythms increased valence. Our work extends the psychological model to a physiological one and deepens the musical model into a rhythmic one. Moreover, this model could supply the rules for automatic music generation systems.

1. Introduction

The relation of music to emotion has been studied for decades and the literature is fruitful [1]. There exist many psychological models relating music to emotion [2], but the physiological models are limited [3]. One physiological measure, heart rate variability (HRV), which is controlled by the autonomic nervous system (ANS), is tightly connected with emotion [4]. Previously, we analyzed the relationship between musical rhythms and HRV [5] and built two heuristic models [6, 7]. In this paper, a systematic algorithm is proposed to construct new models.

Musical emotions change with psychophysiological measures and musical features [8], and three basic questions are highlighted [9]: how do musical features evoke emotions; how do the actions involved in musical emotions progress; and which actions and brain processes are involved in musical emotions. Basically, people feel what music expresses, but not always; in one simple case, only 61% of 45 participants felt what they perceived [10]. More particularly, a recently developed theory suggests a stronger correlation: aesthetic awe is accompanied by being moved (cognitive), emotions (psychological), and thrills (physiological) at the same time [11]. These three associated levels of musical response should be analyzed individually and can then be discussed together. In music psychology research, the perceived and felt emotions are sometimes examined separately; sometimes the music clips are simply labeled by the researchers, without mention of whether the labels are perceived or felt [12]. The mappings between music space and emotion space [13, 14] go by the following synonyms: music mood detection [15]; music emotion measurement [12], characterization [12], recognition [16, 17], classification [12, 18], prediction [19], or modeling [20]; the review articles demonstrate the fruits of experts' interest [16–20].

In a valuable model, four elementary properties are goodness-of-fit [21] (generalization [22]), simplicity [21, 22], predictability [21, 22], and the relation to existing theories [22]. Beyond cognitive reactions, the prediction of emotional reactions is also important in music psychology [22, 23]. In addition to an existing review [24], a survey of the models of emotional responses [25–45] is given in this paper. The acoustic features, emotion spaces, and methods used to generate these models are listed in Table 1, with four major observations. First, regression analysis [30, 31, 36, 38–40, 42, 43, 45] is widely employed, while some soft computing methods (fuzzy logic [41]; support vector machines [34, 37]; neural networks [35]; K-nearest neighbors [27]; Gaussian mixture models [25]) are also used. Second, beyond the ubiquitous valence and arousal, tension [29, 32, 36, 40, 44] is another common dimension. Third, the felt emotion [28, 35] is examined infrequently compared to the perceived emotion. Finally, most works employ a wide range of acoustic features, while the interest in single features has moved from pitch [40, 44] to timbre [28] and rhythm [26]. For the emotion space, Russell [13] proposed the valence/arousal space and Thayer [14] reduced it to four labels; the following synonyms are commonly used: valence as pleasantness [38]; arousal as activity [36, 45]; tension [29, 32, 36, 40, 44] as interest [43], expectancy [40], strength [38], potency [38], and resonance [27].

Whether for perceived or felt emotion, the three commonly used methods, valence/arousal dimensions, lists of basic emotions, and diverse emotion inventories, are not complete for the study of musical emotion [46]. For a more comprehensive overview [4], the underlying mechanisms of the central nervous system (CNS) [46, 47] and the ANS [48–50] should also be considered. In the most essential sense, acoustic properties are perceived by the nervous system and evoke psychophysiological responses. For example, fast and loud sounds usually cause emotional arousal and increased respiratory and heart rates through the auditory and limbic systems [46, 47]. Some related physiological reactions coincide with the emotion expressed in music [48] and with the rated valence/arousal level [49]. Therefore a valence/arousal model [50] of musical emotion has been built on the associated physiological measures, including electromyogram, electrocardiogram, skin conductivity, and respiration changes; extended linear discriminant analysis (pLDA) was employed to classify the musical emotions.

Since emotion also correlates with physiological responses [48–50], models of musical emotion have been developed by integrating acoustic features and physiological measures [51–53]. Despite incorporating physiological measures, these models [51–53] remain similar to the psychological models mentioned above [25–45] because they target the same perceived or felt emotion space. Using acoustic features to model the physiological responses themselves [3] seems more radical if the underlying mechanisms [46, 47] are considered. That model [3] employs 11 musical characteristics and concludes that rhythmic features are the major factors in the physiological responses (respiration, skin conductance, and heart rate) to music. It also points out a limitation: some acoustic features correlate with each other, which makes it difficult to discriminate their relative contributions to the detected relationships.

The rest of this paper is organized as follows. In Section 2, the rhythmic features for modeling and the HRV features for the physiological emotion space are introduced. In Section 3, the experiment and the developed algorithm are presented. In Section 4, the models of the relationships between the rhythmic and the HRV features are illustrated in figures, and the equations and tables of statistics are provided. In Section 5, how tempo, complexity, and energy work in the psychological experiments, psychological models, physiological experiments, and physiological models is reviewed and compared. Finally, in Section 6, we summarize the contributions made in this study and suggest directions for further research.

2. Preliminary

2.1. Why Are Physiological Models Necessary?

To study the relation of musical features to emotion, most psychological models focus on the perceived emotion, as Table 1 demonstrates. However, the perceived emotion need not equal the felt emotion [10, 54], and the felt emotion may not have related physiological responses either [55]. If the music clips are selected by the participants themselves, the ranks of the perceived and felt emotions do not differ significantly; otherwise the differences are statistically significant [54]. Moreover, some patterns of physiological response appear with no related self-report [55]. Since the perceived emotion, felt emotion, and physiological reactions play different roles, a physiological model is necessary beyond the psychological models.

2.2. Valence/Arousal Model

Although four dimensions may be necessary for a complete emotion space [56], it is difficult to realize a four-dimensional model. Three dimensions are also employed in the psychological models [27, 36, 38, 43]. Some work [57] showed that three dimensions could be reduced to two without significant loss of goodness-of-fit; it also showed that the dimensional model has better resolution than the discrete (categorical) model. Thus Russell's two-dimensional valence/arousal model [13] retains its merit, within the ebb and flow of relevant works, in the discipline of psychology.

2.3. A Physiological Valence/Arousal Model

SDNN and LF/HF, two measures of HRV [58], are employed as the two dimensions of our physiological valence/arousal model. SDNN is the standard deviation of normal-to-normal heartbeat intervals in the time domain, and LF/HF is the ratio of low- to high-frequency power after the fast Fourier transform (FFT); SDNN presents the variation of the circulatory system, and LF/HF presents the balance of the sympathetic and parasympathetic nervous systems [59]. HRV is highly correlated with emotion [60–64], and some evidence [65–70] reveals that SDNN is a good physiological indicator of valence. In general, normal subjects' SDNNs are higher than depressed subjects' [65–67]. Among normal subjects, the SDNN levels of those in a positive mood are higher than of those in a negative mood [68]. Among depressed patients, the SDNN levels of low-depression patients are higher than those of high-depression patients [69, 70]. All of these findings reach statistical significance; hence SDNN is a proper indicator of physiological valence. In addition, increased LF/HF denotes stronger sympathetic nerve activity, so LF/HF can be used as a physiological indicator of arousal level [59].
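To make the plane concrete, the sketch below (a Python illustration of ours, not from the paper; the helper name and the zero-threshold convention are assumptions) maps baseline-relative changes in the two HRV measures onto Thayer-style quadrant labels, with SDNN as the valence axis and LF/HF as the arousal axis.

```python
def quadrant(d_sdnn, d_lfhf):
    """Map baseline-relative changes in SDNN (valence axis) and
    LF/HF (arousal axis) to a Thayer-style quadrant label."""
    valence = "positive" if d_sdnn >= 0 else "negative"
    arousal = "high" if d_lfhf >= 0 else "low"
    return f"{valence} valence / {arousal} arousal"

# A rise in both measures reads as the excited/happy quadrant:
print(quadrant(+8.0, +0.4))   # positive valence / high arousal
print(quadrant(-5.0, -0.2))   # negative valence / low arousal
```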

2.4. Musical Features Employed in the Model

The three most important features of music are rhythm, pitch, and timbre [71–76]. Although pitch [77] and timbre [78] have been employed to recognize emotion in music, the role of rhythm seems more fundamental. In fact, rhythmic features [26] have been used to model emotional responses with reasonable performance.

An all-encompassing representation of rhythm may be the temporal organization of sound, tightly connected with meter, which refers to periodic composition in music [23]. Musical rhythm may have its roots in the motor rhythms controlling heart rate, breathing, and locomotion [79], dominated by the brain stem, an ancient structure [47]; fast, loud sounds produce increased activation of the brain [47], and some evidence suggests that musical rhythm can regulate emotion and motivational states [80].

Rhythm is less tractable than pitch and timbre when localizing its specific neural substrates [81]. However, many cultures emphasize rhythm, with the other two features playing a less crucial role [82]. To study the neural basis of rhythm, brain imaging, psychophysical, and computational approaches have been employed [83]. Beat and meter induction are fundamental elements of the cognitive mechanisms [84], while representations of metric structure and neural markers of beat expectation have been found in both electroencephalography (EEG) and magnetoencephalography (MEG) [85]. Beat perception is innate; newborn infants expect downbeats (onsets of rhythmic cycles) even when they are unmarked by stress or spectral features [86]. Infants also engage in rhythmic movement to rhythmically regular sounds, and faster movement tempo is associated with faster auditory tempo [87]. Although beat perception is innate, the ability to detect rhythmic changes is more developed in adults than in infants [88] and in trained musicians than in untrained individuals [89].

Of all rhythmic features, experts suggest that tempo, complexity (regularity), and energy (intensity, strength, dynamics, loudness, and volume) [15, 27, 90, 91] are the most significant.

3. Methods

3.1. Experiment
3.1.1. Participants

Twenty-two healthy subjects, 15 male and 7 female, took part in the experiment. The average age was 23. None of them had been professionally trained in music.

3.1.2. Musical Stimuli

There were four drum loops in this study (L1 to L4). The parameters (tempo, complexity, and energy) are listed in Table 2 and the rankings are illustrated in Figure 1.

3.1.3. Apparatus

The ECG signal was captured by a 3-channel portable device (MSI E3-80, FDA 510(k) K071085) at a 500 Hz sampling rate from the chest surface of the body. Only the channel-1 data were analyzed.

3.1.4. Procedures

All experiments were carried out at moderate temperature, humidity, and light, with the subjects sitting, eyes closed, wearing headphones in a quiet room. Each subject completed 4 rounds of the experiment on different days. Each round took 15 minutes, divided into 5 stages, as Figure 2 illustrates. Stage 1 gave the subjects 5 minutes to calm down. Stage 2 was 3 minutes of rest, the baseline for the responses during the drum loops. Stage 3 was 2 minutes of rest, the baseline for the responses after the drum loops. Stage 4 was 3 minutes of stimulus by one of the drum loops. Stage 5 was 2 minutes of rest. The ECG signal from stage 2 to stage 5 was recorded and separated into epoch 1 (E1) to epoch 4 (E4). Comparison 1 (C1) was the difference between E1 and E3, and comparison 2 (C2) was the difference between E2 and E4.
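The epoch bookkeeping can be sketched as below (a Python illustration with placeholder HRV values; the sign convention, stimulus or post-stimulus epoch minus its baseline, is our assumption, since the paper only says "difference").

```python
# Stage layout (minutes): calm 0-5, rest 5-8 (E1), rest 8-10 (E2),
# stimulus 10-13 (E3), rest 13-15 (E4).  Only E1-E4 are analyzed.
def comparisons(hrv_by_epoch):
    """hrv_by_epoch: dict epoch -> HRV value (e.g., SDNN in ms).
    C1 contrasts the stimulus epoch with its baseline (E3 vs. E1);
    C2 contrasts the after-stimulus epoch with its baseline (E4 vs. E2)."""
    c1 = hrv_by_epoch["E3"] - hrv_by_epoch["E1"]
    c2 = hrv_by_epoch["E4"] - hrv_by_epoch["E2"]
    return c1, c2

print(comparisons({"E1": 50.0, "E2": 48.0, "E3": 57.5, "E4": 46.0}))
# (7.5, -2.0)
```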

3.2. Signal Processing
3.2.1. Musical Features

Musical rhythm is expressed by successive notes that carry the relevant temporal information. Tempo (T) is measured in beats per minute (bpm) [92].

The second feature, perceptual complexity (C), was obtained by asking the subjects to judge the complexity of the rhythms they had listened to, rated from 1 to 4 on a Likert scale (4 being the most complex) [93].

The energy parameter (E) was defined as E = Σn x[n]², the summation of the squares of x[n], where x[n] is the amplitude of the signal [94, 95].
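The three features per loop can be collected as follows (a sketch; tempo and the Likert rating are taken as given, while the energy follows the sum-of-squares definition above; the sample values are placeholders).

```python
def rhythm_features(bpm, likert_complexity, samples):
    """Return (T, C, E) for one drum loop.
    T: tempo in bpm; C: 1-4 Likert rating; E: sum of x[n]^2."""
    energy = sum(x * x for x in samples)
    return bpm, likert_complexity, energy

T, C, E = rhythm_features(120, 2, [0.0, 0.5, -0.5, 0.25])
print(T, C, E)  # 120 2 0.5625
```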

3.2.2. HRV Features

To acquire the HRV features, QRS detection was the first step [96–98], where R denotes the peak of a heartbeat in the ECG signal. After the abnormal beats were rejected [99], the mean of the R-R intervals (MRR), the standard deviation of normal-to-normal R-R intervals (SDNN), and the root mean square of successive differences of adjacent R-R intervals (RMSSD) were measured in the time domain [59]. After interpolation [100] (in preparation for the FFT) and detrending [101] (to filter the respiratory signal), the FFT [102] was applied to calculate the low-frequency (LF) and high-frequency (HF) powers and their ratio (LF/HF). The results for SDNN and LF/HF are listed in Table 2. Four groups of data, SDNN C1, SDNN C2, LF/HF C1, and LF/HF C2, were observed in our analysis.
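The pipeline can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the paper's exact chain: linear resampling at 4 Hz, mean removal as a crude detrend, and a naive DFT in place of the FFT; abnormal-beat rejection is omitted, and intervals are in seconds rather than milliseconds.

```python
import math

def hrv_features(r_peaks_s, fs_resample=4.0):
    """Compute time- and frequency-domain HRV measures from R-peak times (s)."""
    rr = [b - a for a, b in zip(r_peaks_s, r_peaks_s[1:])]   # R-R intervals (s)
    n = len(rr)
    mrr = sum(rr) / n
    sdnn = math.sqrt(sum((x - mrr) ** 2 for x in rr) / n)
    rmssd = math.sqrt(sum((b - a) ** 2 for a, b in zip(rr, rr[1:])) / (n - 1))

    # Evenly resample the RR series by linear interpolation (for the DFT);
    # each interval is timestamped at the R peak that ends it.
    t_rr = r_peaks_s[1:]
    m = int((t_rr[-1] - t_rr[0]) * fs_resample)
    ts = [t_rr[0] + k / fs_resample for k in range(m)]
    sig, j = [], 0
    for t in ts:
        while t_rr[j + 1] < t:
            j += 1
        w = (t - t_rr[j]) / (t_rr[j + 1] - t_rr[j])
        sig.append(rr[j] * (1 - w) + rr[j + 1] * w)
    mean = sum(sig) / len(sig)
    sig = [x - mean for x in sig]            # crude detrend: remove the mean

    def band_power(lo, hi):                  # naive DFT power in [lo, hi) Hz
        p, nn = 0.0, len(sig)
        for k in range(1, nn // 2):
            if lo <= k * fs_resample / nn < hi:
                re = sum(x * math.cos(2 * math.pi * k * i / nn) for i, x in enumerate(sig))
                im = sum(x * math.sin(2 * math.pi * k * i / nn) for i, x in enumerate(sig))
                p += re * re + im * im
        return p

    lf, hf = band_power(0.04, 0.15), band_power(0.15, 0.40)
    return {"MRR": mrr, "SDNN": sdnn, "RMSSD": rmssd,
            "LF/HF": lf / hf if hf > 0 else float("nan")}

# Example: 81 R peaks with intervals alternating 0.75 s / 0.85 s.
peaks = [0.0]
for i in range(80):
    peaks.append(peaks[-1] + (0.75 if i % 2 == 0 else 0.85))
feats = hrv_features(peaks)
print(round(feats["SDNN"], 3), round(feats["RMSSD"], 3))  # 0.05 0.1
```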

3.3. Algorithm

To model the responses of an HRV measure to the related musical rhythms, our algorithm comprises 3 steps. First, all possible combinations of the rhythmic features are explored. Second, the values of the combinations are linearly transformed. Finally, the coefficients of the linear transformation are calculated such that the Euclidean metric between the results of step 2 and the related HRV responses is minimal.

The stimuli are the 4 drum loops (rhythms), named R1 to R4 here:

R = {R1, R2, R3, R4}.    (1)

Each rhythm Ri relates to some HRV measure Hi:

Ri → Hi,  i = 1, …, 4.    (2)

For each rhythm, there are three rhythmic features, T, C, and E, named F1 to F3:

F1 = T,  F2 = C,  F3 = E.    (3)

Since a musical stimulus contains multiple combinations and interactions of various features, it is difficult to determine which feature contributes to the perceived emotion [45] or to the physiological responses. Our solution is to consider all possible combinations: the influence of each feature is linear (order 1), absent (order 0), or inverse (order −1). Although higher orders are plausible, order one is the most suitable for a simplified model of the relation between the musical rhythms and their related HRV measures:

Mi = F1i^o1 · F2i^o2 · F3i^o3,  oj ∈ {−1, 0, 1}.    (4)

A linear transformation is necessary because the units of the rhythmic and HRV features are not uniform; all we need are the relative correspondences. Consider

Mi′ = a·Mi + b.    (5)

Thus, for some subject's HRV responses H1 to H4 (e.g., SDNN) to rhythms R1 to R4, some combination of rhythmic features can model the relation if the metric between M′ and H is minimal. The Euclidean metric [27, 103] is employed in this work:

d = sqrt(Σi=1..4 (Mi′ − Hi)²).    (6)

After squaring each side of (6), we can acquire the coefficients of the linear transformation that give the minimum metric by setting the partial derivatives of d² with respect to a and b to zero:

∂(d²)/∂a = 0,    (7)

∂(d²)/∂b = 0.    (8)

3.4. Statistics
3.4.1. Outliers

The judgment and removal of outliers are fundamental in processing experimental data [104]. In our study, the experimental data were partitioned into three groups by the minimum metric described in Section 3.3, and the group with the largest metric was eliminated.

The basic idea is to rank the data by the proposed algorithm and discard the records with the largest metric. There were four groups of data: SDNN (C1), SDNN (C2), LF/HF (C1), and LF/HF (C2). Each group had 22 records, one per participant, and each record had four subrecords for drum loops L1 to L4. First, we applied the algorithm within each group and ranked the 22 records from 1 to 22 by their minimum metric. After summing each participant's ranks across the four groups, the participants were partitioned into three levels: small, medium, and large metric (7, 7, and 8 participants). We defined the predictable class as the small- and medium-metric levels and the nonpredictable class as the large-metric level. The nonpredictable class was excluded from the experimental data. After its removal, the rhythmic features and the four group averages of the predictable class were modeled by our algorithm again.
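The ranking-and-partition step might look like this (a sketch; the metric values are placeholders, and ties are broken by participant order for simplicity):

```python
def partition_by_rank(metrics_by_group, sizes=(7, 7, 8)):
    """metrics_by_group: dict group -> list of per-participant minimum
    metrics (same participant order in every group).  Rank within each
    group, sum ranks per participant, then split into small / medium /
    large classes of the given sizes; the large class is discarded."""
    n = len(next(iter(metrics_by_group.values())))
    rank_sum = [0] * n
    for vals in metrics_by_group.values():
        order = sorted(range(n), key=lambda i: vals[i])
        for rank, i in enumerate(order, start=1):
            rank_sum[i] += rank
    by_total = sorted(range(n), key=lambda i: rank_sum[i])
    small = by_total[:sizes[0]]
    medium = by_total[sizes[0]:sizes[0] + sizes[1]]
    large = by_total[sizes[0] + sizes[1]:]
    return sorted(small + medium), sorted(large)   # predictable, excluded

# Toy run with 3 participants and class sizes (1, 1, 1):
groups = {"SDNN_C1": [3, 1, 2], "SDNN_C2": [2, 1, 3],
          "LFHF_C1": [3, 2, 1], "LFHF_C2": [3, 1, 2]}
keep, drop = partition_by_rank(groups, sizes=(1, 1, 1))
print(keep, drop)  # [1, 2] [0]
```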

3.4.2. ANOVA and t-Test

Repeated measurements have four major advantages: they capture individual patterns of change, require fewer subjects, let subjects serve as their own controls, and improve reliability; the disadvantages are the complication caused by dependence among repeated observations and less control over the circumstances [105]. Repeated measurements were employed in our experiment.

Two basic methods of ANOVA are one-way between-groups and one-way within-groups. A more complicated method is the two-way factorial ANOVA with one between-groups and one within-groups factor [106]. The repeated-measures ANOVA can be considered a special case of two-way ANOVA [107]; we follow the standard formulation with one within-groups factor and one between-groups factor [106].

For each HRV measure, there are two ANOVA tables, with the participant as the between-groups factor. If the epoch is fixed, the within-groups factor is the rhythm; if the rhythm is fixed, it is the epoch. The Tukey honest significant difference test was then used to acquire pair-by-pair comparisons [108] for each between-groups pair.

Finally, pairwise t-tests [109, 110] were applied to C1 and C2 to determine whether the HRV responses were influenced by particular rhythms.
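A paired t statistic for such a during-versus-after comparison can be computed directly; with n − 1 degrees of freedom, the p value is then read from a t table or a stats library. The sample numbers below are placeholders, not the paper's measurements.

```python
import math

def paired_t(x, y):
    """Paired t statistic for matched samples x (e.g., LF/HF during
    the loop) and y (e.g., LF/HF after it)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in d) / (n - 1))  # sample SD
    return mean / (sd / math.sqrt(n)), n - 1   # (t, degrees of freedom)

t, df = paired_t([2.1, 1.8, 2.5, 2.2], [1.6, 1.5, 1.9, 1.7])
print(round(t, 3), df)  # 7.55 3
```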

4. Results

Table 2 collects the musical features, model factors, and HRV data of this study. Valence C1 and arousal C1 reveal how the rhythmic features influenced the HRV responses while listening to music, and valence C2 and arousal C2 reveal how they influenced the HRV responses after listening. The values of the fitted combinations of rhythmic and HRV features are illustrated in Figures 3(a) and 3(b), and (9) to (12) state the relationships.

Fast tempo enhanced SDNN; that is, people prefer faster tempi:

SDNN(C1) = a1·T + b1.    (9)

High intensity and low complexity enhanced LF/HF, the physiological arousal:

LF/HF(C1) = a2·(E/C) + b2.    (10)

People prefer faster tempi; however, the sound should not be too loud if the effect is to remain after the music stops:

SDNN(C2) = a3·(T/E) + b3.    (11)

If fast and loud music stops, people feel relaxed:

LF/HF(C2) = a4/(T·E) + b4.    (12)

Table 3 gives the statistical results. The SDNN (C2) of the four rhythms reached a significance value of 0.01. The LF/HF of L3 reached a significance value of 0.03 between during and after listening to the music.

5. Discussion

5.1. Advantages of the Proposed Model

There are four major advantages of this work. First, psychological music models are many [2, 25–45], but physiological music models are rare [3]. Second, psychological music models [2, 25–45] and physiological music experiments [1, 111] are many, but studies of psychological rhythm models [26] and physiological rhythm experiments [112] are rare. Third, almost all studies concern the responses during the music, while discussions of the responses after the music are limited [113]. Finally, there are many HRV studies of music [1, 111], but no model; ours is the first.

5.2. Comparison of the Models

There are six levels of musical model illustrated in Figure 4: (a) the model from the results of the psychological experiments, (b) the model of the perceived emotion, (c) the model of the felt emotion, (d) the model from the physiological results, (e) the proposed physiological valence/arousal model while listening to the musical rhythms, and (f) the proposed physiological valence/arousal model after listening to the musical rhythms.

The six levels of model need not reach a consensus, for four reasons. First, the perceived emotion is not always the same as the felt emotion [10, 54]; the two are closer if the music clips are chosen by the participants themselves rather than by others. Second, some measured physiological responses may not appear in self-reports [55]. Third, if the music clips are simplified to bare rhythm or tempo, the responses change [112]. Finally, the responses during and after listening to music obviously need not be the same.

Figure 4(a) shows that T and E dominate arousal and C dominates valence in a survey of psychological experiments (pp. 383–392) [1]. The effects of T and E on arousal are clear. As for valence, regular and varied rhythms derive positive emotions and irregular rhythms derive negative emotions (pp. 383–392) [1]; hence valence is negatively correlated with C.

Figure 4(b) shows that T [26, 31, 45] and E [26, 27, 31, 45] dominate arousal and T [27] and C [27] dominate valence in the perceived models. The arousal results of the perceived models agree with the psychological experiments, but T and C are viewed as the valence-based features; whether they are positively correlated with valence has not been highlighted in the paper [27].

Figure 4(c) shows that T [35] and E [35] dominate arousal and T [35] dominates valence in the felt model. The results of the felt and perceived models are similar, but the valence-based feature is limited to T in the felt model [35].

Figure 4(d) shows that T [55, 113, 114] and E [114] dominate arousal and T [55] dominates valence in the physiological experiments. The psychological studies (Figures 4(a), 4(b), and 4(c)) and the physiological studies (Figure 4(d)) agree about arousal, but faster tempi are found to decrease SDNN, the physiological valence [55].

Figure 4(e) shows that E and C dominate arousal and T dominates valence: E is directly proportional to arousal and C is inversely proportional to arousal. As for valence, faster tempi increase SDNN, the physiological valence. This relation is also illustrated in the left part of Figure 3(a).

Figure 4(f) shows that T and E dominate both arousal and valence after the stimuli. The arousal is negatively correlated with T and E; hence if a fast and loud rhythm is removed, the subjects feel relaxed. Fast tempi increase valence in Figure 4(e), the responses during music; this effect remains after the music stops if the intensity of the music is low.

In the valence perspective, T and C are considered the two major rhythmic parameters [27]. As for T, there exist opposite comments: T is positively correlated with valence in the felt model [35] but negatively correlated with physiological valence, since faster tempo decreased SDNN [55]. The proper conclusion is that the role of T depends on the context: both slow and fast tempi can derive different emotions (pp. 383–392) [1]; for example, both happy and angry music have fast tempi. As for C, its contribution is not defined in the perceived model [27]. Although C is negatively correlated with valence in the psychological experiments (pp. 383–392) [1], the same definition has opposite results: the firm rhythm derives positive valence (pp. 383–392) [1] and the firm rhythm derives negative valence [31]. Instead of music, if only the rhythm is considered, higher SDNN is observed with faster tempi in our work, as Figure 4(e) illustrates; T and E affect the responses of physiological valence after the music stops, as Figure 4(f) illustrates.

In the arousal perspective, T and E are two of the most dominant features. Our findings also show that when a fast, loud sound (drum loop 3) was removed, LF/HF decreased, as illustrated in the right part of Figure 3(b). Although T is usually considered the most important factor among all features (pp. 383–392) [1], another study indicates that E is more important than T for arousal [45]; our findings support this opinion, as Figure 4(e) illustrates. C is considered a valence parameter in Figures 4(a) and 4(b), but our results show that simple rhythms enhanced the physiological arousal.

5.3. Statistical Results

Table 3 collects all significance values, and there are two significant results: the ANOVA of SDNN (C2) for L1~L4 and the t-test of LF/HF (C1, C2) for L3. Although the significance value exceeds 0.05, the left part of Figure 3(a) still shows that SDNN, the physiological valence, is positively correlated with the tempi of the drum loops. The right part of Figure 3(a) shows that SDNN is positively correlated with T/E; both L2 and L4 have faster tempi and smaller energies. The left part of Figure 3(b) shows that LF/HF, the physiological arousal, is positively correlated with E/C; L3, a fast drum loop of very simple complexity, enhances arousal to the largest degree. The right part of Figure 3(b) shows that after the music stops, fast and loud rhythms leave the participants relaxed. Although there is no statistical significance in Figure 3(b), the t-test result for L3, during versus after listening to the clip, reaches statistical significance (p = 0.03).

5.4. Limitations
5.4.1. The Three Rhythmic Parameters

Of the three elementary rhythmic features, tempo is the most definite [92]. T is positively correlated with SDNN, the physiological valence (Figure 4(e)); conversely, T has been found negatively correlated with SDNN [55]. To integrate these two opposite findings, the psychological perspectives (Figures 4(a), 4(b), and 4(c)) offer the more suitable explanation: T is a feature of arousal, while both positive and negative valence can be measured, depending on the context (pp. 383–392) [1].

To measure the complexity of a music or rhythm clip, mathematical quantities are better than adjectives; for example, firm rhythms can derive both positive (pp. 383–392) [1] and negative [31] valence. We can acquire a number from a Likert scale [93], but it is not proper as an engineering reference. Of the mathematical studies of rhythmic complexity [115–117], oddity [117] is the method closest to measuring perceptual complexity. There is a more fundamental issue: two rhythms with different responses may have the same complexity measure.

The last parameter, the total energy of a waveform, is defined as E = Σn x[n]², the summation of the squares of x[n], where x[n] is the amplitude of the signal [94, 95]. But the intensity within the waveform need not be fixed: large or small variations suggest fear or happiness, and rapid or few changes in intensity may be associated with pleading or sadness (pp. 383–392) [1]. If the waveform is partitioned into small segments [118], the energy feature becomes not only an arousal parameter but also a valence parameter (pp. 383–392) [1], and the analytic resolution of musical emotion increases.

5.4.2. Nonlinearity of the Responses

The linear and inverse functions are simple, intuitive, and useful for a general overview. However, some studies suggest that the liking (or valence) of music is an inverted-U curve, depending on arousal (tempo and energy) [119–121] and complexity [119, 120]. The inverted-U phenomenon may disappear with professional musical expertise [122]. There is also a plausible transform of the inverted-U curve: although 120 bpm may be the preferred tempo for common people [123], a twin-peak curve was found in both a popular music database and physiological experiments [124]. Finally, more complex curves are possible [3].

5.5. The Modified Model

T and E correlate with arousal, as Figures 4(a)–4(d) and Figure 4(f) illustrate, but E dominates the arousal [45], as Figure 4(e) shows. To integrate these models, let Ē be the average energy of all beats, so that E is proportional to the product of T and Ē; then (9)–(12) can be modified into (13)–(16). This provides an advanced aspect.

The valence of the musical responses stays the same:

SDNN(C1) = a1·T + b1.    (13)

Both tempo and energy are now arousal parameters; moreover, our model demonstrates that the complexity of rhythm is also an arousal parameter:

LF/HF(C1) = a2·(T·Ē/C) + b2.    (14)

The valence after the musical stimuli is influenced by the energy:

SDNN(C2) = a3/Ē + b3.    (15)

Finally, both tempo and energy influence the arousal after the musical stimuli, but tempo dominates the effect:

LF/HF(C2) = a4/(T²·Ē) + b4.    (16)

Integrating all of the psychological and physiological music theories, we summarize briefly that tempo, energy, and complexity are both valence and arousal parameters. The inverted-U curve [119–121] is suggested for valence. Considering the valence after music, people prefer music of low intensity. For the responses after the musical stimuli, arousal correlates negatively with tempo and energy, with tempo the dominant parameter.

5.6. Criteria for Good Models

There are four criteria for a valuable model [22]: generalization (goodness-of-fit [21]), predictive power, simplicity, and the relation to existing theories. In plain words, a model should be able to explain the experimental data used to build it and, for verification, data not used to build it. The number of parameters should be small relative to the amount of data. The model should be able to unify formerly unrelated theories, helping us understand the underlying phenomenon instead of serving as an input-output routine.

Our proposed model satisfies generalization, simplicity, and the relation to other theories. In our study, the selected model has the least metric (error) among all possible models. It has only three parameters (T, C, and E). And it fills the gap between the psychological and physiological experiments and the models of music and rhythm. The proposed model does not yet have sufficient predictive power: it was built for the subjects whose responses could be modeled; the nonpredictable class, one-third of all subjects, was excluded; and there was no test data in this study. However, the model derived from the experimental data integrates well with the literature of the field, as Section 5.5 mentioned.

6. Conclusion and Future Work

A novel experiment, a novel algorithm, and a novel model have been proposed in this study. The experiment explored how rhythm, the fundamental feature of music, influences the ANS and HRV; the algorithm derived the relationships between the musical and physiological features; and the model demonstrated how the rhythmic features map onto the physiological valence/arousal plane.

The equations in our model could serve as rules for music generation systems [125–127]. The model could also be fine-tuned by considering rhythmic oddity [117] (for complexity), small segments [118] (for energy), and the various response curves [3, 119–121, 124].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgments

The authors thank Mr. Shih-Hsiang Lin heartily for his support in both program design and data analysis. The authors also thank Professor Shao-Yi Chien and Dr. Ming-Yie Jan for related discussions.