Abstract

This paper studies musical opus from the point of view of three mathematical tools: entropy, pseudo phase plane (PPP), and multidimensional scaling (MDS). The experiments analyze ten sets of different musical styles. First, for each musical composition, the PPP is produced using the time series lags captured by the average mutual information. Second, to unravel hidden relationships between the musical styles the MDS technique is used. The MDS is calculated based on two alternative metrics obtained from the PPP, namely, the average mutual information and the fractal dimension. The results reveal significant differences in the musical styles, demonstrating the feasibility of the proposed strategy and motivating further developments towards a dynamical analysis of musical sounds.

1. Introduction

For many centuries, philosophers, music composers, and mathematicians worked intensively to find mathematical formulae that could explain the process of music creation. As a matter of fact, music and mathematics are intricately related: strings vibrate at certain frequencies and sound waves can be described by mathematical equations. Although it does not seem possible to find an expression that models musical works, it is recognized that there are certain inherent mathematical structures in all types of music. Throughout the history of music, formal techniques for melody composition have been proposed, claiming that musical pieces can be created by applying certain rules to some given initial material [1–12]. More recently, the growth of computing power made it possible to generate music automatically.

The concept of entropy was introduced in the field of thermodynamics by Clausius (1862) and Boltzmann (1896) and was later applied by Shannon (1948) and Jaynes (1957) in information theory [13–15]. More recently, more general entropy measures were proposed, allowing the relaxation of the additivity axiom for application in several types of complex systems [16–24]. These novel ideas are currently undergoing considerable development and open up promising perspectives.

The pseudo phase space (PPS) is used to analyze signals with nonlinear behavior. For the two-dimensional case it is called pseudo phase plane (PPP) [25–27]. To reconstruct the PPS it is necessary to find the adequate time lag between the signal and one delayed image of the original signal. To determine the proper lag (or time delay) the mutual information concept is often used.

Multidimensional Scaling (MDS) has its origins in psychometrics and psychophysics, where it is used as a tool for perceptual and cognitive modeling. Since its beginning, MDS has been applied in many fields, such as psychology, sociology, anthropology, economics, and educational research. In the last decades this technique has also been applied in other areas, including computational chemistry [28], machine learning [29], concept maps [30], and wireless network sensors [31].

Bearing these facts in mind, the present study combines the referred concepts and is organized as follows. Section 2 introduces a brief description of the fundamental concepts. Section 3 formulates and develops the musical study through several entropy measures and MDS analysis. Finally, Section 4 outlines the main conclusions.

2. Fundamental Concepts

This section presents the main tools adopted in this study, namely, the musical signals, the PPP, the fractal dimension, and the MDS.

2.1. Musical Sounds

In the context of this study, a musical work is a set of one or more time-sequenced digital data streams, representing a certain time sampling of the original musical source. For all musical objects, the original data streams result from sampling at 44 kHz, subsequently converted to a single (mono-) digital data series, each sample being a 32-bit signed floating value.

These sounds have a strong variability, which makes their direct comparison in the time domain difficult. In this line of thought, several tests were developed to obtain methods that establish a compromise between smoothing the high signal variability and handling the rhythm and style time evolution that are the essence of each composition. The Shannon entropy $S$ of the signals is shown to be an appropriate method:

$$S = -\sum_{x \in X} p(x) \log p(x), \tag{2.1}$$

where $X$ is the set of all possible events and $p(x)$ is the probability that event $x$ occurs, so that $\sum_{x \in X} p(x) = 1$.

For a bidimensional random variable $(X, Y)$ the joint entropy becomes

$$S(X, Y) = -\sum_{x \in X} \sum_{y \in Y} p(x, y) \log p(x, y). \tag{2.2}$$
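As a minimal sketch of the two entropy definitions above, the probabilities can be estimated from relative frequencies of discrete symbols. The function names and the use of the natural logarithm are assumptions made for illustration only; the paper does not state how its implementation was written.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy (in nats) of a sequence of discrete symbols,
    with probabilities estimated as relative frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def joint_entropy(xs, ys):
    """Joint entropy (in nats) of two equally long discrete sequences,
    obtained by treating each pair (x, y) as a single event."""
    return shannon_entropy(list(zip(xs, ys)))

# Two equiprobable events give ln 2; a constant sequence gives zero.
h_fair = shannon_entropy([0, 1, 0, 1])
h_const = shannon_entropy([1, 1, 1, 1])
```

For continuous amplitudes, the samples would first have to be binned into a histogram before these functions apply.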

2.2. Pseudo Phase Plane

The PPS is used to analyze signals with nonlinear behavior. The proper time lag $\tau$, for the delay measurements, and the adequate dimension $d$ of the space must be determined in order to achieve the phase space. In the PPS the measurement $s(t)$ forms the pseudo vector $y(t)$ according to

$$y(t) = \left[ s(t),\ s(t + \tau),\ \ldots,\ s(t + (d - 1)\tau) \right]. \tag{2.3}$$

The vector $y(t)$ can be plotted in a $d$-dimensional space, forming a curve in the PPS. If $d = 2$ we have a two-dimensional space and, therefore, the PPP is a particular case of the PPS technique.

The procedure of choosing a sufficiently large $d$ is formally known as embedding, and any dimension that works is called an embedding dimension $d_E$. The number of measurements $d$ should provide a phase space dimension in which the geometrical structure of the plotted PPS is completely unfolded and where there are no hidden points in the resulting plot.
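The construction of the pseudo vectors can be sketched as follows. This is an illustrative helper, not the authors' code: the name `delay_embed` and the convention of measuring the lag in samples are assumptions.

```python
def delay_embed(signal, lag, dim=2):
    """Return the delay vectors (s[t], s[t+lag], ..., s[t+(dim-1)*lag]).

    `lag` is the time lag expressed in samples and `dim` is the
    dimension of the pseudo phase space; dim=2 gives the PPP.
    """
    n = len(signal) - (dim - 1) * lag  # number of complete vectors
    return [tuple(signal[t + k * lag] for k in range(dim)) for t in range(n)]

# PPP points for a short ramp with a lag of two samples.
ppp = delay_embed([0, 1, 2, 3, 4], lag=2)
```

Plotting the first component of each tuple against the second yields the PPP curve.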

Among others [26], the method of delays is the most common method for reconstructing the phase space. Several techniques have been proposed to choose an appropriate time delay [27]. One line of thought is to choose $\tau$ based on the correlation of the time series with its delayed image. The difficulty of correlation in dealing with nonlinear relations leads to the use of the mutual information. This concept, from information theory [32], recognizes the nonlinear properties of the series and measures their dependence. The average mutual information for the two series of variables $s(t)$ and $s(t + \tau)$ is given by

$$I(\tau) = \iint F_1\{s(t), s(t + \tau)\} \log \frac{F_1\{s(t), s(t + \tau)\}}{F_2\{s(t)\}\, F_3\{s(t + \tau)\}} \, ds(t)\, ds(t + \tau), \tag{2.4}$$

where $F_1$ is a bidimensional probability density function and $F_2$ and $F_3$ are the marginal probability distributions of the two series $s(t)$ and $s(t + \tau)$, respectively.

The index $I$ allows us to obtain the time lag required to construct the pseudo phase space. To find the best value of the delay, $I$ is computed for a range of delays and the first minimum is chosen. Usually $I$ is referred to [25–27] as the preferred alternative for selecting the proper time delay $\tau$.
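A histogram-based estimate of the average mutual information, together with the first-minimum rule, might look like the sketch below. The number of bins, the base-2 logarithm, the test signal, and all function names are assumptions chosen for illustration; the paper does not specify them.

```python
import math
from collections import Counter

def quantize(signal, bins):
    """Map each sample to an integer bin index in [0, bins - 1]."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for constant signals
    return [min(int((s - lo) / span * bins), bins - 1) for s in signal]

def average_mutual_information(signal, lag, bins=8):
    """Estimate I(lag) in bits between s(t) and s(t + lag), lag >= 1."""
    q = quantize(signal, bins)
    x, y = q[:-lag], q[lag:]
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    # p(a,b) / (p(a) p(b)) simplifies to c * n / (count_a * count_b)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def first_local_minimum(values):
    """Index of the first interior local minimum, or None if there is none."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None

# Hypothetical test signal: a sine wave with a period of 40 samples.
signal = [math.sin(2 * math.pi * t / 40) for t in range(400)]
lags = list(range(1, 21))
ami = [average_mutual_information(signal, lag) for lag in lags]
k = first_local_minimum(ami)
best_lag = lags[k] if k is not None else None
```

The estimate depends on the bin count, so in practice the location of the first minimum should be checked for a few different values of `bins`.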

2.3. Fractal Dimension

The fractal dimension is a quantity that gives an indication of how completely a spatial representation appears to fill space. There are many specific methods to compute the fractal dimension [33, 34]. The most popular methods are the Hausdorff and box-counting dimensions. Here the box-counting dimension method is used due to its simplicity of implementation and is defined as

$$D = \lim_{\epsilon \to 0} \frac{\ln N(\epsilon)}{\ln (1/\epsilon)}, \tag{2.5}$$

where $N(\epsilon)$ represents the minimal number of covering cells (e.g., boxes) of size $\epsilon$ required to cover the set analyzed. The slope on a plot of $\ln N(\epsilon)$ versus $\ln (1/\epsilon)$ provides an estimate of the fractal dimension.
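The slope-based estimate can be sketched for a set of planar points, such as the points of a PPP. The grid alignment at the origin, the choice of box sizes, and the ordinary least-squares fit are assumptions for illustration.

```python
import math

def box_count(points, eps):
    """Number of grid cells of side eps occupied by at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})

def box_counting_dimension(points, sizes):
    """Least-squares slope of ln N(eps) versus ln(1/eps)."""
    xs = [math.log(1.0 / e) for e in sizes]
    ys = [math.log(box_count(points, e)) for e in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Sanity checks on sets of known dimension: a line segment and a filled square.
line = [(i / 1024, 0.0) for i in range(1024)]
square = [(i / 32, j / 32) for i in range(32) for j in range(32)]
d_line = box_counting_dimension(line, [1 / 4, 1 / 8, 1 / 16, 1 / 32])
d_square = box_counting_dimension(square, [1 / 4, 1 / 8, 1 / 16])
```

The box sizes must stay well above the spacing between points; otherwise the count saturates at the number of points and the slope is underestimated.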

2.4. Multidimensional Scaling

MDS is a generic name for a family of algorithms that construct a configuration of points in a low-dimensional space from information about interpoint distances measured in high-dimensional space. The new geometrical configuration of points, preserving the proximities of the high-dimensional space, facilitates the perception of the underlying structure of the data and often makes it much easier to analyze. The problem addressed by MDS can be stated as follows: given $n$ items in an $m$-dimensional space and an $n \times n$ matrix of proximity measures among the items, MDS produces a $p$-dimensional configuration $\mathbf{X}$, $p \le m$, representing the items such that the distances among the points in the new space reflect, with some degree of fidelity, the proximities in the data. The proximity measures the closeness (in MDS terms usually referred to as similarities) among the items and, in general, it is a distance measure: the more similar two items are, the smaller their distance is.

The Minkowski distance metric provides a general way to specify distance for quantitative data in a multidimensional space:

$$d_{ij} = \left( \sum_{k=1}^{m} w_k \left| x_{ik} - x_{jk} \right|^{r} \right)^{1/r}, \tag{2.6}$$

where $m$ is the number of dimensions, $x_{ik}$ is the value of the $k$th component of object $i$, and $w_k$ is a weight factor.

For $w_k = 1$, if $r = 2$ then (2.6) yields the Euclidean distance, and if $r = 1$ then it leads to the city-block (or Manhattan) distance. In practice, the Euclidean distance is generally used, but there are several other definitions that can be applied, including for binary data [35].
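A direct transcription of the weighted Minkowski distance is sketched below. The function name and default arguments are illustrative assumptions; unit weights are used when none are given.

```python
def minkowski(a, b, r=2.0, weights=None):
    """Weighted Minkowski distance of order r between two points.

    With unit weights, r=2 gives the Euclidean distance and r=1 the
    city-block (Manhattan) distance.
    """
    if weights is None:
        weights = [1.0] * len(a)
    total = sum(w * abs(x - y) ** r for w, x, y in zip(weights, a, b))
    return total ** (1.0 / r)

euclidean = minkowski((0, 0), (3, 4))        # 3-4-5 right triangle
city_block = minkowski((0, 0), (3, 4), r=1)  # |3| + |4|
```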

Typically MDS is used to transform the data into two or three dimensions for visualizing the result and uncovering the hidden structure of the data, but any dimension $p$ is possible. Some authors use a rule of thumb to determine the maximum value of $p$, which is to ensure that there are at least twice as many pairs of items as the number of parameters to be estimated. For $n$ items there are $n(n-1)/2$ pairs and $np$ coordinates, resulting in $n - 1 \ge 4p$ [36]. The geometrical representation obtained with MDS is indeterminate with respect to translation, rotation, and reflection [37].

There are two forms of MDS, namely, the metric MDS and the nonmetric MDS. The metric MDS uses the actual values of dissimilarities, while nonmetric MDS effectively uses only their ranks [38, 39]. Metric MDS assumes that the dissimilarities $\delta_{ij}$ calculated in the original $m$-dimensional data and the distances $d_{ij}$ in the $p$-dimensional space are related as follows:

$$d_{ij} \approx f(\delta_{ij}), \tag{2.7}$$

where $f$ is a continuous monotonic function. Metric (scaling) refers to the type of transformation $f$ of the dissimilarities, and its form determines the MDS model. If $d_{ij} = \delta_{ij}$ (that is, $f$ is the identity) and a Euclidean distance is used, then we obtain the classical (metric) MDS.

In metric MDS the dissimilarities between all objects are known numbers and they are approximated by distances. Therefore, objects are mapped into a low-dimensional space, and distances are calculated and compared with the dissimilarities. Then objects are moved in such a way that the fit becomes better, until an objective function (called stress function in the context of MDS) is minimized.
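One well-known way to carry out this iterative repositioning is stress majorization (the SMACOF algorithm with its Guttman transform). The sketch below uses it purely as an illustration; it is not stated in the paper which optimizer was used, and the raw (unnormalized) stress, the function names, and the random start are assumptions.

```python
import math
import random

def pairwise(X):
    """Matrix of Euclidean distances between the rows of X."""
    return [[math.dist(a, b) for b in X] for a in X]

def raw_stress(delta, X):
    """Sum of squared differences between dissimilarities and distances."""
    D, n = pairwise(X), len(X)
    return sum((delta[i][j] - D[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def smacof_step(delta, X):
    """One Guttman transform with unit weights; never increases raw stress."""
    n, p = len(X), len(X[0])
    D = pairwise(X)
    new = []
    for i in range(n):
        row = [0.0] * p
        for j in range(n):
            if i != j and D[i][j] > 0:
                ratio = delta[i][j] / D[i][j]
                for k in range(p):
                    row[k] += ratio * (X[i][k] - X[j][k])
        new.append([v / n for v in row])
    return new

# Hypothetical example: recover a unit square from its distance matrix.
delta = pairwise([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
rng = random.Random(0)  # fixed seed so the run is repeatable
X = [[rng.random(), rng.random()] for _ in range(4)]
history = [raw_stress(delta, X)]
for _ in range(50):
    X = smacof_step(delta, X)
    history.append(raw_stress(delta, X))
```

The recovered configuration is only defined up to translation, rotation, and reflection, so it should be compared with the original through its interpoint distances rather than its coordinates.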

In nonmetric MDS, the metric properties of the transformation $f$ are relaxed, but the rank order of the dissimilarities must be preserved. The transformation function obeys the monotonicity constraint $\delta_{ij} < \delta_{rs} \Rightarrow f(\delta_{ij}) \le f(\delta_{rs})$ for all objects. The advantage of nonmetric MDS is that no assumptions need to be made about the underlying transformation function $f$. Therefore, it can be used in situations where only the rank order of dissimilarities is known (ordinal data). Additionally, it can be used in cases in which the information is incomplete. In such cases, the configuration is constructed from a subset of the distances and, at the same time, the other (missing) distances are estimated by monotonic regression. In nonmetric MDS it is assumed that $d_{ij} \approx f(\delta_{ij})$ and, therefore, the values $f(\delta_{ij})$ are often referred to as the disparities $\hat{d}_{ij}$ [40–42], in contrast to the original dissimilarities $\delta_{ij}$, on one hand, and the distances $d_{ij}$ of the configuration space, on the other hand. In this context, the disparity $\hat{d}_{ij}$ is a measure of how well the distance $d_{ij}$ matches the dissimilarity $\delta_{ij}$.

With further developments over the years, MDS techniques are commonly classified according to the type of data to analyze. From this point of view, the techniques are embedded into the following MDS categories [35, 42]: (i) one-way versus multiway: in $K$-way MDS each pair of objects has $K$ dissimilarity measures from different replications (e.g., repeated measures); (ii) one-mode versus multimode: similar to (i) but the $K$ dissimilarities are qualitatively different (e.g., distinct experimental conditions).

There is no rigorous statistical method to evaluate the quality and the reliability of the results obtained by an MDS analysis. However, there are two methods often used for that purpose: the Shepard plot and the stress. The Shepard plot is a scatter plot of the dissimilarities and disparities against the distances, usually overlaid with a line having unitary slope. The plot provides a qualitative evaluation of the goodness of fit. On the other hand, the stress value gives a quantitative evaluation. Additionally, the stress plotted as a function of dimensionality (known as scree plot) can be used to estimate the adequate $p$-dimension. When the curve ceases to decrease significantly, the resulting “elbow” may correspond to a substantial improvement in fit.

Beyond the aspects referred before, there are other developments of MDS that include Procrustean methods, individual differences models (also known as three-way models), and constrained configuration.

In the Procrustean methods the data is analyzed by scaling each replication separately and then comparing or aggregating the different MDS solutions. The individual differences models scale a set of dissimilarity matrices into only one MDS solution. The procedure of constraints on the configuration (which Borg and Groenen called “confirmatory MDS” [43]) is used when the researcher has some substantive underlying theory regarding a decomposition of the dissimilarities and, consequently, tries to restrain the configuration space.

3. Study of Musical Sounds

This section develops the musical study using entropy applied to a large sample of representative musical works. Once having the entropy measurements, the corresponding time lags and the PPP are calculated. Finally, an MDS analysis is performed using two alternative criteria, namely, based on mutual information and fractal dimension.

3.1. Entropy Analysis of Musical Compositions

To calculate the entropy, a rectangular window of duration $T$ slides over time, capturing a limited part of the signal evolution. Each new window overlaps 50% with the previous one. For the signal captured in each window, a histogram of the relative frequency of amplitudes is obtained and the entropy $S$ is calculated. Several experiments demonstrated that the adopted window width represented a good compromise between the original signal’s sampling period (tens of microseconds) and the musical piece’s duration (hundreds of seconds).
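The sliding-window procedure can be sketched as follows. The number of histogram bins, the amplitude range of [-1, 1], and the function names are assumptions made for illustration; the window length is given in samples and is left to the caller.

```python
import math

def window_entropy(samples, bins=16, lo=-1.0, hi=1.0):
    """Shannon entropy (nats) of the amplitude histogram of one window."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        k = min(max(int((s - lo) / width), 0), bins - 1)  # clamp to valid bins
        counts[k] += 1
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts if c)

def sliding_entropy(signal, window, bins=16):
    """Entropy of each rectangular window; consecutive windows overlap 50%."""
    hop = window // 2  # half-window step gives the 50% overlap (window >= 2)
    return [window_entropy(signal[i:i + window], bins)
            for i in range(0, len(signal) - window + 1, hop)]

# Toy signal: silence followed by an alternating pattern.
toy = [0.0] * 4 + [-0.9, 0.9, -0.9, 0.9]
curve = sliding_entropy(toy, window=4)
```

Each element of `curve` corresponds to one window position, so the list can be plotted against time to obtain an entropy-versus-time curve.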

Figure 1 shows the evolution of several musical sounds viewed through the entropy versus time for the sliding window described above. The entropy curves represent four different compositions, namely, The Beatles: “Yellow Submarine,” Ella Fitzgerald: “Night and Day,” Mozart: “KV527 Minuet Don Giovanni,” and Stevie Wonder: “For Your Love.”

3.2. Pseudo Phase Plane of Entropy Curves from Musical Compositions

Having established the concept of time evolution of the entropy measure for musical compositions, the question of how the entropies of compositions of different “types” are interrelated was investigated. Several music titles from different “types” were selected: “Classical” (49 titles), “Easy,” “Electro,” “Jazz,” “Brazilian Music,” “Portuguese Music,” “Pop and Rock,” “Rhythm Blues,” “Reggae,” and “Slow Rock.” These samples lead to a population of 426 music titles.

For each signal derived from the 426 compositions, the average mutual information was calculated. For example, Figure 2 shows the average mutual information versus lag of four musical compositions: The Beatles: “Yellow Submarine,” Ella Fitzgerald: “Night and Day,” Mozart: “KV527 Minuet Don Giovanni,” and Stevie Wonder: “For Your Love.” For each composition, the minimum of the average mutual information and the corresponding delay are read from these curves. To reconstruct the PPP, the first minimum of $I$ was considered. The corresponding PPPs are represented in Figure 3.

Usually the time lag is calculated only for the PPP reconstruction. However, the time lag represents a “memory” of previous parts of the time series and, therefore, this information is related to the fractional dynamics embedded in the music [44–46]. Consequently, the minimum of the average mutual information and the characteristics of the PPP chart obtained for the corresponding lag are important details to be included in the MDS maps to be formulated in the next subsection.

3.3. Multidimensional Scaling Analysis of Musical Compositions

In order to reveal hypothetical relationships between the musical compositions the MDS technique is used. Two alternative metrics to compare objects $i$ and $j$ were adopted, defined in (3.1) and (3.2). Both involve the total number of musical compositions; the metric defined in (3.1) is based on the minimum of the average mutual information $I$, and the metric defined in (3.2) is based on the fractal dimension of the reconstructed PPP.

For each of the two indices a symmetrical matrix with 1’s in the main diagonal was calculated and the MDS maps obtained.

Figure 4(a) shows the locus of the classic compositions obtained by MDS using the metric defined in (3.1) for the dimension $p = 3$. The locus obtained with this exponential type of metric forms a curve. Due to space limitations we are only depicting the locus obtained for some individual types of music. The tests developed show that each type of music occupies a certain segment in the curve obtained for all the musical compositions (Figure 4(b)). Figures 4(c) and 4(d) depict two tests computed to evaluate the consistency of the results obtained by MDS analysis. The Shepard plot (Figure 4(c)) shows the fitting of the 3D configuration distances to the dissimilarities. The value of the stress function versus the dimension is shown in Figure 4(d), which allows the estimation of the adequate $p$-dimension. An “elbow” occurs at dimension two for a low value of stress, which corresponds to a significant improvement in fit. From the scree plot it can be concluded that the improvement obtained by increasing the $p$-dimension from $p = 2$ to $p = 3$ is very low. Therefore, the 2D MDS configuration is appropriate.

In this line of thought, Figures 5(a)–5(c) show the 2D locus for the Classic, Pop and Rock, and Reggae types of music, respectively. The Classic music compositions (Figure 5(a)) occupy a segment of approximately 80% of the curve obtained for all the musical compositions tested (Figure 5(d)). This segment begins near one end of the curve. The Pop and Rock music is located over a segment of approximately 80% of the curve beginning near the other end (Figure 5(b)). Therefore, approximately 60% of the positions for these two types of music are superimposed in the center of the curve. For the Pop and Rock most of the positions are concentrated in the half of the segment positioned at the opposite side of the Classic music. The Reggae music compositions are located over a limited zone near the center of the curve (Figure 5(c)). Figure 5(d) shows the curve obtained for the 426 musical titles tested. The Jazz zone is centered approximately in the middle of the curve and corresponds to the superimposed zone of the Classic and the Pop and Rock. The Rhythm Blues titles are located approximately in the same zone as that corresponding to the Reggae. The Slow Rock and the Electro types occupy approximately the same segment that corresponds to the Classic music, although in a scattered way near the end of the curve. The Easy type occupies a shorter segment than the one occupied by the Slow Rock and the Electro. Finally, the Brazilian and the Portuguese compositions occupy a segment that corresponds approximately to the Reggae one, but with a slight shift to the side of the Classic music. The shift is more pronounced for the case of the Portuguese music.

Figure 6 depicts the Shepard plot that confirms the good fitting of the 2D configuration distances to the dissimilarities.

Figure 7 shows the locus of the musical compositions obtained by MDS using the metric defined in (3.2). Figures 7(a)–7(c) show the locus for the Classic, Pop and Rock, and Reggae types of music, respectively. The Classic music compositions form a segment located at one end of the curve (Figure 7(a)). The Pop and Rock musical opus occupies most of the curve in a scattered way, but with a slight superimposition over the Classic (Figure 7(b)). The Reggae music compositions are located on a limited zone superimposed over the Classic and the Pop and Rock compositions (Figure 7(c)).

Figure 7(d) shows the locus of the 426 musical titles. In general the relative positions for the other types of music are similar to those obtained for the metric defined in (3.1). Nevertheless, the positions achieved with the metric defined in (3.2) lie on a curve shorter than the one obtained with the metric defined in (3.1), which occasionally can make the analysis difficult.

Figure 8 shows the scree and Shepard plots to evaluate the results obtained by MDS using the metric defined in (3.2). Again, an “elbow” occurs at dimension two for a low value of stress (Figure 8(a)), which corresponds to a significant improvement in fit. Additionally, the Shepard plot (Figure 8(b)) shows the fitting of the 2D configuration distances to the dissimilarities.

The results show that the proposed tools, namely, the MDS and the PPP, together with the tested metrics, are effective methods for analyzing musical compositions.

4. Conclusions

Throughout the history of music many authors tried to find mathematical formulae that could explain the process of music creation. In this perspective, the study analyzes musical compositions from a mathematical viewpoint. The time-domain representation of the music compositions presents characteristics that make their direct comparison difficult. To overcome this limitation the Shannon entropy was used together with other tools, namely, the pseudo phase plane and multidimensional scaling. These tools were applied to an aggregate of sets of music compositions of different types. The proposed tools proved to be effective methods for analyzing music. In future work, we plan to pursue several research directions to help us understand the behavior of the musical signals. These include other techniques to measure the similarities of the signals.

Acknowledgments

This work is supported by FEDER Funds through the “Programa Operacional Factores de Competitividade-COMPETE” program and by National Funds through FCT “Fundação para a Ciência e a Tecnologia” under the Project FCOMP-01-0124-FEDER-PEst-OE/EEI/UI0760/2011.