A Basis for Characterizing Musical Genres


Roelof A. Ruis 6285287

Bachelor thesis Credits: 18 EC

Bachelor Artificial Intelligence
University of Amsterdam
Faculty of Science
Science Park 904
1098 XH Amsterdam

Supervisors

dr. A.K. Honingh
Institute for Logic, Language and Computation
Faculty of Science
University of Amsterdam
Science Park 107
1098 XG Amsterdam

dr. M.W. van Someren
Informatics Institute
Faculty of Science
University of Amsterdam
Science Park 107
1098 XG Amsterdam

July 4th, 2014


Abstract

By exploratory means, a method is presented for providing musical genre characterizations using songs from two MIDI corpora. This characterization is first based on global features and later refined using interval sequences derived from melodies. In the first part of the research a top 10 of characteristic global features per genre is established through correlation and clustering. In the second part this correlation is again used to determine a top 10 of characteristic interval sequences. The results are then analyzed to show that they can be interpreted meaningfully, resulting in small genre characterizations.


5 Part 2 - Interval Sequences
5.1 Method
5.1.1 Preprocessing
5.1.2 Interval Sequences
5.1.3 Sorting on Occurrence
5.2 Interpretation
6 Conclusion
7 Discussion
7.1 Part 1
7.2 Part 2
7.3 General
A JSymbolic Features
B Results Part 1
C Results Part 2


1 Introduction

When humans talk about music, they tend to group songs with comparable characteristics as belonging to the same genre. Often people can recall some of the musical aspects on which they base this grouping but only tend to name superficial features like instrumentation. It would thus be interesting to see if a more exact genre description could be constructed.

Musicologists have already tried to describe genres using literature on music and a good ear, but nowadays it is possible to use computational methods to analyze larger corpora and find more refined musical features. In light of this, musical genre classifiers have been built which obtain high accuracy in mimicking genre grouping behavior. They do not, however, provide information on genre-specific features, which are exactly what musicologists are eager to discover. It is therefore desirable to aid the musicologist with a method to extract genre-related features that are clearly interpretable in musicological terms, or better still, a computationally extracted genre description.

Of course the difficulty in extracting such a description should not be underestimated; it is difficult to define beforehand the features by which a genre should be described. Furthermore, some genres may be defined more by meta-aspects of songs that cannot be extracted from the MIDI data, such as artist, year of release and geographical origin of a song. Because this research works with MIDI files which contain no meta-data, such features are left out and might be explored in future research.

It is sensible to look at both global and surface features. Global features can be seen as aspects of songs to which an overall value can be given which is applicable to the whole song, such as electric guitar fraction or average time between attacks. In contrast, surface features focus on the actual melodic content and try to describe certain patterns therein. Both global and surface aspects might contribute to more accurate genre descriptions and, because the two require different kinds of analysis, the research is split into two parts.

During the first part, global features provided by a feature extractor will be examined using principal component analysis and various other machine learning techniques, working towards a high level genre description.

The second part of the research will focus on finding genre specific melodic structures, which can be used as detailed genre descriptors. It is to be expected that these melodic structures can then not only be used descriptively but also to improve the accuracy of existing classification methods. The interpretation of the results will incorporate results from part one in order to check what they have in common.

Because other researchers have conducted very little research into finding explanatory features, the research presented here will be mainly exploratory, providing extensive documentation of the investigated paths and decisions that have been made. Furthermore, no statistical tests will be used at this stage, so if interesting results surface, more meticulous further research will be required.

This documentation therefore serves two goals. Firstly, it tries to provide features, expressed in terms of global features and later refined and completed with melodic structures, gained from a corpus of genre-labeled MIDI songs, that are capable of meaningfully characterizing the different genres in the corpus. Secondly, it hopes to provide material for supplementary studies, which might be suitable for research outside musicology, for instance social network analysis, where meaningful features play a role in the characterization of groups of people.

2 Related Work

Much research has been done into the machine based recognition of musical genres using only MIDI data [8] [3], a combination of MIDI and audio features [2] or features purely based on audio [10]. All of the systems presented in these papers first apply some form of feature extraction, after which they perform a classification based on those features. The classification accuracy is then evaluated and improvements or new ways of classifying music are claimed.

These systems, however, do not provide any means of understanding what characterizes a musical genre but only how well it can be classified based on arbitrary features. Sturm [11] argues that because there are more independent variables changing between particular genres in a dataset, classification is unreliable and other types of experimental designs are required to understand how a genre can be characterized. He mentions, among others, inspecting features and answering the question “At what is the system looking to identify the genres used by music?” (p. 376).

A further motivation for this research is the final remark of McKay and Fujinaga [8] in a small genre classification study, stating that “Further study of which features were selected by which specialist classifier ensembles could also be of great musicological interest.” (p. 530). This further emphasizes the need for a thorough look into feature meaning and selection.

The high level features required in the first part of the research have to be extracted by a program capable of handling MIDI songs. The program used in this paper, created by McKay [7], is called JSymbolic and is capable of extracting up to 111 features, whose meanings are explained in his work.

McKay also notes “The library of features used in this thesis should be seen as a work in progress that can continually be expanded and refined [...]” (p. 62), stressing that the features extracted by JSymbolic could use some refinement; they do not provide the musicologist with much deep insight into song structure and melodic and rhythmic sequences but instead define a coarse value for, for instance, ’average note duration’ or ’average melodic interval’.

The second part of the research, exploring the use of small melodic structures in genre characterisation, will use an idea presented by Conklin [5]. The paper presents a method for combining a series of notes from a melody into a segment and assigning these segments to a class, based on certain rules about their content. These melodic segment classes are then used for style discrimination of different melodies, and results well above random chance are achieved. Honingh et al. [6] also use the aforementioned concept of melodic segment classes to distinguish between tonal and atonal music and develop a model of pitch class progression. Both these studies thus show that inspecting musical features at this scale might not only lead to determining genre distinctions but also provide more insight into which information is hidden in the structure of the music on a very detailed level.

3 The Corpora

For this research, two genre annotated MIDI corpora were used. Firstly the Ballroom Dance Corpus [4], from here on indicated with ’BAL’, containing 6 types of ballroom dances, whose structure can be observed in table 1. Secondly the Bodhidharma MIDI Corpus [7] is used, indicated with ’BOD’. The large version of the BOD corpus consists of 38 genres with a hierarchically defined structure with a maximum depth of four. There is, however, also a small subset of this large corpus consisting of 9 sub genres divided into three main genres that will be used here, but the method presented in this paper could be extended to cover the large version of the BOD corpus. The structure of the reduced BOD corpus as used in this research can be seen in table 2.


Genre         Nr. of songs
Bossa Nova    36
Mambo         15
Merengue      24
Rumba         20
Salsa         7
Tango         26

Table 1: Structure of the BAL corpus with song counts

Because both corpora come prelabeled and are used in other studies, it is assumed that this labeling is correct.

Main genre    Subgenre            Nr. of songs
Popular                           75
              Hardcore Rap        25
              Punk                25
              Trad. Country       25
Jazz                              75
              Bebop               25
              Jazz Soul           25
              Swing               25
Classical                         70
              Baroque             25
              Modern Classical    25
              Romantic            20

Table 2: Structure of the BOD corpus with song counts

4 Part 1 - Global Features

4.1 Method

4.1.1 Principal Component Analysis

First, features are generated for all songs in both the BAL and BOD corpus using the JSymbolic extractor, producing a CSV file with data for each separate genre. A complete list of the feature names is included as listing 1 in appendix A.

To get an intuition about the high dimensional structure of the data and the possible overlap of genres, principal component analysis (PCA) is used to map the high dimensional data to a 2D space. This also provides means to visually detect outliers or anomalies and a first possible step to finding characteristic features.
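The projection described above can be sketched as follows. This is a minimal illustration using NumPy, not the code used for the thesis: the function name `pca_project`, the choice to standardise features first, and the random toy data are all assumptions.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project rows of `features` (songs x features) onto the top principal axes."""
    X = np.asarray(features, dtype=float)
    # Standardise each feature so that large-valued features do not dominate
    # (an assumption; the text does not state how the features were scaled).
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant features carry no variance; avoid dividing by zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(Z, full_matrices=False)
    components = vt[:n_components]
    explained = singular_values**2 / np.sum(singular_values**2)
    return Z @ components.T, components, explained[:n_components]


# Toy data: 6 hypothetical songs with 4 made-up feature values each.
rng = np.random.default_rng(0)
toy = rng.normal(size=(6, 4))
coords, components, explained = pca_project(toy)
print(coords.shape)
```

Each row of `coords` would be one point in a scatter plot such as the ones discussed in this section, and each row of `components` holds the weights with which the original features enter one PC axis.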

Visualization. The BAL corpus is inspected first. Figure 1 shows all songs in this corpus mapped to a 2D principal component (PC) space using the two most dominant principal components as axes. All features were used in the calculation of these components. It can clearly be observed that most genres form quite compact clusters but that there is still a decent amount of overlap. Because all features were used it is to be expected that the clusters will get better when a decent subset of features is selected.

[Scatter plot; axes: Principal Component 1, Principal Component 2; legend: Bossa Nova, Mambo, Merengue, Rumba, Salsa, Tango.]

Figure 1: BAL corpus in 2D PC space based on all features.

A few other things can be observed from this visualization. Firstly, the salsa songs do not form a coherent cluster, which is at least partly due to the fact that very few salsa songs are provided by the BAL corpus. For this reason the salsa genre is dropped for the rest of part 1 of the research.

Secondly, the tango songs seem to form two distinct groups. Listening to the songs indicated that the group in the upper right corner consists of tango songs played on solo piano while the other tango songs are performed with a multiple instrument setup. Because it is likely that this distinction will affect the precision of the genre characterization, the tangos performed on piano solo are also removed from the data. The new BAL corpus with salsa and piano tango songs removed is shown in figure 2.


[Scatter plot; axes: Principal Component 1, Principal Component 2; legend: Bossa Nova, Mambo, Merengue, Rumba, Tango.]

Figure 2: BAL corpus in PC space without salsa and tango piano based on all features.

The BOD corpus shows various degrees of such cluster formation. Clear cluster separation with a bit of overlap can be observed for the main level (figure 3) and the ’popular’ sub level (figure 8 in appendix B). Furthermore a clear distinction can be seen between bebop and swing in the ’jazz’ sub level (figure 9 in appendix B), while jazz soul does have a lot of overlap with bebop. The ’classical’ sub level (figure 10 in appendix B) shows the most overlap of all, indicating that the features used might not work too well for the separation of classical music or that a smaller selection of features might be required to get good separation. No other anomalies or strange outliers can be seen, so removing any songs from the BOD corpus beforehand is not necessary.

[Scatter plot; axes: Principal Component 1, Principal Component 2; legend: Popular, Jazz, Classical.]

Figure 3: Top level of BOD corpus in PC space based on all features.

[Line plot; x axis: Dimension, y axis: Contribution; legend: PC vector 1, PC vector 2, PC vector 3.]

Figure 4: Contribution to dimensions of the first 3 PCA vectors for the BAL corpus.

Inspecting Dimensions. The PC axes are formed by a linear combination of the feature dimensions, and it might be that they can explain the formation and position of the clusters fairly well. Results indicate, however, that too many features contribute to each PC axis for this to be of any value. Figures 4 and 5 show the dimensions for the first 3 PCA vectors plotted against their individual dimensional contribution. If only a small number of features contributed to a PC axis, one would expect a few high contribution values for the very first dimensions and thereafter a steep decline with only very small contributions for the other dimensions. As the figures show, this is not the case: there is no steep decline and no large contribution by the first couple of feature values.
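The "steep decline" argument can be illustrated with a small sketch. It assumes that the contribution of a dimension is its squared weight in the PC vector, normalised so that all contributions sum to 1; the text does not define the measure, and the function names and example vectors are hypothetical.

```python
def contribution_profile(pc_vector):
    """Return each dimension's share of a PC axis, sorted from largest to smallest.

    The share is taken to be the squared weight divided by the sum of squared
    weights, so that all shares add up to 1 (an assumption about the measure).
    """
    squared = [w * w for w in pc_vector]
    total = sum(squared)
    return sorted((s / total for s in squared), reverse=True)


def dimensions_needed(profile, share=0.9):
    """Number of leading dimensions needed to reach `share` of the total."""
    running = 0.0
    for count, value in enumerate(profile, start=1):
        running += value
        if running >= share:
            return count
    return len(profile)


# A hypothetical axis dominated by two features versus a perfectly flat one.
peaked = contribution_profile([0.7, -0.7, 0.1, 0.05, 0.05])
flat = contribution_profile([0.5, 0.5, -0.5, 0.5])
print(dimensions_needed(peaked))
print(dimensions_needed(flat))
```

An axis that is easy to interpret would behave like `peaked`, where a couple of dimensions already cover most of the total. A flat profile, where nearly all dimensions are needed, corresponds to the situation reported here.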

Visual Aid. While the actual PC axes are thus difficult to interpret by looking at the feature decomposition, it might be possible for a trained musicologist to use a PC plot as an aid for deriving genre characteristics by ear. Looking at figure 2, one could select a range of songs by varying the value of principal component 1 while keeping the value of principal component 2 at a fixed value. Listening to this range of songs might then provide insights into aspects of the music that change throughout.

Concluding, we have seen that while PCA is a method suited for detecting outliers and anomalies in the corpus and visualizing the formation of clusters, it cannot be used for detecting characteristic features, although it could be used as a visual aid.

[Line plot; x axis: Dimension, y axis: Contribution; legend: PC vector 1, PC vector 2, PC vector 3.]

Figure 5: Contribution to dimensions of the first 3 PCA vectors for the top level of the BOD corpus.

4.1.2 Unsupervised Clustering

To discover if the JSymbolic features can really be used for recognizing (and therefore characterizing) genres, a solid measure is needed. Although clustering of genres can already be observed in the PCA plots, these provide no exact measure for the amount of separation between the different genres. It is, however, possible to use unsupervised k-means clustering to detect, for different feature sets, how well the clusters that form match the genre labels. This way it is possible to compare the clustering of different feature sets and see if the clustering improves. For characteristic features (features that are particularly suited for separating one genre from the others) such an improvement is to be expected.

Evaluating clustering. Unsupervised clustering finds clusters of songs which are close together in feature space. It can however be that the found clusters do not represent the original genre groups. To find out how well the found clusters match the original genres the following measure is devised: it is checked to which genre the majority of songs in a cluster belongs (scaled by the total number of songs in the dataset for that genre), which is then assumed to be the ’correct’ class for that cluster. This is done for every cluster. Hereafter, precision and recall are measured and the F1-score is derived for each genre individually. Because the cluster centroids are randomly initialized on each run, 1000 cycles of the clustering algorithm are run to get a reliable mean. As many cluster centroids are initialized as there are different genres in the measured corpus.

Genre         F1
Bossa Nova    0.954
Mambo         0.119
Merengue      0.791
Rumba         0.687
Tango         0.870

Table 3: F1-scores BAL using all features.
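The evaluation measure described above can be sketched as follows. The k-means step itself is not shown; the sketch starts from a list of cluster assignments, which could come from any k-means implementation. The function name `cluster_f1` and the toy labels are hypothetical, and F1 is computed from the true-positive, false-positive and false-negative counts per genre.

```python
from collections import Counter


def cluster_f1(cluster_ids, genres):
    """Per-genre F1 after naming each cluster by its relatively dominant genre."""
    genre_sizes = Counter(genres)
    members = {}
    for cluster, genre in zip(cluster_ids, genres):
        members.setdefault(cluster, Counter())[genre] += 1

    # Name each cluster after the genre with the largest share of its own songs
    # in that cluster (count scaled by the genre's size in the data set).
    cluster_name = {
        cluster: max(counts, key=lambda g: counts[g] / genre_sizes[g])
        for cluster, counts in members.items()
    }
    predicted = [cluster_name[c] for c in cluster_ids]

    scores = {}
    for genre in genre_sizes:
        tp = sum(p == genre and t == genre for p, t in zip(predicted, genres))
        fp = sum(p == genre and t != genre for p, t in zip(predicted, genres))
        fn = sum(p != genre and t == genre for p, t in zip(predicted, genres))
        # F1 = 2TP / (2TP + FP + FN); set to 0 when no song of the genre is hit.
        scores[genre] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return scores


# Hypothetical outcome of one clustering run on six songs.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["tango", "tango", "rumba", "rumba", "rumba", "rumba"]
scores = cluster_f1(clusters, labels)
print(scores)
```

In the setting of the text this function would be called once per clustering run, and the per-genre scores averaged over the 1000 runs. A genre that never becomes the dominant genre of any cluster receives a score of 0 for that run.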

Clustering with all features. Clustering the BAL corpus yields the F1-scores displayed in table 3. A score of 1 means complete separation of a genre into its own cluster and a score of 0 means that all songs for a genre got mixed up in other genre clusters. The values of this clustering correspond to the visual results obtained from the PCA analysis as shown in figure 2. Looking at that figure, the low value for mambo can be explained from the overlap of almost all of the mambo songs with merengue or rumba. These genres take precedence and thus mambo songs are almost always classified under a different genre.

Genre        F1
Popular      0.730
Jazz         0.741
Classical    0.845

Table 4: F1-scores BOD main using all features.

Genre            F1
Hardcore Rap     0.888
Punk             0.971
Trad. Country    0.918

Table 5: F1-scores BOD popular using all features.

Genre        F1
Bebop        0.705
Jazz Soul    0.506
Swing        0.855

Table 6: F1-scores BOD jazz using all features.

Genre               F1
Baroque             0.701
Modern Classical    0.229
Romantic            0.529

Table 7: F1-scores BOD classical using all features.

Within the BOD corpus, the top level as well as the three sub levels can be clustered and scored individually in the same way. Tables 4, 5, 6 and 7 show the F1-scores for clustering these genres. It can be observed that most of the genres already separate very decently on their own except for modern classical music. The fact that some genres are not well separable means that the features used here cannot fully explain the difference between them or that too many features are used.

4.1.3 Point-Biserial Correlation

To get a better understanding of which features predict well for which genre and thus work towards finding actual characteristic features, correlation between the features and the genres can be calculated. Because the features have continuous values and the songs can be regarded as being dichotomous (a song either belongs or does not belong to the measured genre) point-biserial correlation can be used.
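For reference, the point-biserial coefficient is commonly written as below, where M_1 and M_0 are the mean feature values of the songs inside and outside the measured genre, n_1 and n_0 are the sizes of these two groups, n = n_1 + n_0, and s_n is the standard deviation of the feature over all n songs (computed with denominator n). This is the textbook form; which variant was used to produce the reported scores is not stated.

```latex
r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{\frac{n_1 n_0}{n^2}}
```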

For each individual genre this correlation is calculated, where all songs within the measured genre are said to be in class 1, and all other songs in class 0. Measuring the point-biserial correlation for all features and all genres in the BOD corpus yields a top 10 of correlating features for each genre. Table 8 shows this top 10 for bossa nova together with the correlation scores.

Feature                                          r_pb
65. Rel. Note Density of Highest Line*           0.712
5. Avg. Note Duration*                           0.697
9. Avg. Time Between Attacks*                    0.693
72. Rhythmic Variability*                        0.649
70. Repeated Notes**                             0.647
25. Importance of Bass Register                  0.639
37. Melodic Thirds**                             -0.636
7. Avg. Nr. of Independent Voices                0.635
79. Str. of Strongest Rhythm. Pulse*             0.627
15. Comb. Str. of Two Str. Rhythmic Pulses*      0.612

Table 8: Top 10 highest point-biserial correlation scores for the bossa nova genre (in BAL).

The features in the table are highly correlated with the bossa nova genre when checked against all other genres in the BAL corpus, meaning that these features are likely to characterize bossa nova based on differences it has with the other genres in the corpus.
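The one-versus-rest correlation and the resulting ranking can be sketched as follows, using only the Python standard library. The function names, feature names and numbers are hypothetical. Ranking by absolute value is an assumption that is consistent with table 8, where a negative score appears between two positive ones of similar magnitude.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(values, in_genre):
    """Point-biserial correlation between a feature and genre membership."""
    ones = [v for v, flag in zip(values, in_genre) if flag]
    zeros = [v for v, flag in zip(values, in_genre) if not flag]
    spread = pstdev(values)  # population standard deviation over all songs
    if not ones or not zeros or spread == 0:
        return 0.0  # correlation is undefined here; treated as "no signal"
    n = len(values)
    return (mean(ones) - mean(zeros)) / spread * sqrt(len(ones) * len(zeros) / n**2)


def top_features(table, genres, target, k=10):
    """Rank feature names by absolute correlation with membership of `target`."""
    in_genre = [g == target for g in genres]
    scored = [(name, point_biserial(column, in_genre)) for name, column in table.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)[:k]


# Hypothetical feature values for five songs; names and numbers are made up.
genres = ["bossa", "bossa", "tango", "tango", "tango"]
table = {
    "avg_note_duration": [0.9, 1.1, 0.4, 0.5, 0.3],
    "melodic_thirds": [0.10, 0.12, 0.30, 0.28, 0.35],
    "pitch_variety": [20, 25, 22, 24, 21],
}
top = top_features(table, genres, "bossa", k=2)
for name, r in top:
    print(f"{name}: {r:+.3f}")
```

Running `top_features` once per genre, with that genre as `target`, gives one ranked list per genre in the manner of table 8.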

Using the information provided in the research by McKay, a detailed description of the individual features can be given. As an example the first feature ’Relative Note Density of Highest Line’ will now be discussed. A musical piece often contains different lines or voices which have an average pitch. The highest line refers to the voice with the highest average pitch. The relative note density of the highest line is the number of notes in the voice with the highest average pitch divided by the average number of notes in all other voices. The high correlation indicates that, relative to the other genres, bossa nova has a large number of notes in its high lines compared to its lower lines. Section 4.2 will elaborate on the interpretation of the other features and other genres.
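As a small worked example, the feature can be computed as described in the paragraph above. The sketch follows the wording of the text and is not taken from the JSymbolic implementation; the function name and the three-voice piece are hypothetical.

```python
from statistics import mean


def relative_note_density_of_highest_line(voices):
    """`voices` holds at least two voices, each a non-empty list of MIDI pitches."""
    # Index of the voice with the highest average pitch (the "highest line").
    top = max(range(len(voices)), key=lambda i: mean(voices[i]))
    others = [len(v) for i, v in enumerate(voices) if i != top]
    return len(voices[top]) / mean(others)


# Hypothetical three-voice piece: a busy melody over two sparser parts.
melody = [72, 74, 76, 77, 79, 77, 76, 74]  # 8 notes, highest average pitch
chords = [60, 64, 67, 60]                  # 4 notes
bass = [36, 43]                            # 2 notes
density = relative_note_density_of_highest_line([melody, chords, bass])
print(density)  # ratio of 8 notes to the mean of 4 and 2 notes
```

A value well above 1, as in this toy piece, means that the highest line is busier than the other voices.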

Furthermore some first remarks on melody and rhythm can already be made. The features marked with one asterisk are related to rhythm and the features marked with a double asterisk are related to melody. This shows that bossa nova has high correlations with features relating to both rhythm and melody, but these features could use much refinement. For instance, if not many melodic thirds are observed, one might want to know which intervals are then most likely to occur. This stresses the need for part 2 of the research.

Verifying correlation results. To show that the correlating features can indeed be regarded as being characteristic for their corresponding genre, two methods might be used. First the F1-score of the unsupervised clustering using only the selected features listed above can be computed. The results are shown in table 9.

Genre         F1
Bossa Nova    0.947
Mambo         0.210
Merengue      0.565
Rumba         0.524
Tango         0.384

Table 9: F1-scores BAL using Bossa Nova correlating features.

When comparing these results to the values given in table 3 it can be observed that the difference between bossa nova and the second best scoring genre has more than quadrupled. This means that the split between the bossa nova genre and the other genres is now considerably more pronounced. The fact that the F1 score for bossa nova is somewhat lower than in table 3 is understandable because bossa nova already separated decently when all features were used.
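The size of this change can be checked directly from the two tables. With all features (table 3) the runner-up is tango at 0.870 against 0.954 for bossa nova; with the bossa nova correlating features (table 9) the runner-up is merengue at 0.565 against 0.947. The ratio of the two gaps is:

```latex
\frac{0.947 - 0.565}{0.954 - 0.870} = \frac{0.382}{0.084} \approx 4.5
```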

Because PCA could be used as a method of visualizing the data it can be used as a check on how the correlated features have split the data. Figure 6 shows the BAL corpus in PC space using only the features from table 8, and a strong separation between bossa nova and the other genres can clearly be observed. This and the foregoing check using F1 values are a strong indication that this particular feature subset characterizes just bossa nova. Because the goal is to find characteristics of each individual genre, this is a very appealing result.

[Scatter plot; axes: Principal Component 1, Principal Component 2; legend: Bossa Nova, Mambo, Merengue, Rumba, Tango.]

Figure 6: BAL corpus in PC space with Bossa Nova correlating features.

The top 10 features for each genre together with their F1 scores are included in appendix B and the reference scores for clusters using all features are in tables 3, 4, 5, 6 and 7. Results shown here for bossa nova in the BAL corpus generalize to almost all other genres in the corpora used: the F1-score of the selected genre increases relative to the other genres when only its correlating features are used, compared with clusters in which all features were used. Some genres (such as tango, table B) show a strong relative increase while with others such as merengue (table B), this is less obvious, in the case of merengue also showing a big increase for mambo. Another anomaly is observed in table B, where the features correlating with bebop actually provide a more distinct clustering of swing. In cases where correlating features fail to improve clustering results, these features might not provide a decent genre characterization and are therefore to be interpreted with care.

4.2 Interpretation

Although quite a few definitions of genres are published online in the Grove Music Online Dictionary [1], these texts contain only a very small amount of information about specific genre characteristics and instead focus more on the social and cultural backgrounds.

It is thus very valuable to work towards genre characterizations that provide additional information for the understanding of genres. The following paragraphs provide a concise interpretation of the results for some genres; a full-fledged interpretation is left to musicologists. The discussed results can be found in table 8 for bossa nova and the leftmost tables of appendix B for the other genres.

4.2.1 BOD main

While the results hopefully present new knowledge to be used for more precise genre descriptions, findings that match existing beliefs strengthen the validity of the method presented. Especially in the BOD main corpus, such familiar features are observed, likely because these broad genres capture global characteristics.

Popular. Table B shows the features correlating with popular music in relation to jazz and classical music. Almost all features match an intuitive description of popular music. Firstly the use of electric instruments and electric guitar in particular can be observed. Secondly popular music has a strong tonal focus and is likely to use tones only present in a certain scale, which is indicated by positive correlation with ’Nr. of common pitches’, ’Most common pitch prevalence’ and negative correlation with ’Pitch variety’. Furthermore a negative correlation with the range of the highest line can be observed; in pop music this is often the melodic line, which tends to be restricted in range.

Jazz. In jazz music, a strong correlation with saxophone and brass fraction can be observed. The presence of melodic tritones together with a high pitch variety is an indication of the more complex melodies often found in jazz.

Classical. Characteristics for the classical genre show the absence of electric instruments and percussion. ’Importance of high register’, absence of ’importance of bass register’ and a high value for ’primary register’ all three indicate that, relative to pop and jazz, much is going on in the high lines.

4.2.2 BOD popular

Hardcore Rap. The hardcore rap genre is characterized by relatively long songs, where much movement in minor and major seconds is observed (’chromatic motion’ and ’stepwise motion’). Both negative ’note density’ and ’average time between attacks’ indicate that these songs are relatively slow paced.

Punk. Punk songs have a strong negative correlation with ’Duration’ meaning that they are often short songs. Furthermore they use a lot of repeated notes and arpeggiation. An interesting negative correlation that might not be so easily explained is that with ’Melodic thirds’ which seems to indicate that punk music lacks melodic thirds as opposed to hardcore rap and traditional country.

Traditional Country. Traditional country has the greatest variety in melodic material of all, indicated by ’pitch variety’ and ’pitch class variety’. Furthermore a strong correlation with the use of melodic thirds can be observed.

4.2.3 BOD jazz

The BOD jazz corpus provides the features that are most difficult to interpret and, because the bebop features proved unreliable as shown before, interpreting the results for the jazz corpus is left to others.

4.2.4 BOD classical

Baroque. Within the classical corpus, baroque stands out for its restricted range, restricted pitch variety and restricted melodic tritones. Together with ’most common melodic interval prevalence’, indicating that there is a certain melodic interval that recurs often, one can conclude that this genre is focused around tonality. Unfortunately no features concerning rhythm show up.

Modern Classical. Although for modern classical music change of meter has quite a strong correlation, most other features show very low values. It is therefore likely that these features are not very useful.

Romantic. The romantic genre seems to be characterized by rhythmic looseness, not only indicated by the feature with the same name but furthermore because of the lack of a strong ’second strongest rhythmic pulse’.

4.2.5 BAL

Bossa Nova. As discussed earlier, bossa nova has a relatively high number of notes in the high lines. In contrast, though, there is also a high correlation with ’Importance of Bass Register’. A strong positive correlation with average note duration and average time between attacks might be indicative of relatively slow music when compared to the other genres in the corpus. Furthermore a lack of melodic thirds shows up, which is a feature that might be verified by the second part of the research.

Mambo. The F1 score for mambo, although increased compared to table 3, is an indication that features for mambo will not be very reliable. Still, the occurrence of melodic fifths is something that can again be examined in part 2.

Merengue. The first feature for merengue, ’voice equality - number of notes’, is described as “Standard deviation of the total number of notes in each channel that contains at least one note” [7] (p. 66). A high positive correlation means that some instruments play many notes while others play very few. The overall note density is high, and the average note duration is negatively correlated, indicating many short notes.

Rumba. Rumba has a high correlation with both pitch variety and pitch class variety, indicating that it is likely either non-tonal or often modulates within a song. Furthermore it has quite a strong correlation with melodic thirds, which will be inspected in part two.

Tango. A lack of unpitched instruments and percussion in general is observed. The use of polyrhythms is notable, together with the high pitch variety and chromatic motion, creating the expectation of interesting complex melodies. Both the use of chromatic motion as well as the lack of arpeggiation can be verified during part two.

5 Part 2 - Interval Sequences

5.1 Method

Now that it has been shown that characteristic features can be extracted on a global level, the attention in this part shifts to characteristic melodies obtained on the local level. It is known that rhythm plays a significant role at least in the characterization of ballroom dances [4] and it would be interesting to discover if they can be characterized by melody as well. Most songs, however, consist of multiple voices of which only one plays the melody. Only melody voices are analyzed because the melody plays a dominant role and is often recalled best, increasing the chance that genres can best be characterized by information extracted from melody. Furthermore, multiple notes playing at the same time are rare in melodies, which makes it easier to determine the notes of which a melody is formed.

A melody consists of consecutive notes with intervals between them. Different melodies might, however, map to the same interval sequence (both C to G and A to E span an interval of 7 semitones), so using these interval sequences captures the shape of the melodies instead of the exact pitches and key. This is similar to the way a derivative captures the movement of the original function while disregarding its vertical position.

5.1.1 Preprocessing

Preprocessing the Corpus. Because there was no corpus with prelabeled melodic lines that would work well in MATLAB, the existing corpora required some preprocessing. This preprocessing was rather time consuming, so melody extraction was only done for the BAL corpus and the BOD corpus was dropped. Selecting melodic lines was done by listening to the MIDI files and extracting the correct tracks by hand. In the case of melodies with more than one note playing at the same time, the upper voice was chosen because it is in general perceived as the main melody. For each genre, this time including salsa, 10 to 20 melodic lines were selected, depending on the available files. It might be possible to automate this process of melodic track selection, for instance with a method presented by Rizo [9].

Interval extraction From the previously extracted melodic lines, the intervals in semitones were calculated. Figure 7 shows an example extraction for a random melody: the numbers indicate the intervals relative to the preceding note. Notice that an interval sequence of length n corresponds to a melody of n+1 notes. During this procedure, the rhythmic information about the melody is lost. This interval series is the data used for calculating the counts of the different micro-melodies. Interval series are written between curly brackets, for example {1, 2, -3, 1, 5, -1, -2} for the example sequence.

Figure 7: Melodic excerpt with relative pitch intervals
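The interval extraction described above can be illustrated with a minimal Python sketch. It assumes that a melody is already available as a list of MIDI pitch numbers; the function name `melody_to_intervals` and the example pitches are hypothetical and are not taken from the original MATLAB workflow.

```python
def melody_to_intervals(pitches):
    """Return the semitone interval between each note and its predecessor."""
    return [later - earlier for earlier, later in zip(pitches, pitches[1:])]


# Hypothetical melody as MIDI note numbers (60 = middle C), chosen so that
# it yields the example interval series used in the text.
melody = [60, 61, 63, 60, 61, 66, 65, 63]
intervals = melody_to_intervals(melody)
print(intervals)  # [1, 2, -3, 1, 5, -1, -2]
```

A melody of eight notes gives seven intervals, and transposing the melody leaves the interval series unchanged.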

The feature vector consists of interval sequences of length 1, 2 and 3. Longer sequences are likely to provide very sparse results while increasing extraction time. The feature vector includes all sequences from {-12} up to {12}, from {-12,-12} up to {12,12} and from {-7,-7,-7} up to {7,7,7}. The triple interval sequences only cover a maximum distance of seven semitones (a perfect fifth) because computation would otherwise take rather long. Each sequence is then scored by counting its occurrences in the series of pitch intervals of each melody.
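As a sketch of how such a feature vector could be built, the following Python fragment enumerates the sequences named above and counts them in one interval series. The names `build_vocabulary` and `count_sequences` are hypothetical, and counting overlapping occurrences is an assumption; the text does not state how overlaps were handled.

```python
from collections import Counter
from itertools import product


def build_vocabulary():
    """Sequences of length 1 and 2 over -12..12 and of length 3 over -7..7."""
    vocabulary = []
    for length, bound in ((1, 12), (2, 12), (3, 7)):
        vocabulary.extend(product(range(-bound, bound + 1), repeat=length))
    return vocabulary


def count_sequences(intervals, vocabulary):
    """Count (overlapping) occurrences of every vocabulary sequence."""
    counts = Counter()
    for length in (1, 2, 3):
        for start in range(len(intervals) - length + 1):
            counts[tuple(intervals[start:start + length])] += 1
    # Sequences outside the vocabulary are simply never looked up.
    return [counts[sequence] for sequence in vocabulary]


vocabulary = build_vocabulary()
features = dict(zip(vocabulary, count_sequences([1, 2, -3, 1, 5, -1, -2], vocabulary)))
print(len(vocabulary), features[(1,)], features[(1, 2)], features[(-3, 1, 5)])
```

Under these assumptions the vector has 25 + 625 + 3375 = 4025 entries, most of which are zero for any single melody.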

5.1.2 Interval Sequences

As in part 1, interval sequences that explain a genre particularly well can be found by calculating the point-biserial correlation. This is done in the same way as in part 1, grouping the melodies from one genre in one class.


It is, however, possible that a particular sequence scores a high correlation because it occurs in only one melody. Such a sequence should not be seen as characteristic, because it does not occur in multiple melodies pertaining to one genre. To account for this effect, a third column is added to the results besides the columns indicating the sequence and the correlation score. This column shows the percentage of melodies of that genre in which the sequence occurs. Sequences that score high on both measures are a strong indication of sequences that are characteristic for the genre as a whole. The same holds for sequences with a negative correlation and a low occurrence: negative correlation implies that a genre stands out from the others by lacking a particular sequence.

Sequence       rpb       Occurrence
{-2,2,-2}      0.478     65%
{-2,2}         0.420     80%
{2,-2,2}       0.351     50%
{2,0,-2}       0.340     45%
{-1,-1,-3}     0.334     15%
{-7}           -0.331    30%
{0,-2,2}       0.328     50%
{-2,2,0}       0.323     45%
{1}            0.322     85%
{7,-2,2}       0.304     15%

Table 10: Top 10 characteristic melodic sequences for bossa nova in BAL sorted on correlation value.
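The two measures reported in these tables can be sketched in Python as follows. This uses the standard point-biserial formula, which may differ in detail from the computation used in part 1; the function names and the small example counts are hypothetical.

```python
from math import sqrt


def point_biserial(counts, in_genre):
    """Point-biserial correlation between sequence counts and a 0/1 genre label."""
    n = len(counts)
    group1 = [c for c, label in zip(counts, in_genre) if label]
    group0 = [c for c, label in zip(counts, in_genre) if not label]
    mean_all = sum(counts) / n
    # Population standard deviation of the counts over all melodies.
    std = sqrt(sum((c - mean_all) ** 2 for c in counts) / n)
    if std == 0 or not group1 or not group0:
        return 0.0  # correlation is undefined; treated as zero in this sketch
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / std * sqrt(len(group1) * len(group0) / n ** 2)


def occurrence(counts, in_genre):
    """Percentage of melodies of the genre containing the sequence at least once."""
    group1 = [c for c, label in zip(counts, in_genre) if label]
    return 100.0 * sum(1 for c in group1 if c > 0) / len(group1)


# Hypothetical counts of one sequence in six melodies; the first three
# melodies belong to the genre under consideration (class 1).
counts = [3, 0, 4, 0, 1, 0]
labels = [1, 1, 1, 0, 0, 0]
print(round(point_biserial(counts, labels), 3), round(occurrence(counts, labels), 1))
```

In this example the correlation is clearly positive while only two of the three genre melodies contain the sequence, which is the situation the occurrence column is meant to expose.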

Table 10 shows the 10 sequences with the highest correlation for bossa nova, together with their occurrence percentages; the corresponding tables for the other genres are included as the even-numbered tables in appendix C. Section 5.2 provides an interpretation of these results.

5.1.3 Sorting on Occurrence

Sorting the data on correlation stresses the melodic properties that are in particular not shared between genres. While this serves to find very characteristic sequences, the results showed that occurrence was low for many sequences. That is why the same set of sequences is now sorted on occurrence, with the condition that the corresponding correlation has to be significant. Table 11 shows the occurrence-sorted results for bossa nova. For a sequence with negative correlation, an occurrence of 0% is the optimal result, which is why in the table the first 9 negatively correlating sequences score higher than the tenth (positively correlating) sequence. In appendix C the odd-numbered tables provide all occurrence-sorted results.

Sequence       rpb       Occurrence
{-6}           -0.267    0%
{-3, -3}       -0.254    0%
{9, -1}        -0.213    0%
{-2, 0, -3}    -0.209    0%
{8}            -0.278    5%
{-8}           -0.243    5%
{0, 4}         -0.225    5%
{0, -2, 0}     -0.224    5%
{-9}           -0.210    10%
{1}            0.322     85%

Table 11: Top 10 characteristic melodic sequences for bossa nova in BAL sorted on highest occurrence.
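One way to reproduce an ordering like the one in Table 11 is sketched below in Python. Several points are assumptions: the threshold `min_abs_r` merely stands in for the significance test, whose details are not given here, and breaking ties on the absolute correlation is inferred from the row order of Table 11. The function names are hypothetical.

```python
def occurrence_rank_key(row):
    """Sort key that puts the best-supported sequences first."""
    sequence, r, occurrence_pct = row
    # For a negative correlation, absence is the evidence: 0% is optimal.
    support = occurrence_pct if r > 0 else 100 - occurrence_pct
    return (-support, -abs(r))


def sort_on_occurrence(rows, min_abs_r=0.2):
    """Keep rows whose correlation passes the (assumed) threshold, then sort."""
    kept = [row for row in rows if abs(row[1]) >= min_abs_r]
    return sorted(kept, key=occurrence_rank_key)


# A few rows taken from Tables 10 and 11 (sequence, rpb, occurrence in %).
rows = [
    ("{1}", 0.322, 85),
    ("{-9}", -0.210, 10),
    ("{8}", -0.278, 5),
    ("{-6}", -0.267, 0),
    ("{-8}", -0.243, 5),
]
for sequence, r, occurrence_pct in sort_on_occurrence(rows):
    print(sequence, r, occurrence_pct)
```

With this key, a negatively correlating sequence at 10% occurrence ranks above a positively correlating one at 85%, as in Table 11.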

5.2 Interpretation

For each genre in the BAL corpus, the results as included in table 10 and 11 and appendix C will be discussed.

The correlation-sorted results show rather low scores in the occurrence column for most of the sequences. Although these sequences correlate strongly with a genre, they are not representative of most melodies of that genre. Conclusions about melodies in genres based on this data should therefore be drawn with caution.

Bossa Nova No other genre has as many sequences with negative correlation in the occurrence-sorted results as bossa nova. This is interesting because it indicates that melodic lines in bossa nova can be partly explained by what does not happen in the music. In the correlation-sorted results, a lot of movement in major seconds ({-2}, {2}) is observed, especially in the context of arpeggiation ({-2} followed by {2} or the other way around). Because these sequences of major seconds score rather high, the near absence of the sequence {0, -2, 0} (5% occurrence) is remarkable. It indicates that a repeated note is rarely followed by a repeated note a major second lower, strengthening the assumption that the major second is used extensively in arpeggiation.

Furthermore, there is a complete lack of descending motion in augmented fourths ({-6}, {-3,-3}) and only a very small amount of movement in minor sixths ({-8}, {8}). The only result that shows up in both tables is the rising minor second ({1}), indicating that this interval is used a lot more in this genre than in the other genres, except for tango. Tango in turn stands out because it incorporates chromatic movement, which does not seem to be a bossa nova characteristic.

The results obtained in part one indicated a lack of melodic thirds ({3} and {4}), which is in complete accordance with part two: only a single {-3} can be found in the correlation-sorted results, and only a single {4} can be found in the occurrence-sorted results.

Mambo Both tables show very low occurrence on all features, and correlations are rather low compared to other genres as well. The sequences also lack interesting recurring interval patterns. Therefore no reliable conclusions can be drawn regarding characteristic melodic lines. It can however be argued that the mambo melodies might contain the most diverse range of pitch sequences of all the ballroom dance melodies.

Part one presented a mambo feature indicating melodic fifths ({7}), while also noting that it would likely be unreliable. While the interval sequences found do show some ±{7} intervals, their occurrence is indeed too low to be reliable.

Merengue This genre stands out by repeating notes more than the other genres: sequences of two and three repeating notes ({0}, {0, 0}) score high on both occurrence and correlation. These repeating notes are often accompanied by movement in major thirds ({4} or {-4}), and the major thirds also seem to stand out on their own. The perfect fifth ({7}) also shows up in both tables, which together with the major thirds might be indicative of movement over major chords. Finally, there is a complete lack of downward chromatic motion ({-1,-1}), indicating that the melodic lines in merengue are likely to move in a more tonal way: minor and major scales have no consecutive minor seconds.

Rumba As with mambo, occurrence and correlation are rather low for rumba. Nevertheless, a large amount of movement in ascending minor thirds ({3}) can be observed, as well as patterns indicative of modulation in minor seconds ({-4, 3}). This is in accordance with the results found in part one, which indicated movement in melodic thirds ({3} and {4}). Because occurrence is so low for all sequences, absence of other characteristics is likely.

Salsa The correlation-sorted results for salsa score too low to be really reliable. Some sequences nevertheless show interesting movement: {-4, -3, -4} is a downward movement over a major seventh chord, and {-4, 2, 5} moves over a major add two chord. Incorporating the occurrence-sorted results, a strong indication of movement in minor thirds ({-3}) can be observed. Besides this movement, there is a sequence of length three with quite a high correlation ({-1, -4, 2}), which moves over a piece of the standard major or minor scale.

Opposing these quite strong characteristics are other frequently occurring sequences which are more difficult to explain in terms of logical melodic movement ({-2, 5}, {1, -3}). This could again be an indication that salsa is not so much characterized by specific pitch sequences.


Tango Of all ballroom dances, tango stands out for its extensive use of both ascending and descending minor seconds ({1}, {-1}) indicative of chromatic motion, a feature also found in part one of the research. Besides, almost all sequences use only movement in (minor and major) seconds, showing that tango lacks large melodic jumps.

When played, most of the sequences sound like they are in a minor key and have a strong tendency to lead to a certain note. Especially the sequence {-2, -2, -1} stands out as a characteristic tango melody, with a downward motion over a minor scale, over which the patterns {-1, -2, 7} and {-4, 0, -1} are also moving. This might be an indication that tango melodies, at least the ones present in this corpus, stand out by being written in minor keys.

Another result presented in part one was the absence of arpeggiation. As arpeggiation is movement back and forth between two notes, it can be represented by the two interval sequences {n, −n} and {−n, n} with n any number between 0 and 12. Out of both tables, only three sequences incorporate {-1, 1}. It is thus plausible that very little arpeggiation occurs in tango songs.
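The representation of arpeggiation as {n, −n} and {−n, n} can be checked directly on an interval series, as in the following minimal Python sketch. The function name `count_arpeggiation` is hypothetical, and excluding n = 0 (which would be a plain note repetition) is an assumption made for this illustration.

```python
def count_arpeggiation(intervals, max_n=12):
    """Count adjacent interval pairs of the form {n, -n} or {-n, n}."""
    return sum(
        1
        for first, second in zip(intervals, intervals[1:])
        # n = 0 is skipped: {0, 0} is note repetition, not back-and-forth motion.
        if first != 0 and abs(first) <= max_n and second == -first
    )


# Hypothetical interval series: two back-and-forth pairs, then other motion.
print(count_arpeggiation([-2, 2, -2, 1, 0, 0]))
```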

6 Conclusion

In this research, both global features and a selected set of small melodic sequences have been examined in order to set up a basis for understanding musical genre through computational means. PCA proved useful for visualizing the data and detecting outliers, but it could not be used for actual feature selection. A measure for determining the amount of genre clustering and mix-up between genres was then devised using unsupervised k-means clustering, which could be used to verify the correlation between genre and features.

Part two used correlation between genres and interval sequences, together with the occurrence of these sequences among songs, to find genre-specific melodic structures.

Both the global features and small melodic sequences proved to be useful in the characterization of a diverse range of musical genres. Some genres, like merengue and mambo in part 1, showed much overlap with each other and seemed to be difficult to separate or characterize. This was to be expected, however, as some genres are more likely to be characterized by other features (such as meta information).

The extensive amounts of data provided by both parts of the research require careful further analysis by musicologists or other researchers with a decent musical background. The small amount of analysis performed in this report nevertheless served to show that the results have an understandable and clearly interpretable meaning, which eventually might be used to expand the understanding of musical genres and define them in a more formal way.

Furthermore, as shown in the interpretation of part two, many of the global features related to melody found in part one could be seen again in part two on a more detailed level. This shows that global features can be used to explore interesting characteristics, while methods such as interval sequences can then be used to refine the global features found. The methods presented can thus be used in future work for exploring and refining the global features related to rhythm.

7 Discussion

Because the approach taken in this research was based on exploratory means of finding the characteristic features, there is much room for discussing the methods used and alternative approaches. This section provides an overview of the alternatives for each part of the research as well as some general remarks, on which future work might be based.


7.1 Part 1

Inspecting PCA components In the principal component analysis presented in the first part of the research, the principal component axes are made up of a combination of all features. It might be interesting to see which features are the most important within these axes and to see what happens to the position of the clusters when important features are removed. Though the contribution of the individual features to the PC axes was shown to be small, inspecting a combination of important features might help in getting a better understanding of characteristic features within one corpus.

7.2 Part 2

The use of counts of melodic interval sequences as features yields understandable information which, as was shown, can be used to describe a genre in more detail. There are however more possibilities for extracting features while making sure that these features remain clearly interpretable.

Conditional chance Incorporating theory from the area of language processing, the individual pitch distances can be seen as an alphabet and sequences of these can be seen as n-grams. Because the sequences have been counted, it is now possible to create an n-gram melodic sequence model from the data and gather data about conditional interval chance. Combining this knowledge with the results obtained in this research, one can, within a genre, further inspect how high-scoring sequences are likely to continue.

These chance features might also be used in classification, but presumably only 2-grams can be used in this way, because a single melodic line lacks enough information to give statistically accurate results for higher-order n-grams.
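The conditional interval chance described above could be estimated as in the following Python sketch of a 2-gram model. It uses plain relative frequencies without smoothing, which is an assumption; the function name and the two example interval series are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_interval_probabilities(interval_series):
    """Estimate P(next interval | current interval) from several interval series."""
    pair_counts = defaultdict(Counter)
    for intervals in interval_series:
        for current, following in zip(intervals, intervals[1:]):
            pair_counts[current][following] += 1
    model = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        # Relative frequency of each continuation (no smoothing applied).
        model[current] = {nxt: count / total for nxt, count in followers.items()}
    return model


# Two hypothetical melodies of one genre, given as interval series.
model = conditional_interval_probabilities([[-2, 2, -2, 2, 0], [-2, 2, 1]])
print(model[-2])  # continuations observed after a descending major second
print(model[2])   # continuations observed after an ascending major second
```

In this toy data a descending major second is always followed by an ascending one, while an ascending major second continues in three different ways with equal estimated chance.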

The second moment In cryptography the second moment of the frequency distribution of letters in a piece of enciphered information can be used to determine in which language it was likely written. As above, the individual pitch distances can be seen as the alphabet (in this research the alphabet incorporates all pitch distances from {-12} up to {12}, but different values can be used). The second moment of the multinomially distributed intervals can now be calculated using equation 1. Here f_i is the frequency of the pitch interval {i} and N is the sum of the frequencies of all intervals.

$$S_2 = \frac{\sum_{i=-12}^{12} f_i (f_i - 1)}{N (N - 1)} \qquad (1)$$

The outcome of this formula describes the way pitches are distributed within melodies and thus within genres. When the pitches are distributed more equally, one expects to find a lower value for S2, and when pitches are distributed unevenly, for instance in pentatonic scales where only 5 out of 12 pitches are frequently used, one expects to find a high value for S2. Such a measure might thus be indicative of the amount of tonality and the number of notes in a scale. It could also be used for classification purposes, where the initial second moments for the genres should be calculated using a large training corpus.
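Equation 1 translates into a few lines of Python. The sketch below assumes that intervals outside the alphabet are simply ignored, which the text does not specify; the function name `second_moment` is hypothetical.

```python
from collections import Counter


def second_moment(intervals, low=-12, high=12):
    """Second moment S2 of the interval frequency distribution (equation 1)."""
    # Assumption: intervals outside the alphabet [low, high] are ignored.
    frequencies = Counter(i for i in intervals if low <= i <= high)
    n = sum(frequencies.values())
    if n < 2:
        raise ValueError("at least two intervals are needed")
    return sum(f * (f - 1) for f in frequencies.values()) / (n * (n - 1))


print(second_moment([1, 2, 3, 4]))    # every interval occurs once
print(second_moment([2, 2, -2, -2]))  # two intervals, used equally often
print(second_moment([2, 2, 2, 2]))    # a single interval only
```

The three calls illustrate the behaviour described above: the value grows from 0 for a completely even distribution towards 1 as the intervals concentrate on fewer values.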

Multiple normal distributions Though certain melodic sequences can be seen as genre characteristics because they occur more often in that particular genre, it is not likely that the other genres completely lack them. It is more likely that each particular melodic sequence count (each feature) has a different mean and standard deviation for each genre. Though much data is needed for accurate calculation, these figures would give a much clearer view of the numbers of melodic sequences in each genre. Furthermore, because these distributions also give the chance of a melody belonging to a genre given its melodic sequence counts, they might also be used for classification.
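A minimal Python sketch of this idea for a single sequence count is given below. It assumes equal prior chances for all genres and a small lower bound on the standard deviation to avoid division by zero; the function names and the example counts are hypothetical and do not come from the corpus.

```python
from math import log, pi
from statistics import mean, pstdev


def fit_genre_gaussians(counts_by_genre, min_std=1e-3):
    """Fit one (mean, standard deviation) pair per genre for one sequence count."""
    return {
        genre: (mean(counts), max(pstdev(counts), min_std))
        for genre, counts in counts_by_genre.items()
    }


def log_likelihood(count, mu, sigma):
    """Log density of a normal distribution with mean mu and deviation sigma."""
    return -0.5 * log(2 * pi * sigma ** 2) - (count - mu) ** 2 / (2 * sigma ** 2)


def most_likely_genre(count, params):
    """Genre whose fitted distribution gives the observed count the highest density."""
    return max(params, key=lambda genre: log_likelihood(count, *params[genre]))


# Hypothetical counts of one sequence in four melodies per genre.
params = fit_genre_gaussians({"tango": [5, 6, 7, 6], "merengue": [0, 1, 0, 1]})
print(most_likely_genre(5, params), most_likely_genre(1, params))
```

For several sequences at once, the log-likelihoods could be summed per genre, which additionally assumes that the counts are independent of each other.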


7.3 General

One Versus One Correlation In both parts of the research, the classes used for calculating the correlation of the features with a particular genre were based on a one-versus-all genre grouping. The songs in the measured genre were assigned to class 1 and all the other songs to class 0. While this serves to find strong characteristics that distinguish one genre from all other genres, other interesting genre differences might arise when the correlation is calculated between one genre and just one other genre, or between one genre and some but not all of the other genres.

Exploring alternative musical aspects The methods used in the second part of the research could easily be used to evaluate other feature sets derived from different musical aspects. One very obvious option is using counts of different sequences of note lengths or even complete rhythmic patterns of melodic lines. Another possibility would be using chord sequences, which might be extracted by combining pitch information from multiple voices. On a somewhat larger scale, one could try to base counts or features on the overall structure of the song, a verse-chorus structure for example. This might also provide features describing the 'compressibility' of a song: in popular music there is much repetition, so the repeating parts can be compressed, while a modern classical piece is often incompressible because it lacks repeating parts.

References

[1] Grove Music Online. http://www.oxfordmusiconline.com, June 2014.

[2] Z. Cataltepe, Y. Yaslan, and A. Sonmez. Music genre classification using MIDI and audio features. EURASIP Journal on Applied Signal Processing, 1:150–150, January 2007.

[3] W. Chai and B. Vercoe. Folk music classification using hidden Markov models. In Proc. of International Conference on Artificial Intelligence, 2001.

[4] E. Chew, A. Volk, and C.-Y. Lee. Dance music classification using inner metric analysis. In B. Golden, S. Raghavan, and E. Wasil, editors, The Next Wave in Computing, Optimization, and Decision Technologies, volume 29 of Operations Research/Computer Science Interfaces Series, pages 355–370. Springer US, 2005.

[5] D. Conklin. Melodic analysis with segment classes. Machine Learning, 65:349–360, 2006.

[6] A. Honingh, T. Weyde, and D. Conklin. Sequential association rules in atonal music. In Proc. of Mathematics and Computation in Music (MCM2009), 2009.

[7] C. McKay. Automatic genre classification of MIDI recordings. Master's thesis, McGill University, Montreal, 2004.

[8] C. McKay and I. Fujinaga. Automatic genre classification using large high-level musical feature sets. In Int. Conf. on Music Information Retrieval, ISMIR, pages 525–530, 2004.

[9] D. Rizo, P. J. Ponce de León, C. Pérez-Sancho, A. Pertusa, and J. M. Iñesta. A pattern recognition approach for melody track selection in MIDI files. In Int. Conf. on Music Information Retrieval, ISMIR, pages 61–66, 2006.

[10] X. Shao, C. Xu, and M. S. Kankanhalli. Unsupervised classification of music genre using hidden Markov model. In IEEE International Conference on Multimedia and Expo, pages 2023–2026, 2004.

[11] B. L. Sturm. Classification accuracy is not enough - on the evaluation of music genre recognition systems. J. Intell. Inf. Syst., 41(3):371–406, 2013.


A JSymbolic Features

Listing 1: JSymbolic Features

#: Feature: | #: Feature:
1 Duration | 53 Number of Unpitched Instruments
2 Acoustic Guitar Fraction | 54 Orchestral Strings Fraction
3 Amount of Arpeggiation | 55 Overall Dynamic Range
4 Average Melodic Interval | 56 Percussion Prevalence
5 Average Note Duration | 57 Pitch Class Variety
6 Average Note To Note Dynamics Change | 58 Pitch Variety
7 Average Number of Independent Voices | 59 Polyrhythms
8 Average Range of Glissandos | 60 Primary Register
9 Average Time Between Attacks | 61 Quality
10 Average Time Between Attacks For Each Voice | 62 Quintuple Meter
11 Average Variability of Time Between Attacks For Each Voice | 63 Range
12 Brass Fraction | 64 Range of Highest Line
13 Changes of Meter | 65 Relative Note Density of Highest Line
14 Chromatic Motion | 66 Relative Range of Loudest Voice
15 Combined Strength of Two Strongest Rhythmic Pulses | 67 Relative Strength of Most Common Intervals
16 Compound Or Simple Meter | 68 Relative Strength of Top Pitch Classes
17 Direction of Motion | 69 Relative Strength of Top Pitches
18 Distance Between Most Common Melodic Intervals | 70 Repeated Notes
19 Dominant Spread | 71 Rhythmic Looseness
20 Duration of Melodic Arcs | 72 Rhythmic Variability
21 Electric Guitar Fraction | 73 Saxophone Fraction
22 Electric Instrument Fraction | 74 Second Strongest Rhythmic Pulse
23 Glissando Prevalence | 75 Size of Melodic Arcs
24 Harmonicity of Two Strongest Rhythmic Pulses | 76 Staccato Incidence
25 Importance of Bass Register | 77 Stepwise Motion
26 Importance of High Register | 78 Strength of Second Strongest Rhythmic Pulse
27 Importance of Loudest Voice | 79 Strength of Strongest Rhythmic Pulse
28 Importance of Middle Register | 80 Strength Ratio of Two Strongest Rhythmic Pulses
29 Initial Tempo | 81 String Ensemble Fraction
30 Interval Between Strongest Pitch Classes | 82 String Keyboard Fraction
31 Interval Between Strongest Pitches | 83 Strongest Rhythmic Pulse
32 Maximum Note Duration | 84 Strong Tonal Centres
33 Maximum Number of Independent Voices | 85 Triple Meter
34 Melodic Fifths | 86 Variability of Note Duration
35 Melodic Intervals in Lowest Line | 87 Variability of Note Prevalence of Pitched Instruments
36 Melodic Octaves | 88 Variability of Note Prevalence of Unpitched Instruments
37 Melodic Thirds | 89 Variability of Number of Independent Voices
38 Melodic Tritones | 90 Variability of Time Between Attacks
39 Minimum Note Duration | 91 Variation of Dynamics
40 Most Common Melodic Interval | 92 Variation of Dynamics In Each Voice
41 Most Common Melodic Interval Prevalence | 93 Vibrato Prevalence
42 Most Common Pitch Class | 94 Violin Fraction
43 Most Common Pitch Class Prevalence | 95 Voice Equality - Dynamics
44 Most Common Pitch | 96 Voice Equality - Melodic Leaps
45 Most Common Pitch Prevalence | 97 Voice Equality - Note Duration
46 Note Density | 98 Voice Equality - Number of Notes
47 Number of Common Melodic Intervals | 99 Voice Equality - Range
48 Number of Common Pitches | 100 Voice Separation
49 Number of Moderate Pulses | 101 Woodwinds Fraction
50 Number of Pitched Instruments |
51 Number of Relatively Strong Pulses |
52 Number of Strong Pulses |


B Results Part 1

Figure 8: The 'Popular' sublevel of the BOD corpus in PC space (axes: Principal Component 1 and Principal Component 2; genres: Hardcore Rap, Punk, Trad. Country).

Figure 9: The 'Jazz' sublevel of the BOD corpus in PC space (axes: Principal Component 1 and Principal Component 2; genres: Bebop, Jazz Soul, Swing).

Figure: PC space plot (axes: Principal Component 1 and Principal Component 2; genres: Baroque, Modern Classical, Romantic).


Feature rpb

40. Most Common Melodic Interval 0.491

5. Average Note Duration -0.364

47. Number of Common Melodic Intervals 0.355

12. Brass Fraction 0.348

79. Strength of Strongest Rhythmic Pulse -0.342

15. Combined Strength of Two Strongest Rhythmic Pulses -0.334

78. Strength of Second Strongest Rhythmic Pulse -0.319

34. Melodic Fifths 0.315

9. Average Time Between Attacks -0.315

7. Average Number of Independent Voices -0.308

Table 12: Characteristic features for Mambo in BAL together with the clustering scores.

Feature rpb

98. Voice Equality - Number of Notes 0.720

51. Number of Relatively Strong Pulses 0.642

46. Note Density 0.628

88. Variability of Note Prevalence of Unpitched Instruments 0.564

5. Average Note Duration -0.559

43. Most Common Pitch Class Prevalence 0.535

87. Variability of Note Prevalence of Pitched Instruments 0.522

48. Number of Common Pitches 0.520

2. Acoustic Guitar Fraction -0.518

12. Brass Fraction 0.513

Genre F1

Bossa Nova 0.707

Mambo 0.465

Merengue 0.759

Rumba 0.245

Tango 0.594

Table 13: Characteristic features for Merengue in BAL together with the clustering scores.

Feature rpb

22. Electric Instrument Fraction 0.457

28. Importance of Middle Register 0.419

37. Melodic Thirds 0.373

26. Importance of High Register -0.350

51. Number of Relatively Strong Pulses -0.347

21. Electric Guitar Fraction 0.343

27. Importance of Loudest Voice 0.292

20. Duration of Melodic Arcs 0.286

6. Average Note To Note Dynamics Change 0.275

41. Most Common Melodic Interval Prevalence -0.273


Feature rpb

56. Percussion Prevalence -0.896

81. String Ensemble Fraction 0.696

14. Chromatic Motion 0.629

58. Pitch Variety 0.597

53. Number of Unpitched Instruments -0.586

59. Polyrhythms 0.573

88. Variability of Note Prevalence of Unpitched Instruments -0.564

3. Amount of Arpeggiation -0.537

99. Voice Equality - Range 0.516

54. Orchestral Strings Fraction 0.501

Table 15: Characteristic features for Tango in BAL together with the clustering scores.

Feature rpb

48. Nr. of common pitches 0.706

58. Pitch variety -0.694

22. Electric instrument fraction 0.672

57. Pitch class variety -0.625

43. Most common pitch class prevalence 0.609

21. Electric guitar fraction 0.604

45. Most common pitch prevalence 0.581

60. Primary Register -0.486

64. Range of highest line -0.476

3. Amount of arpeggiation 0.476

Genre F1

Popular 0.819

Jazz 0.645

Classical 0.000

Table 16: Characteristic features for Popular in the BOD top level with the clustering scores.

Feature rpb

53. Number of unpitched instruments 0.615

72. Rhythmic variability -0.580

73. Saxophone fraction 0.567

12. Brass fraction 0.564

52. Number of strong pulses -0.542

78. Strength of second strongest rhythmic pulse -0.537

38. Melodic tritones 0.536

15. Combined strength of two strongest rhythmic pulses -0.500

58. Pitch variety 0.499

29. Initial tempo 0.439

Genre F1

Popular 0.482

Jazz 0.809

Classical 0.364

Table 17: Characteristic features for Jazz in the BOD top level with the clustering scores.

Feature rpb

56. Percussion prevalence -0.799

53. Number of unpitched instruments -0.732

88. Variability of note prevalence of unpitched instruments -0.720

60. Primary register 0.612

26. Importance of high register 0.527

46. Note Density -0.506

25. Importance of bass register -0.486

9. Average time between attacks 0.446

82. String keyboard fraction 0.437

22. Electric instrument fraction -0.433

Genre F1

Popular 0.556

Jazz 0.696

Classical 0.930


Feature rpb

1. Duration 0.718

46. Note Density -0.657

56. Percussion Prevalence 0.647

77. Stepwise motion 0.629

14. Chromatic motion 0.569

7. Average number of independent voices -0.528

21. Electric guitar fraction -0.513

84. Strong tonal centres 0.512

26. Importance of high register 0.507

Genre F1

Hardcore Rap 0.886

Punk 0.755

Traditional Country 0.545

Table 19: Characteristic features for Hardcore Rap in the BOD popular subgenre with the clustering scores.

Feature rpb

22. Electric instrument fraction 0.808

21. Electric guitar fraction 0.749

3. Amount of arpeggiation 0.724

18. Distance between most common melodic intervals 0.705

70. Repeated notes 0.686

37. Melodic thirds -0.633

55. Overall dynamic range -0.607

46. Note density 0.584

1. Duration -0.575

25. Importance of bass register 0.570

Genre F1

Hardcore Rap 0.693

Punk 0.983

Table 20: Characteristic features for Punk in the BOD popular subgenre with the clustering scores.

Feature rpb

2. Acoustic guitar fraction 0.826

58. Pitch variety 0.722

41. Most common melodic interval prevalence -0.706

57. Pitch class variety 0.658

28. Importance of middle register 0.638

37. Melodic thirds 0.621

33. Maximum number of independent voices 0.596

7. Average number of independent voices 0.561

70. Repeated notes 0.559

Genre F1

Hardcore Rap 0.642

Punk 0.815

Table 21: Characteristic features for Traditional Country in the BOD popular subgenre with the clustering scores.


Feature rpb

29. Initial tempo 0.600

12. Brass fraction -0.569

76. Staccato Incidence 0.552

50. Number of pitched instruments -0.520

73. Saxophone fraction -0.520

33. Maximum number of independent voices -0.492

90. Variability of time between attacks -0.485

95. Voice Equality - Dynamics 0.476

10. Average time between attacks for each voice -0.461

89. Variability of number of independent voices -0.454

Genre F1

Bebop 0.785

Jazz Soul 0.651

Swing 0.910

Table 22: Characteristic features for Bebop in the BOD jazz subgenre with the clustering scores.

Feature rpb

43. Most common pitch class prevalence 0.533

68. Relative strength of top pitch classes -0.487

52. Number of strong pulses 0.400

64. Range of highest line 0.398

55. Overall dynamic range -0.384

89. Variability of number of independent voices -0.377

7. Average number of independent voices -0.368

29. Initial tempo -0.367

26. Importance of high register 0.360

Genre F1

Bebop 0.687

Jazz Soul 0.702

Swing 0.852

Table 23: Characteristic features for Jazz Soul in the BOD jazz subgenre with the clustering scores.

Feature rpb

12. Brass fraction 0.874

73. Saxophone fraction 0.856

33. Maximum number of independent voices 0.844

89. Variability of number of independent voices 0.831

7. Average number of independent voices 0.812

50. Number of pitched instruments 0.609

5. Average note duration 0.508

86. Variability of note duration 0.503

Genre F1

Bebop 0.696

Jazz Soul 0.525

Swing 0.922


Feature rpb

63. Range -0.659

58. Pitch variety -0.599

19. Dominant spread 0.535

38. Melodic tritones -0.514

41. Most common melodic interval prevalence 0.496

92. Variation of dynamics in each voice 0.483

57. Pitch class variety -0.462

55. Overall dynamic range -0.445

64. Range of highest line -0.391

Genre F1

Baroque 0.800

Romantic 0.521

Table 25: Characteristic features for Baroque in the BOD classical subgenre with the clustering scores.

Feature rpb

13. Change of meter 0.538

24. Harmonicity of two strongest rhythmic pulses 0.435

19. Combinations strength of two strongest rhythmic pulses -0.329

57. Pitch class variety 0.323

63. Range 0.291

86. Variability of note duration 0.288

94. Violin fraction -0.287

62. Quintuple meter 0.284

54. Orchestral strings fraction -0.270

43. Most common pitch class prevalence -0.269

Genre F1

Baroque 0.710

Romantic 0.419

Table 26: Characteristic features for Modern Classical in the BOD classical subgenre with the clustering scores.

Feature rpb

1. Duration 0.442

72. Rhythmic variability -0.433

51. Number of relatively strong pulses 0.425

98. Voice equality - number of notes 0.407

15. Combined strength of two strongest rhythmic pulses -0.399

63. Range 0.390

37. Melodic thirds 0.388

71. Rhythmic looseness 0.383

58. Pitch variety 0.379

78. Strength of second strongest rhythmic pulse -0.379

Genre F1

Baroque 0.630

Romantic 0.607

Table 27: Characteristic features for Romantic in the BOD classical subgenre with the clustering scores.


C Results Part 2

Sequence rpb Occurrence

{7,3} 0.408 23%

{-5,5,4} 0.393 23%

{-3,12} 0.391 23%

{4,-9} 0.383 23%

{3,-12} 0.372 30%

{3,2,4} 0.366 15%

{-12,-2} 0.366 15%

{-12,-5} 0.366 15%

{0,-4,-3} 0.357 15%

{-7,-2,-1} 0.348 15%

Table 28: Top 10 characteristic melodic sequences for mambo in BAL sorted on correlation value.

Sequence rpb Occurrence

{3, 2} 0.245 54%

{-2, -2, 0} 0.311 46%

{-3, 0, 3} 0.235 38%

{3, -12} 0.372 31%

{7, 3} 0.408 23%

{-5, 5, 4} 0.393 23%

{-3, 12} 0.391 23%

{4, -9} 0.383 23%

{4, -12} 0.348 23%

{-7, 12} 0.306 23%

Table 29: Top 10 characteristic melodic sequences for mambo in BAL sorted on highest occurrence.

Sequence rpb Occurrence

{0} 0.498 100%

{0,4} 0.462 73%

{-4,0} 0.451 66%

{-4} 0.450 93%

{0,0} 0.432 87%

{-4,0,4} 0.426 40%

{0,-4} 0.419 46%

{-3,-4,0} 0.412 26%

{-4,7} 0.411 26%

{7} 0.399 93%

Table 30: Top 10 characteristic melodic sequences for merengue in BAL sorted on correlation value.

Sequence rpb Occurrence

{-1, -1} -0.222 0%

{0} 0.498 100%

{4} 0.389 100%

{-5} 0.327 100%

{-4} 0.451 93%

{7} 0.399 93%

{5} 0.233 93%

{0, 0} 0.432 87%

{-2, 0} 0.354 87%

{-7} 0.399 80%

Table 31: Top 10 characteristic melodic sequences for merengue in BAL sorted on highest occurrence.

Sequence rpb Occurrence

{-4,3,-4} 0.431 22%

{3,-2,-5} 0.415 33%

{4,5,4} 0.371 17%

{-5,7,1} 0.371 17%

{3,3,-2} 0.371 17%

{3,1,-5} 0.371 17%

{3,-4,3} 0.363 22%

{-3,5,3} 0.363 22%

{-4,3} 0.356 33%

{3,4,3} 0.351 28%

Table 32: Top 10 characteristic melodic sequences for rumba in BAL sorted on correlation value.

Sequence rpb Occurrence

{2, 1, -1} 0.242 61%

{1, 4} 0.231 44%

{3, -2, -5} 0.415 33%

{-4, 3} 0.356 33%

{3, 4, 3} 0.351 28%

{-1, 5, 0} 0.347 28%

{-5, -7} 0.324 28%

{5, -2, 2} 0.261 28%

{7, -7} 0.238 28%

{3, 0, 2} 0.261 28%

Table 33: Top 10 characteristic melodic sequences for rumba in BAL sorted on highest occurrence.


Sequence rpb Occurrence

{-5,-2,5} 0.482 25%

{-2,1,3} 0.482 25%

{-5,-1,-2} 0.457 25%

{-4,-3,-4} 0.457 25%

{-4,2,5} 0.441 38%

{-4,-8} 0.426 25%

{-3,-3,-2} 0.425 25%

{3,-1,-4} 0.420 38%

{5,1} 0.399 25%

{5,-3} 0.398 50%

Table 34: Top 10 characteristic melodic sequences for salsa in BAL sorted on correlation value.

Sequence rpb Occurrence

{-3} 0.259 100%

{-2, 5} 0.315 75%

{3, -3} 0.280 75%

{5, -1} 0.260 63%

{-2, 4} 0.248 63%

{1, -3} 0.237 63%

{-3, -2} 0.215 63%

{5, -3} 0.398 50%

{-1, -4, 2} 0.372 50%

{-3, 2} 0.335 50%

Table 35: Top 10 characteristic melodic sequences for salsa in BAL sorted on highest occurrence.

Sequence rpb Occurrence

{0,1,1} 0.475 33%

{-1,1,1} 0.450 40%

{-2,-1} 0.448 100%

{1,1,1} 0.439 66%

{-1,-2,7} 0.417 53%

{-1,3,-1} 0.415 20%

{-3,0,-1} 0.415 20%

{1,2,-4} 0.415 20%

{-4,0,-1} 0.411 27%

{-1,-1,1} 0.402 47%

Table 36: Top 10 characteristic melodic sequences for tango in BAL sorted on correlation value.

Sequence rpb Occurrence

{-2, -1} 0.448 100%

{-1, 1} 0.374 100%

{-1} 0.347 100%

{1} 0.344 100%

{0, -1} 0.217 100%

{2, 1} 0.258 93%

{1, 2} 0.352 87%

{-2, -2, -1} 0.300 87%

{-1, 3} 0.243 80%

{1, 1} 0.359 73%

Table 37: Top 10 characteristic melodic sequences for tango in BAL sorted on highest occurrence.
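
To make the Sequence and Occurrence columns of Tables 28 to 37 concrete, the sketch below derives interval sequences from melodies and counts in how many songs a sequence appears. It rests on several assumptions that this appendix does not confirm: melodies are lists of MIDI pitch numbers, an interval is the signed semitone difference between consecutive notes, sequences have length 1 to 3 (the lengths seen in the tables), and Occurrence is the percentage of songs in a genre that contain the sequence at least once. All function names are hypothetical.

```python
def intervals(pitches):
    """Signed semitone steps between consecutive MIDI pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_sequences(pitches, max_len=3):
    """Set of all contiguous interval sequences of length 1..max_len."""
    steps = intervals(pitches)
    found = set()
    for length in range(1, max_len + 1):
        for start in range(len(steps) - length + 1):
            found.add(tuple(steps[start:start + length]))
    return found


def occurrence(sequence, melodies, max_len=3):
    """Percentage of melodies that contain the sequence at least once.

    Assumption: this matches the Occurrence column; the thesis may
    count occurrences differently.
    """
    if not melodies:
        return 0.0
    target = tuple(sequence)
    hits = sum(target in interval_sequences(m, max_len) for m in melodies)
    return 100.0 * hits / len(melodies)
```

For instance, the melody `[60, 64, 64, 60]` has the intervals `[4, 0, -4]`, so it contains the sequences {4}, {0}, {-4}, {4,0}, {0,-4} and {4,0,-4}. Storing the sequences of each song as a set means a sequence counts once per song regardless of how often it is repeated.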
