Musical expression is the creative nuance through which a musician conveys emotion and connects with a listener. In this work, we present a system that seeks to classify expressive articulation techniques independent of the percussion instrument. One use of this system is to improve the organization of large percussion sample libraries, which can be cumbersome and daunting to navigate. This work is also a necessary first step toward understanding musical expression as it relates to percussion performance. The ability to classify expressive techniques can lead to models that learn the functionality of articulations in patterns, as well as how certain performers use them to communicate and define their musical style.
Figure 1. Articulations are performance techniques that arise from creative variations of dynamics and instrument timbre.
Time-Varying Features
In expressive performance, the evolution of timbre over time is an important component at both the micro and macro levels. This work investigates expression at the micro level by modeling the evolution of percussion articulations. Using the sequential evolution of features derived from the time-domain and frequency-domain components of the signal, a set of classifiers is trained to predict percussion articulations within subsets containing only individual drums (only snare, only rack tom, etc.) as well as within the superset of all drum samples.
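One frame-wise timbre descriptor of the kind described above is the spectral centroid trajectory. The sketch below is illustrative only; the paper's exact feature set is not specified here, and the function name and parameters are assumptions.

```python
import numpy as np

def spectral_centroid_trajectory(signal, sr, frame_len=1024, hop=512):
    """Frame-wise spectral centroid: a common time-varying timbre feature.

    Hypothetical sketch; not the paper's actual feature implementation.
    """
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroids = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        mag = np.abs(np.fft.rfft(frame))
        denom = mag.sum()
        centroids.append((freqs * mag).sum() / denom if denom > 0 else 0.0)
    return np.array(centroids)

# Example: a decaying 440 Hz tone as a stand-in for a drum hit.
sr = 22050
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t) * np.exp(-5 * t)
traj = spectral_centroid_trajectory(x, sr)
```

The resulting sequence of centroids, one per frame, is the kind of feature evolution the classifiers operate on.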
Figure 2. Articulations and 6th order polynomials fit to feature evolution.
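Fitting a 6th-order polynomial to a feature trajectory, as in Figure 2, can be sketched with NumPy. The trajectory values here are synthetic placeholders, not the paper's data; the point is that the seven polynomial coefficients summarize a variable-length trajectory as a fixed-length descriptor.

```python
import numpy as np

# Illustrative feature trajectory: a decaying curve with mild oscillation.
frames = np.linspace(0.0, 1.0, 40)          # normalized time axis
trajectory = np.exp(-3 * frames) * (1 + 0.2 * np.sin(12 * frames))

coeffs = np.polyfit(frames, trajectory, deg=6)   # 7 coefficients
reconstruction = np.polyval(coeffs, frames)

# The coefficient vector can then serve as classifier input features.
```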
Table 1. Accuracy of articulation classification on each drum individually. Results are shown for the top five performing features on each drum.
Table 2. Accuracy of articulation classification over the superset of all drum types for features individually and in aggregation.
Beyond articulation alone, the same experiments were performed across all expressive dimensions of the dataset: articulation, intensity, stick height, and strike position. Figure 3 shows results for classifying each expressive dimension across the superset of all drums.
Figure 3. Classification using the best feature evolution, MFCCs, and the aggregation of the two across each expressive dimension.
Figure 4 shows the classification of all expressive parameters simultaneously across all drums. The soft metric is a weighted accuracy that scores each expressive dimension individually (an example scores 75% if 3 of its 4 attributes are correct). The hard metric counts an example as correct only if all four attributes are predicted correctly.
Figure 4. Classification of all expressive parameters across all drums using the best feature evolution, MFCCs, and the aggregation of the two.