Online Activities for Music Information and Acoustics Education and Psychoacoustic Data Collection

International Conference on Music Information Retrieval (ISMIR)
September 14-18, 2008 in Philadelphia, PA
By: Travis M. Doll, Ray V. Migneco and Youngmoo E. Kim


Online collaborative activities[1] provide a powerful platform for collecting psychoacoustic data on the perception of audio and music from a very large number of subjects. Furthermore, these activities can be designed to simultaneously educate users about aspects of music information and acoustics, particularly younger students in grades K-12. We have created prototype interactive activities illustrating two sound and acoustics concepts: musical instrument timbre and the cocktail party problem[2] (sound source isolation within mixtures). These activities also provide a method of collecting perceptual data related to these problems with a range of parameter variation that is difficult to achieve for large subject populations using traditional psychoacoustic evaluation. We present preliminary data from a pilot study in which middle school students used the two activities, demonstrating their potential as a platform for education and data collection.

Timbre Game

Timbre Modifier:

  • Analyzes the timbre of real musical instruments using a time and frequency representation [3]
  • Interface lets players alter an instrument’s timbre by drawing amplitude curves and dragging harmonic weights
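
The two controls described above (an amplitude curve over time and a set of harmonic weights) can be illustrated with a small additive-synthesis sketch. This is only an assumption about how such controls might map to sound; the poster does not specify the synthesis method, and the names `synthesize` and `envelope_gain`, the sample rate, and the example values are all hypothetical.

```python
import math

def envelope_gain(envelope, t):
    """Piecewise-linear gain at time t from (time_sec, gain) breakpoints sorted by time."""
    if t <= envelope[0][0]:
        return envelope[0][1]
    for (t0, g0), (t1, g1) in zip(envelope, envelope[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return g1
            return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
    # Past the last breakpoint: hold the final gain.
    return envelope[-1][1]

def synthesize(f0, harmonic_weights, envelope, sample_rate=8000, duration=0.5):
    """Sum weighted harmonics of f0 and shape the result with the amplitude envelope.

    Assumed model (not taken from the poster): harmonic k has frequency
    (k + 1) * f0 and a fixed weight; the drawn curve scales the whole sum.
    """
    n = int(sample_rate * duration)
    samples = []
    for i in range(n):
        t = i / sample_rate
        partials = sum(
            w * math.sin(2 * math.pi * f0 * (k + 1) * t)
            for k, w in enumerate(harmonic_weights)
        )
        samples.append(envelope_gain(envelope, t) * partials)
    return samples

# Hypothetical player input: a fast attack with a slow decay, three harmonics.
drawn_curve = [(0.0, 0.0), (0.1, 1.0), (0.5, 0.0)]
tone = synthesize(220.0, [1.0, 0.5, 0.25], drawn_curve)
```

Changing the breakpoints alters the temporal shape of the tone, while changing the weights alters its spectral balance, which corresponds to the two kinds of modification the listening component evaluates.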

Timbre Listener:

  • Players listen to instruments created by other players and try to identify the source instrument
  • Evaluates players’ ability to identify instruments after amplitude and spectral modification

Cocktail Party Game

Room Creation:

  • Provides a visual representation of the acoustic environment, including the listener and source locations
  • Illustrates the effects of reverberation and interfering sounds on the sources in the room

Listening Room:

  • Players listen to a mixture of voices in a configuration created by other players and decide whether the speaker of interest is in the room
  • Evaluates a listener's ability to detect a known person's voice within a mixture of voices in a room
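
As a rough illustration of how a voice mixture at a chosen signal-to-interference level might be constructed, the sketch below scales the summed interfering voices relative to the target voice. It is a simplified assumption, not the game's implementation: it ignores reverberation and source positions, treats all interference and noise as one summed signal, and the names `mix_at_sinr` and `measured_sinr_db` are hypothetical.

```python
import math

def power(signal):
    """Mean squared amplitude of a signal given as a list of samples."""
    return sum(v * v for v in signal) / len(signal)

def measured_sinr_db(target, interference):
    """Ratio of target power to interference power, in decibels."""
    return 10 * math.log10(power(target) / power(interference))

def mix_at_sinr(target, interferers, target_sinr_db):
    """Return (mixture, scaled_interference) with the requested ratio in dB.

    All signals are assumed to have the same length and sample rate.
    """
    n = len(target)
    interference = [sum(sig[i] for sig in interferers) for i in range(n)]
    p_int = power(interference)
    if p_int == 0:
        raise ValueError("interference is silent; ratio is undefined")
    # Choose the scale so that power(target) / power(scaled) = 10^(dB / 10).
    scale = math.sqrt(power(target) / (p_int * 10 ** (target_sinr_db / 10)))
    scaled = [scale * v for v in interference]
    mixture = [a + b for a, b in zip(target, scaled)]
    return mixture, scaled

# Stand-in signals (sinusoids rather than recorded voices).
speaker = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
others = [
    [math.sin(2 * math.pi * 7 * i / 200) for i in range(200)],
    [0.3 * math.cos(2 * math.pi * 11 * i / 200) for i in range(200)],
]
mixture, scaled = mix_at_sinr(speaker, others, 6.0)
```

Lower values of `target_sinr_db` bury the speaker of interest further in the mixture, which is the dimension along which detection accuracy is reported.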


Figure 3: Players’ accuracy in identifying instruments and instrument families versus signal-to-noise ratio (SNR).

Figure 4: Players’ accuracy in detecting the presence of the speaker of interest versus signal-to-interference-plus-noise ratio (SINR).

Both games were tested at a music magnet school with 56 eighth-grade students under the following conditions:

  • Students worked individually using headphones
  • Students were given 20-minute sessions with each game component

The pilot study yielded the following:

  • Over 350 sounds were created for each game
  • Over 800 listening trials were obtained from each game
  • Detection accuracy generally improved with higher SNR for speech, but less so for musical instruments
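
An accuracy-versus-SNR summary of the kind reported here could be tabulated from listening trials roughly as follows. This is a minimal sketch under stated assumptions: the trial format (an SNR in dB paired with a correct/incorrect flag), the function name `accuracy_by_snr`, and the 5 dB bin width are hypothetical, and the example values are made up for illustration rather than taken from the study.

```python
import math
from collections import defaultdict

def accuracy_by_snr(trials, bin_width_db=5.0):
    """Group (snr_db, correct) trials into SNR bins and return accuracy per bin.

    Each bin is keyed by its lower edge in dB.
    """
    counts = defaultdict(lambda: [0, 0])  # lower edge -> [num_correct, num_trials]
    for snr_db, correct in trials:
        lower_edge = math.floor(snr_db / bin_width_db) * bin_width_db
        counts[lower_edge][0] += int(correct)
        counts[lower_edge][1] += 1
    return {edge: c / n for edge, (c, n) in sorted(counts.items())}

# Made-up illustrative trials, not data from the pilot study.
example_trials = [(-3.0, False), (-1.0, True), (2.0, True), (4.0, True)]
summary = accuracy_by_snr(example_trials)
```

Bins where accuracy departs from the expected upward trend are the cases that would merit closer inspection.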

Research Objectives

In these applications we seek to collect human evaluation data on perceptual thresholds on a large scale. In particular, with the Timbre Game we can:

  • Use the creation component to find the threshold up to which an instrument’s timbre can be altered while the instrument remains identifiable
  • Examine the results of the listening component to evaluate human ability to identify an instrument with modified timbre

Similarly, for the Cocktail Party Game, we can:

  • Obtain thresholds for speech intelligibility as the player adds more voices to the mixture in the creation component
  • Find thresholds of human ability to perceive whether a speaker is present in a mixture with interfering voices

With the data collected we hope to learn more about the features of audio most relevant to human auditory perception.

Future Work

In the future, we plan to improve these applications by:

  • Deploying our applications on the internet in order to collect data from a more diverse subject population
  • Allowing players to record their own sounds for immediate use in the games

Furthermore, we wish to pursue a detailed analysis of the acquired performance data for cases that deviate from the “difficulty” anticipated from their SNR.


[1] L. von Ahn, “Games with a purpose,” Computer, vol. 39, no. 6, pp. 92-94, 2006.
[2] L. J. Stifelman, “The cocktail party effect in auditory interfaces: a study of simultaneous presentation,” MIT Media Laboratory Technical Report, 1994.
[3] P. Iverson and C. L. Krumhansl, “Isolating the dynamic attributes of musical timbre,” Journal of the Acoustical Society of America, vol. 94, no. 5, pp. 2585-2603, 1993.