Gesture Recognition for Conducting Computer Music

Research Day Poster, April 22, 2008
By: David Grunberg
Advisor: Dr. Youngmoo Kim

Problem


Gestures are an intuitive and expressive method of communication. If computers could recognize them, many tasks could be simplified and made more efficient. We are particularly interested in using gestures to control computerized music as a conductor would conduct an orchestra. Currently, algorithms cannot reliably identify abstract gestures such as those used by orchestral conductors.


Algorithm


Gestures are made with a Wiimote, a wireless controller for the Nintendo Wii gaming system. The Wiimote contains digital accelerometers, which allow us to digitize and record the accelerations produced while gesturing. These accelerations are recorded by the computer, split into individual gestures by hand, and statistically modeled. The gestures we modeled included a variety of simple shapes, but we eventually focused on lines and semicircles, the basic vocabulary of conducting. We used data from multiple people to create more robust models.
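
As a rough illustration of this pipeline, the sketch below loads a recorded acceleration stream and slices it at hand-marked sample boundaries, mirroring the manual segmentation step. The file format, function names, and boundary indices are hypothetical; the poster does not specify how the recordings were stored.

    import numpy as np

    def load_acceleration_log(path):
        # Load a recorded Wiimote acceleration stream.
        # Hypothetical format: one sample per row, columns ax, ay, az.
        return np.loadtxt(path, delimiter=",")  # shape: (n_samples, 3)

    def segment_gestures(accel, boundaries):
        # Split a continuous acceleration stream into individual gestures.
        # `boundaries` is a list of (start, end) sample indices marked by hand,
        # mirroring the manual segmentation described above.
        return [accel[start:end] for start, end in boundaries]

    # Example: one recording containing two hand-marked gestures.
    stream = load_acceleration_log("session01.csv")  # hypothetical file
    gestures = segment_gestures(stream, [(120, 310), (400, 575)])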


Figure 1: One person’s set of south-north clockwise and north-south counterclockwise gestures.




Figure 2: A Wiimote with labeled axes.

Modeling Gestures


We chose Hidden Markov Models (HMMs) to model our gestures. HMMs represent an event as a series of observations produced by various time-ordered states. We use them here because:

  • HMMs model events in time.
  • HMMs model events probabilistically.
  • It is possible to determine the most likely model for a given observation set.

We chose to use 10-state HMMs to model our gestures.
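
A minimal training sketch in Python, assuming the hmmlearn toolkit (the poster does not name a specific HMM implementation): one 10-state Gaussian HMM is fit per gesture class from the segmented acceleration sequences. The data layout and label names are illustrative.

    import numpy as np
    from hmmlearn.hmm import GaussianHMM  # one possible HMM toolkit; not named in the poster

    def train_gesture_models(training_data, n_states=10):
        # Fit one HMM per gesture class.
        # `training_data` maps a gesture label (e.g. "line_right_left") to a list
        # of example gestures, each an (n_samples, 3) array of accelerations.
        models = {}
        for label, examples in training_data.items():
            X = np.vstack(examples)                 # stack all examples of this class
            lengths = [len(ex) for ex in examples]  # per-example sequence lengths
            hmm = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
            hmm.fit(X, lengths)
            models[label] = hmm
        return models

Fitting one model per class keeps classification simple: a new gesture is scored against every model and assigned to the most likely one, which is exactly the property listed above.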



Figure 3: An example 5-state HMM for the right-to-left line gesture.
Image credit: Camille Troillard, http://www.osculator.net/wiki/Main/FAQ

Results


Percent of test gestures correctly classified:

  • 1 person, 25 gestures: 88.8%.
  • 9 people, 16 gestures: 37.4%.

Future work will focus on improving accuracy by adjusting the number of states in the HMMs, reviewing our method for separating gestures and training models, and collecting data from additional users.
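
The percentages above could be produced by a scoring routine along these lines, continuing the hypothetical hmmlearn-based sketch from the Modeling Gestures section; the function and variable names are illustrative.

    def classify_gesture(models, gesture):
        # Label a gesture with the class whose HMM gives the highest log-likelihood.
        return max(models, key=lambda label: models[label].score(gesture))

    def percent_correct(models, test_set):
        # Percent of labeled test gestures classified correctly.
        # `test_set` is a list of (true_label, gesture_array) pairs.
        hits = sum(classify_gesture(models, g) == label for label, g in test_set)
        return 100.0 * hits / len(test_set)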

References


  • Lee, Christopher, and Yangsheng Xu. “Hidden Markov Models for Interactive Learning of Hand Gestures.” Carnegie Mellon University, 1996. http://www.cs.cmu.edu/afs/cs/project/space/www/hmm/hmm.html.
  • Rabiner, Lawrence R. “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.” Proceedings of the IEEE, Vol. 77, No. 2, February 1989.
  • Yang, Jie, and Yangsheng Xu. “Hidden Markov Model for Gesture Recognition.” The Robotics Institute, Carnegie Mellon University, May 1994.