Feature Invariance Samples

For each song, we render a piano reduction of the MIDI file for the 15-second clip, then compute MFCC and chroma features on the resulting audio. We then synthesize audio back from each feature representation: chromagrams are analyzed and resynthesized with Dan Ellis' chroma analysis/synthesis code, and MFCCs with his rastamat library. The MFCC reconstructions sound like a pitched noise source; the chroma reconstructions have an ethereal, 'warbly' quality but resemble the original audio more closely than the MFCC reconstructions do.



Don't Bother Me by The Beatles

  • MIDI version
  • Chroma Reconstruction
  • MFCC Reconstruction

Come Together by The Beatles

  • MIDI version
  • Chroma Reconstruction
  • MFCC Reconstruction

Relevant Work:


  • Schmidt, E. M., Prockup, M., Scott, J., Dolhansky, B., Morton, B. and Kim, Y. E. (2012). Relating perceptual and feature space invariances in music emotion recognition. Proceedings of the International Symposium on Computer Music Modeling and Retrieval, London, U.K.: CMMR. Best Student Paper.