Erik Schmidt

eschmidt [at] pandora [dot] com

Education:


  • PhD Electrical Engineering, Drexel University 2012
  • MS Electrical Engineering, Drexel University 2009
  • BS Electrical Engineering, Temple University 2007


Bio:


I am currently a Senior Scientist on the playlist team at Pandora. Prior to Pandora, I was a Post-Doctoral Researcher in the Music and Entertainment Technology Laboratory (MET-lab) at Drexel University. My primary research interests are in machine learning and digital signal processing, and their application to music and audio. With the explosion of vast, easily accessible digital music libraries over the past decade, research into automated systems for searching and organizing music and related data has expanded rapidly. Especially for online digital music retailers, whose libraries reach well into the millions of songs, human-based organization and recommendation is no longer feasible, and powerful automated tools are necessary.

My PhD dissertation focused on Modeling and Predicting Emotion in Music. Because the medium of music has evolved specifically for the expression of emotions, the most natural organization of music is in terms of emotional associations. In my research I seek to develop automated tools that perform this task, driven entirely by computer-based audio analysis and machine learning. A very important consideration in this work is that musical emotion is inherently non-static: it evolves naturally over time in synchrony with the music. Because of this dynamic relationship, systems capable of predicting musical emotion must model not just the relationships between acoustic data and emotion parameters, but also how those relationships evolve over time. The ultimate outcome of this work could be automated tools that allow a user to search not just for music of a specific emotion, but also for music that moves between multiple specific emotions. Additionally, such a system would provide a model of the process by which human emotions are derived from an acoustic signal, which could offer new insight into how those emotions are formed.
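
To make this concrete, here is a minimal sketch of the kind of time-varying model described above: a simple linear map from acoustic features to valence-arousal estimates, smoothed with a random-walk Kalman filter. The array shapes, noise parameters, and the linear-Gaussian formulation are illustrative assumptions chosen for exposition, not the actual models developed in my dissertation.

    # Sketch: per-second valence-arousal tracking from acoustic features.
    # Assumptions (illustrative only): X is a (T, d) array of per-second
    # acoustic descriptors and Y is a (T, 2) array of human valence-arousal
    # annotations in [-1, 1]; this is a simplified stand-in for the
    # dynamical models discussed in the dissertation.
    import numpy as np

    def fit_linear_emission(X, Y):
        """Least-squares map from acoustic features to (valence, arousal)."""
        X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias term
        W, *_ = np.linalg.lstsq(X1, Y, rcond=None)     # (d + 1, 2) weights
        return W

    def kalman_smooth(Z, q=0.01, r=0.1):
        """Impose temporal smoothness on noisy per-frame (valence, arousal)
        estimates Z using a random-walk state model:
        x_t = x_{t-1} + w_t,  z_t = x_t + v_t."""
        T, k = Z.shape
        x = Z[0].copy()          # state estimate
        P = np.eye(k)            # state covariance
        Q = q * np.eye(k)        # process noise
        R = r * np.eye(k)        # observation noise
        out = np.zeros_like(Z)
        out[0] = x
        for t in range(1, T):
            P = P + Q                            # predict step
            K = P @ np.linalg.inv(P + R)         # Kalman gain
            x = x + K @ (Z[t] - x)               # update with observation
            P = (np.eye(k) - K) @ P
            out[t] = x
        return out

    # Usage: map features to frame-wise estimates, then smooth over time.
    # W = fit_linear_emission(X_train, Y_train)
    # Z = np.hstack([X_test, np.ones((X_test.shape[0], 1))]) @ W
    # va_track = kalman_smooth(Z)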

Before coming to Drexel, I received the BSEE degree from Temple University in 2007. During my undergraduate career I worked as a tutor in signal processing and at Aviom, Inc., a company specializing in audio networking solutions and embedded audio system design.


Research Interests:


  • Machine Learning
  • Digital Signal Processing
  • Music Information Retrieval


Research Projects:


    Music Emotion Recognition


    In developing automated systems to recognize the emotional content of music, we are faced with a problem spanning two disparate domains: the space of human emotions and the acoustic signal of music. To address this problem, we must develop models for both data collected from humans describing their perceptions of musical mood and quantitative features derived from the audio signal.
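
    As a small illustration of these two halves of the problem, the sketch below pairs standard spectral descriptors with per-clip valence-arousal ratings averaged over annotators. The librosa library, the annotations.csv layout, and the specific features are assumptions chosen for exposition, not necessarily the tools or features used in the MET-lab systems.

        # Sketch: pairing quantitative audio features with human mood labels.
        # Assumptions: annotations.csv has columns clip_path, valence, arousal
        # with one row per (clip, annotator) rating in [-1, 1]; librosa is
        # used here only to illustrate the audio-analysis side of the problem.
        import csv
        from collections import defaultdict

        import numpy as np
        import librosa

        def clip_features(path, sr=22050):
            """Summarize a clip with mean MFCC and spectral-contrast values."""
            y, sr = librosa.load(path, sr=sr, mono=True)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
            contrast = librosa.feature.spectral_contrast(y=y, sr=sr)
            return np.concatenate([mfcc.mean(axis=1), contrast.mean(axis=1)])

        def load_annotations(csv_path):
            """Average each clip's (valence, arousal) ratings over annotators."""
            ratings = defaultdict(list)
            with open(csv_path, newline="") as f:
                for row in csv.DictReader(f):
                    ratings[row["clip_path"]].append(
                        (float(row["valence"]), float(row["arousal"])))
            return {clip: np.mean(vals, axis=0) for clip, vals in ratings.items()}

        # Usage: build the paired dataset a regression model would train on.
        # labels = load_annotations("annotations.csv")
        # X = np.stack([clip_features(p) for p in labels])
        # Y = np.stack([labels[p] for p in labels])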

    Theses:


    • Schmidt, E. M. (2012). Modeling and Predicting Emotion in Music. Unpublished Ph.D. Thesis, Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA. [PDF]

    Publications:


      • Schmidt, E. M. and Kim, Y. E. (2013). Learning rhythm and melody features with deep belief networks. Proceedings of the 14th International Society for Music Information Retrieval Conference, Curitiba, Brazil: ISMIR. [PDF]

      • Prockup, M., Schmidt, E., Scott, J., and Kim, Y. (2013). Toward understanding expressive percussion through content-based analysis. Proceedings of the 14th International Society for Music Information Retrieval Conference, Curitiba, Brazil: ISMIR. [PDF]

      • Soleymani, M., Caro, M., Schmidt, E. M., Sha, C., and Yang, Y. H. (2013). 1000 songs for emotional analysis of music. Proceedings of the ACM Multimedia 2013 Workshop on Crowdsourcing for Multimedia. Barcelona, Catalunya, Spain. [PDF]

      • Schmidt, E. M., Prockup, M., Scott, J., Dolhansky, B., Morton, B. G., and Kim, Y. E. (2013). Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition. Computer Music Modeling and Retrieval. Music and Emotions.

      • Anglade, A., Humphrey, E., Schmidt, E., Stober, S., and Sordo, S. (2013). Demos and Late Breaking Session of the Thirteenth International Society for Music Information Retrieval Conference (ISMIR 2012). Computer Music Journal. [PDF]

      • Grunberg, D. K., Batula, A. M., Schmidt, E. M. and Kim, Y. E. (2012). Affective Gesturing with Music Mood Recognition. Proceedings of the International Conference on Humanoid Robotics. Osaka, Japan: Humanoids. [PDF]

      • Grunberg, D. K., Batula, A. M., Schmidt, E. M. and Kim, Y. E. (2012). Synthetic Emotions for Humanoids: Perceptual Effects of Size and Number of Robot Platforms. Journal of Synthetic Emotions: Special Issue on Music, Robots, and Emotion (invited paper). [PDF]

      • Schmidt, E. M., Scott, J., and Kim, Y. E. (2012). Feature Learning in Dynamic Environments: Modeling the Acoustic Structure of Musical Emotion. Proceedings of the 2012 International Society for Music Information Retrieval Conference, Porto, Portugal: ISMIR. [PDF]

      • Schmidt, E. M., Prockup, M., Scott, J., Dolhansky, B., Morton, B. and Kim, Y. E. (2012). Relating perceptual and feature space invariances in music emotion recognition. Proceedings of the International Symposium on Computer Music Modeling and Retrieval, London, U.K.: CMMR. Best Student Paper. [PDF] [Oral Presentation]

      • Scott, J., Schmidt, E. M., Prockup, M., Morton, B. and Kim, Y. E. (2012). Predicting time-varying musical emotion distributions from multi-track audio. Proceedings of the International Symposium on Computer Music Modeling and Retrieval, London, U.K.: CMMR. [PDF]

      • Batula, A. M., Morton, B. G., Migneco, R., Prockup, M., Schmidt, E. M., Grunberg, D. K., Kim, Y. E., and Fontecchio, A. K. (2012). Music Technology as an Introduction to STEM. Proceedings of the 2012 ASEE Annual Conference, San Antonio, Texas: ASEE. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2012). Modeling and Predicting Emotion in Music. Music, Mind, and Invention Workshop, Ewing, NJ: MMI. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2011). Modeling the acoustic structure of musical emotion with deep belief networks. NIPS Workshop on Music and Machine Learning, Sierra Nevada, Spain: NIPS-MML. [Oral Presentation]

      • Schmidt, E. M. and Kim, Y. E. (2011). Modeling musical emotion dynamics with conditional random fields. Proceedings of the 2011 International Society for Music Information Retrieval Conference, Miami, Florida: ISMIR. [PDF]

      • Speck, J. A., Schmidt, E. M., Morton, B. G., and Kim, Y. E. (2011). A comparative study of collaborative vs. traditional annotation methods. Proceedings of the 2011 International Society for Music Information Retrieval Conference, Miami, Florida: ISMIR. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2011). Learning emotion-based acoustic features with deep belief networks. Proceedings of the 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY: WASPAA. [PDF]

      • Schmidt, E. M., Migneco, R. V., Scott, J. J. and Kim, Y. E. (2011). Modeling instrument tones as dynamic textures. Proceedings of the 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY: WASPAA. [PDF]

      • Scott, J., Prockup, M., Schmidt, E. M., Kim, Y. E. (2011). Automatic Multi-Track Mixing Using Linear Dynamical Systems. Proceedings of the 8th Sound and Music Computing Conference, Padova, Italy: SMC. [PDF]

      • Kim, Y. E., Batula, A. M., Migneco, R., Richardson, P., Dolhansky, B., Grunberg, D., Morton, B., Prockup, M., Schmidt, E. M., and Scott, J. (2011). Teaching STEM concepts through music technology and DSP. Proceedings of the 14th IEEE Digital Signal Processing Workshop and 6th IEEE Signal Processing Education Workshop, Sedona, AZ: DSP/SPE. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2010). Prediction of time-varying musical mood distributions using Kalman filtering. Proceedings of the 2010 IEEE International Conference on Machine Learning and Applications, Washington, D.C.: ICMLA. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2010). Prediction of time-varying musical mood distributions from audio. Proceedings of the 2010 International Society for Music Information Retrieval Conference, Utrecht, Netherlands: ISMIR. [PDF]

      • Kim, Y. E., Schmidt, E. M., Migneco, R., Morton, B. G., Richardson, P., Scott, J., Speck, J. A. and Turnbull, D. (2010). Music emotion recognition: a state of the art review. Proceedings of the 2010 International Society for Music Information Retrieval Conference, Utrecht, Netherlands: ISMIR. [PDF]

      • Morton, B. G., Speck, J. A., Schmidt, E. M., and Kim, Y. E. (2010). Improving music emotion labeling using human computation. Proceedings of the ACM SIGKDD Workshop on Human Computation, Washington, D.C.: HCOMP. [PDF]

      • Schmidt, E. M., Turnbull, D., and Kim, Y. E. (2010). Feature selection for content-based, time-varying musical emotion regression. Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval, Philadelphia, PA. [PDF]

      • Schmidt, E. M., West, K., and Kim, Y. E. (2009). Efficient acoustic feature extraction for music information retrieval using programmable gate arrays. Proceedings of the 2009 International Society for Music Information Retrieval Conference, Kobe, Japan: ISMIR. [PDF]

      • Schmidt, E. M. and Kim, Y. E. (2009). Projection of acoustic features to continuous valence-arousal mood labels via regression. Accepted to the 2009 International Society for Music Information Retrieval Conference, Kobe, Japan: ISMIR. [PDF]

      • Kim, Y. E., Schmidt, E., and Emelle, L. (2008). MoodSwings: a collaborative game for music mood label collection. Proceedings of the 2008 International Conference on Music Information Retrieval, Philadelphia, PA: ISMIR. [PDF]