Efficient Acoustic Feature Computation Using FPGAs


    Many recent advances in music information retrieval (MIR) have been data-driven. Widespread performance evaluations on common data sets, like the annual MIREX events, have been instrumental in advancing the field. Such endeavors incur large computational costs and could benefit from faster calculation of acoustic features. Traditional cluster-based solutions are expensive and make inefficient use of both space and power. The massively parallel architecture of the field programmable gate array (FPGA) makes it possible to design lower-cost, application-specific chips rivaling cluster speed for large-scale acoustic feature computation. Such devices also show potential for implementations of MIR systems on embedded devices where hardware acceleration is a necessity.

    Design and Prototyping

      We have designed a prototype Xilinx System Generator (XSG) library for acoustic feature calculation to run on the XUPV2P board from Digilent, Inc. We chose the XUPV2P board because it can handle computationally intensive workloads despite its low cost (with academic pricing).

      XSG makes rapid prototyping of hardware designs possible even without extensive hardware description language (HDL) knowledge. Designs are created graphically, and XSG generates the HDL that runs on the hardware. An example XSG implementation of spectral flux is shown below:
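      As a software point of comparison for the feature named above, the following is a minimal NumPy sketch of spectral flux. It is not the XSG design. It assumes one common definition (the sum of squared differences between the magnitude spectra of successive frames), and the function name, Hann window, frame length, and hop size are illustrative choices rather than parameters taken from the prototype.

```python
import numpy as np


def spectral_flux(signal, frame_len=1024, hop=512):
    """Spectral flux between successive frames of a mono signal.

    Assumed definition: sum over frequency bins of the squared difference
    between consecutive magnitude spectra. Frame length, hop size, and the
    Hann window are illustrative defaults, not values from the XSG library.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    # Slice the signal into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # The FFT is the first step; only the magnitudes are needed for flux.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))

    # Frame-to-frame change in the spectrum, summed over all bins.
    diff = np.diff(magnitudes, axis=0)
    return np.sum(diff ** 2, axis=1)
```

      A stationary tone should give a flux near zero, while an onset (for example, silence followed by a tone) should give a clearly positive value at the frame where the spectrum changes.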

      Simulation Results

        XSG hardware co-simulation has shown acoustic features computed on the XUPV2P to be as accurate as those computed with M2K and MATLAB, with nearly negligible differences, when tested in an automated genre classification task. A single Fast Fourier Transform (FFT), the first step of all the acoustic feature computations attempted, runs over an order of magnitude faster than with either of the aforementioned tools.

        Toward a Hardware Prototype

          XSG is capable of exporting block diagrams, like the one shown above, into the Xilinx EDK to be implemented as part of an embedded system. We are working towards communicating with a host computer via TCP/IP over an Ethernet cable. The work in progress is intended to yield a system that receives audio data from MATLAB, computes acoustic features, and transmits the features back to MATLAB for analysis.
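          To make the host-side exchange concrete, here is a hedged sketch of how audio samples and feature vectors might be framed over a TCP socket. The wire format is entirely an assumption made for illustration (a 4-byte big-endian byte count followed by big-endian 32-bit floats), as are the helper names `send_array` and `recv_array`; the prototype's actual protocol is not specified here, and the planned host is MATLAB rather than Python.

```python
import socket
import struct

import numpy as np


def send_array(sock, values):
    """Send a 1-D array as a length-prefixed message.

    Hypothetical framing: 4-byte big-endian payload size in bytes,
    then the samples as big-endian float32.
    """
    payload = np.asarray(values, dtype=">f4").tobytes()
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def _recv_exact(sock, n_bytes):
    """Read exactly n_bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n_bytes:
        chunk = sock.recv(n_bytes - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_array(sock):
    """Receive one length-prefixed message and return it as native float32."""
    (n_bytes,) = struct.unpack(">I", _recv_exact(sock, 4))
    data = _recv_exact(sock, n_bytes)
    return np.frombuffer(data, dtype=">f4").astype(np.float32)
```

          Under these assumptions, a host would call `send_array` with a block of audio samples and then `recv_array` to collect the returned feature vector. The explicit length prefix matters because TCP is a byte stream and does not preserve message boundaries on its own.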


          • Schmidt, E. M., West, K., and Kim, Y. E. (2009). Efficient acoustic feature extraction for music information retrieval using programmable gate arrays. Proceedings of the 2009 International Society for Music Information Retrieval Conference, Kobe, Japan: ISMIR.