Ganesh Srinivas

The Query By Singing/Humming (QBSH) problem in Music Information Retrieval

The Query By Singing/Humming problem is the task of retrieving a piece of music from a melodic or rhythmic query, given in audio or symbolic format. A system takes the query, extracts features, compares them to those of the songs in a database and returns the correct song. A system which solves this problem would need to be robust to poor humming (wrong pitch, wrong note duration, wrong key) as well as background noise. The most common techniques used by existing solutions are hidden Markov models, melodic contour matching (using Dynamic Time Warping to align the pitch contour of the query with that of the target), n-grams, and note interval matching (treating melodies as strings and using dynamic programming to align two strings).
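To make two of these techniques concrete, here is a minimal sketch in Python. It assumes melodies are already transcribed into MIDI note numbers (real systems first have to extract pitch from audio, which is not shown). The function names and the toy melody are my own illustrations and are not taken from any particular QBSH system. Note intervals make the comparison independent of key, and Dynamic Time Warping tolerates a query that is sung slower or faster than the target.

```python
from typing import List, Sequence


def intervals(midi_notes: Sequence[int]) -> List[int]:
    """Semitone steps between consecutive notes (unchanged by transposition)."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two sequences, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Basic DTW alignment cost, with absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical toy data: a short melody as MIDI note numbers.
target = [60, 60, 67, 67, 69, 69, 67]
query_other_key = [n + 3 for n in target]       # same tune, hummed 3 semitones higher
query_wrong_note = [60, 60, 67, 67, 70, 69, 67]  # one note is off by a semitone

print(edit_distance(intervals(target), intervals(query_other_key)))   # 0
print(edit_distance(intervals(target), intervals(query_wrong_note)))  # 2

# A contour sung at half speed still aligns with zero cost under DTW.
print(dtw_distance([60, 62, 64], [60, 60, 62, 62, 64, 64]))  # 0.0
```

One wrong note changes two intervals, which is why the second distance is 2 rather than 1. A retrieval system along these lines would compute such a distance between the query and every candidate in the database and return the closest matches.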

QBSH is one task in the field of Music Information Retrieval (MIR). There are other fascinating problems within MIR, such as audio fingerprinting and genre classification.

I am interested in understanding existing techniques to solve this problem, and in finding new techniques too. In fact, I'm doing an undergrad research project on it. My advisor is Dr. N Sukumar, Director of the Center for Informatics at SNU. My proposal is titled "Designing a robust and scalable Query By Humming system by using a sequence of wavelet transform coefficients and Levenshtein edit distance".

I will maintain this webpage as a reading list for students, researchers and practitioners interested in the QBSH problem. If you notice any incorrect information here, or if you have any suggestions, please email me.

Eugene Weinstein. Query by Humming: A Survey. NYU Computer Science. 2005. https://cs.nyu.edu/~eugenew/publications/hummingsummary.pdf
MIREX (Music Information Retrieval Evaluation Exchange) 2015: Query By Singing/Humming. http://www.music-ir.org/mirex/wiki/2015:Query_by_Singing/Humming
Schedl, Gomez, Urbano. Music Information Retrieval: Recent Developments and Applications. 2014.