Multi-pitch estimation for polyphonic musical signals

The goal of automatic score transcription is to obtain a score-like representation (note pitches through time) from musical signals. Reliable pitch extraction methods exist for monophonic signals, but polyphonic signals are much more difficult to analyze and are often ambiguous. We propose a computationally efficient technique for the automatic recognition of notes in a polyphonic signal. It looks for correctly shaped peaks, in both magnitude and phase, in a multiscale decomposition of the signal that is oversampled in time and frequency. Peaks (partial candidates) are accepted or discarded according to how well they match the shape of the window spectrum and whether they satisfy continuity-across-scale constraints. The final list of partials is used to build a resharpened and equalized spectrum. Note candidates are found by searching for harmonic patterns. Perceptual and source-based rejection criteria help discard false notes frame by frame. Slightly non-causal postprocessing uses continuity (over an observation time of less than 150 ms) to remove notes that are too short, fill in gaps, and correct octave and sub-octave jumps.
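
As an illustration of the harmonic pattern search step only, the following minimal Python sketch scores each partial as a possible fundamental by checking how many of its integer multiples are present in the partial list. It is not the implementation described here: the function names (harmonic_score, find_note_candidates), the tolerance of 30 cents, the limit of 8 harmonics, and the minimum of 3 matched harmonics are all assumptions chosen for the example, and the perceptual and source-based rejection criteria are omitted.

```python
import math

def harmonic_score(f0, partials, n_harmonics=8, tol_cents=30.0):
    """Sum the amplitudes of partials lying near integer multiples of f0.

    partials is a list of (frequency_hz, amplitude) pairs.
    All parameter values are illustrative assumptions.
    """
    score = 0.0
    matched = 0
    for h in range(1, n_harmonics + 1):
        target = h * f0
        best = 0.0
        for freq, amp in partials:
            # Distance between the partial and the expected harmonic, in cents.
            cents = abs(1200.0 * math.log2(freq / target))
            if cents <= tol_cents and amp > best:
                best = amp
        if best > 0.0:
            score += best
            matched += 1
    return score, matched

def find_note_candidates(partials, min_matched=3, **kwargs):
    """Try every partial as a fundamental; keep those with enough harmonic support."""
    candidates = []
    for f0, _ in partials:
        score, matched = harmonic_score(f0, partials, **kwargs)
        if matched >= min_matched:
            candidates.append((f0, score, matched))
    # Strongest harmonic pattern first.
    return sorted(candidates, key=lambda c: -c[1])

# Hypothetical frame: two notes (220 Hz and 330 Hz) sharing the 660 Hz partial.
partials = [(220.0, 1.0), (330.0, 1.0), (440.0, 1.0),
            (660.0, 1.0), (880.0, 1.0), (990.0, 1.0)]
candidates = find_note_candidates(partials)
```

In this toy frame the shared partial at 660 Hz supports both fundamentals, which hints at why polyphonic analysis is ambiguous; octave-related partials such as 440 Hz are rejected here only because they lack enough harmonic support of their own.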