Assessing Optical Music Recognition Tools

68 Computer Music Journal

As digitization and information technologies advance, document analysis and optical character recognition (OCR) technologies have become more widely used. Optical Music Recognition (OMR), also commonly known as OCR for Music, was first attempted in the 1960s (Pruslin 1966). Standard OCR techniques cannot be applied directly to music-score recognition, because music notation has a two-dimensional structure: within a staff, the horizontal position denotes the temporal ordering and durations of notes, and the vertical position defines their pitch (Roth 1994). Models for nonmusical OCR assessment have been proposed and are widely used (Kanai et al. 1995; Ventzislav 2003).

An ideal system that could reliably read and "understand" music notation could be used in music production and in educational and entertainment applications. OMR is typically used today to accelerate the conversion of music-sheet images into a symbolic music representation that can be manipulated, thus creating new and revised music editions. Other applications use OMR systems for educational purposes (e.g., IMUTUS; see www.exodus.gr/imutus), generating customized versions of music exercises. A different use involves the extraction of symbolic music representations to be used as incipits or as descriptors in music databases and related retrieval systems (Byrd 2001).

OMR systems can be classified on the basis of the granularity chosen to recognize the music score's symbols. The architecture of an OMR system is tightly related to the methods used for symbol extraction, segmentation, and recognition. Generally, the music-notation recognition process can be divided into four main phases: (1) segmentation of the score image to detect and extract symbols; (2) recognition of the symbols; (3) reconstruction of the music information; and (4) construction of a symbolic music-notation model to represent that information (Bellini, Bruno, and Nesi 2004).
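The four phases above can be sketched as a simple processing pipeline. The sketch below is purely illustrative: the class `Symbol`, the four stub functions, and the toy "image" format are our own assumptions, not any system described in the literature.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Symbol:
    kind: str  # e.g., "note", "rest", "clef"
    x: int     # horizontal position (time order on the staff)
    y: int     # vertical position (pitch on the staff)

def segment(image: List[List[int]]) -> List[dict]:
    """Phase 1: locate candidate glyph regions (here: every nonzero pixel)."""
    return [{"x": x, "y": y}
            for y, row in enumerate(image)
            for x, px in enumerate(row) if px]

def recognize(regions: List[dict]) -> List[Symbol]:
    """Phase 2: classify each region (stubbed: everything is a 'note')."""
    return [Symbol("note", r["x"], r["y"]) for r in regions]

def reconstruct(symbols: List[Symbol]) -> List[Symbol]:
    """Phase 3: recover music information, e.g., left-to-right time order."""
    return sorted(symbols, key=lambda s: s.x)

def build_model(symbols: List[Symbol]) -> List[tuple]:
    """Phase 4: emit a symbolic representation as (kind, x, y) tuples."""
    return [(s.kind, s.x, s.y) for s in symbols]

def omr_pipeline(image: List[List[int]]) -> List[tuple]:
    return build_model(reconstruct(recognize(segment(image))))

# A 2x2 toy "image" with two marked pixels:
print(omr_pipeline([[0, 1], [1, 0]]))
# → [('note', 0, 1), ('note', 1, 0)]
```

Real systems differ mainly in how much work each stage does and how the stages feed information back to one another, which is one reason their results are hard to compare.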
Music notation may present very complex constructs and several styles. This problem has recently been addressed by the MUSICNETWORK and the Moving Picture Experts Group (MPEG) in their work on Symbolic Music Representation (www.interactivemusicnetwork.org/mpeg-ahg). Many music-notation symbols exist, and they can be combined in different ways to realize complex configurations, often without well-defined formatting rules (Ross 1970; Heussenstamm 1987). Despite various research systems for OMR (e.g., Prerau 1970; Tojo and Aoyama 1982; Rumelhart, Hinton, and McClelland 1986; Fujinaga 1988, 1996; Carter 1989, 1994; Kato and Inokuchi 1990; Kobayakawa 1993; Selfridge-Field 1993; Ng and Boyle 1994, 1996; Couasnon and Camillerapp 1995; Bainbridge and Bell 1996, 2003; Modayur 1996; Cooper, Ng, and Boyle 1997; Bellini and Nesi 2001; McPherson 2002; Bruno 2003; Byrd 2006) as well as commercially available products, optical music recognition (and, more generally, music recognition) is a research field affected by many open problems. The meaning of "music recognition" changes depending on the kind of application and its goals (Blostein and Carter 1992): audio generation from a musical score, music indexing and searching in a library database, music analysis, automatic transcription of a music score into parts, transcoding a score into interchange data formats, and so on. For such applications, we need common tools to answer questions such as "What does the percentage recognition rate claimed for a particular algorithm really mean?" and "Can I apply a common methodology to compare different OMR tools on my own music?" As noted in Blostein and Carter (1992) and Miyao and Haralick (2000), there is no standard for expressing the results of the OMR process.
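The ambiguity of a "percentage recognition rate" is easy to demonstrate: the same output scores differently depending on what counts as a match. The sketch below is our own illustration, not a published metric; the function `recognition_rate` and its greedy one-to-one matching rule are assumptions.

```python
def recognition_rate(ground_truth, recognized, match):
    """Fraction of ground-truth symbols matched by some recognized symbol.

    Greedy one-to-one matching: each recognized symbol is consumed at most
    once. The resulting number depends entirely on the `match` predicate.
    """
    remaining = list(recognized)
    hits = 0
    for g in ground_truth:
        for r in remaining:
            if match(g, r):
                remaining.remove(r)
                hits += 1
                break
    return hits / len(ground_truth) if ground_truth else 1.0

# Hypothetical (kind, pitch) symbols: truth vs. an OMR output with one error.
truth = [("note", "C4"), ("note", "E4"), ("rest", None)]
output = [("note", "C4"), ("note", "F4"), ("rest", None)]

# Counting only the symbol class, the tool looks perfect:
by_kind = recognition_rate(truth, output, lambda g, r: g[0] == r[0])  # 1.0
# Counting class and pitch, one of three symbols is wrong:
exact = recognition_rate(truth, output, lambda g, r: g == r)          # 2/3
```

Both numbers are defensible "recognition rates" for the same output, which is precisely why a shared assessment methodology is needed before vendor-reported percentages can be compared.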

[1] George Nagy et al., Automated Evaluation of OCR Zoning, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[2] Timothy C. Bell et al., A music notation construction engine for optical music recognition, 2003, Softw. Pract. Exp.

[3] Nicholas Paul Carter, Automatic recognition of printed music in the context of electronic publishing, 1989.

[4] Michael Kassler et al., Optical Character-Recognition of Printed Music: A Review of Two Dissertations (Automatic Recognition of Sheet Music; Computer Pattern Recognition of Standard Engraved Music Notation), 1972.

[5] Martin Roth, An approach to recognition of printed music, 1994.

[6] Pierfrancesco Bellini et al., Optical music sheet segmentation, 2001, Proceedings First International Conference on WEB Delivering of Music (WEDELMUSIC 2001).

[7] Tatsu Kobayakawa, Auto music score recognizing system, 1993, Electronic Imaging.

[8] Kazuhiko Yamamoto et al., Structured Document Image Analysis, 1992, Springer Berlin Heidelberg.

[9] Michael Good et al., MusicXML for notation and analysis, 2001.

[10] Paolo Nesi et al., Effort estimation and prediction of object-oriented systems, 1998, J. Syst. Softw.

[11] Seiji Inokuchi et al., A Recognition System for Printed Piano Music Using Musical Knowledge and Constraints, 1992.

[12] Geoffrey E. Hinton et al., A general framework for parallel distributed processing, 1986.

[13] Ivan Bruno et al., An Off-Line Optical Music Sheet Recognition, 2004.

[14] Eleanor Selfridge-Field et al., Beyond MIDI: The Handbook of Musical Codes, 1997.

[15] George Heussenstamm, The Norton Manual of Music Notation, 1987.

[16] Kim C. Ng et al., Recognition and reconstruction of primitives in music scores, 1996, Image Vis. Comput.

[17] Susan Ella George et al., Visual Perception of Music Notation: On-Line and Off-Line Recognition, 2004.

[18] James L. McClelland et al., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, 1986.

[19] David Bainbridge et al., Extensible optical music recognition, 1997.

[20] David Cooper et al., MIDI extensions for musical notation (2): expressive MIDI, 1997.

[21] Nicholas P. Carter et al., Automatic Recognition of Printed Music, 1992.

[22] Ichiro Fujinaga et al., Optical Music Recognition Using Projections, 1988.

[23] Pierfrancesco Bellini et al., WEDELMUSIC format: an XML music notation format for emerging applications, 2001, Proceedings First International Conference on WEB Delivering of Music (WEDELMUSIC 2001).

[24] Ventzislav Alexandrov, Error evaluation and applicability of OCR systems, 2003, CompSysTech '03.

[25] John R. McPherson, Introducing Feedback into an Optical Music Recognition System, 2002, ISMIR.

[26] Donald Byrd, Music-notation searching and digital libraries, 2001, JCDL '01.

[27] C. Ng et al., Reconstruction of Music Scores from Primitive Subsegmentation, 1994.

[28] Dorothea Blostein et al., Recognition of Music Notation: SSPR'90 Working Group Report, 1992.

[29] Ted Ross, The Art of Music Engraving and Processing, 1970.

[30] Paolo Nesi et al., Estimation and Prediction Metrics for Adaptive Maintenance Effort of Object-Oriented Systems, 2001, IEEE Trans. Software Eng.

[31] Peter J. Rousseeuw et al., Robust Regression and Outlier Detection, 2005, Wiley Series in Probability and Statistics.

[32] Bruce W. Pennycook et al., Adaptive optical music recognition, 1997.

[33] Jean Camillerapp et al., A way to separate knowledge from program in structured document analysis: application to optical music recognition, 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.