Representing Music Using XML

Why does the world need another music representation language? Beyond MIDI describes over 20 different languages or musical codes (Selfridge-Field, 1997). Most commercial music programs have their own internal, proprietary music representation and file format. Music's complexity has led to this proliferation of languages and formats. Sequencers, notation programs, analysis tools, and retrieval tools all need musical information optimized in different ways. Yet no music interchange language has been widely adopted since MIDI. MIDI has contributed to enormous growth in the electronic music industry, but has many well-known limitations for notation, analysis, and retrieval. These include its inability to represent musical concepts such as rests and enharmonic pitches (distinguishing Db from C#), as well as notation concepts such as stem direction and beaming. Other interchange formats such as NIFF and SMDL overcome these restrictions, but have not been widely adopted.

Successful interchange formats such as MIDI and HTML share a common trait that NIFF and SMDL lack: they skillfully balance simplicity and power. They are simple enough for many people to learn, and powerful enough for many real-world applications. The simplicity makes it easy for software developers to implement the standards and to develop encoding tools for musicians. This helps circumvent the “chicken-and-egg” problem with new formats.

XML (Extensible Markup Language) is a World Wide Web Consortium (W3C) recommendation for representing structured data in text, designed for ease of use over the Internet by a wide variety of applications. XML is a meta-markup language that lets designers and communities develop their own representation languages for different applications. Like HTML and MIDI, it balances simplicity and power in a way that has made it very attractive to software developers.
The common base of XML technology lets developers of new languages focus on representation issues instead of low-level software development. All XML-based languages can be processed by a variety of XML tools available from multiple vendors. Since XML files are text files, users of XML files always have generic text-based tools available as a lowest common denominator. XML documents are represented in Unicode, providing support for international score exchange.

MusicXML is an XML-based music interchange language. It represents common western musical notation from the 17th century onwards, including both classical and popular music. The language is designed to be extensible to future coverage of early music and less standard 20th- and 21st-century scores. Non-western musical notations would use a separate XML language. As an interchange language, MusicXML is designed to be sufficient, not optimal, for diverse musical applications. It is not intended to supersede other languages that are optimized for specific musical applications, but to support sharing of musical data between applications.

The current MusicXML software runs on Windows. As of September 2000, it reads 100% of the MuseData format plus portions of NIFF and Finale’s Enigma Transportable Files (ETF). It writes to Standard MIDI Files in Format 1, MuseData files, and Sibelius. The NIFF, ETF, and MIDI converters use XML versions of these languages as intermediate structures.

MusicXML is defined using an XML Document Type Definition (DTD) at www.musicxml.com/xml.html. XML Schemas address some shortcomings of DTDs, but are not yet a W3C recommendation. MusicXML adapts the MuseData and Humdrum languages to XML, adding features needed to cover more of 19th- to 21st-century musical usage. These were chosen as starting points because they are two of the most powerful languages currently available for musical analysis and interchange.
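
As a rough illustration of the MIDI limitations mentioned earlier, the sketch below parses a small hand-written fragment that uses MusicXML's pitch elements (step, alter, octave) and its rest element. The fragment, the helper names midi_number and describe_notes, and the convention that C4 maps to MIDI note 60 are assumptions made for this example; they are not taken from the MusicXML software described here.

```python
import xml.etree.ElementTree as ET

# Hand-written illustrative fragment, not taken from a real score.
FRAGMENT = """
<measure number="1">
  <note>
    <pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>
    <duration>1</duration>
  </note>
  <note>
    <pitch><step>D</step><alter>-1</alter><octave>4</octave></pitch>
    <duration>1</duration>
  </note>
  <note>
    <rest/>
    <duration>2</duration>
  </note>
</measure>
"""

# Semitone offset of each natural step above C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_number(pitch):
    """Collapse a <pitch> element to a MIDI note number (assumes C4 = 60)."""
    step = pitch.findtext("step")
    alter = int(pitch.findtext("alter", default="0"))
    octave = int(pitch.findtext("octave"))
    return 12 * (octave + 1) + SEMITONES[step] + alter


def describe_notes(xml_text):
    """Return (spelled name, MIDI number) per note; rests map to ("rest", None)."""
    described = []
    for note in ET.fromstring(xml_text).iter("note"):
        if note.find("rest") is not None:
            described.append(("rest", None))
            continue
        pitch = note.find("pitch")
        alter = int(pitch.findtext("alter", default="0"))
        accidental = {1: "#", -1: "b"}.get(alter, "")
        described.append((pitch.findtext("step") + accidental, midi_number(pitch)))
    return described


for name, number in describe_notes(FRAGMENT):
    print(name, number)
```

In this sketch the first two notes keep distinct spellings (C# and Db) in the XML but collapse to the same MIDI note number, and the rest has no MIDI note number at all.
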
One of Humdrum’s important features is its explicitly two-dimensional representation of music by part and by time. A hierarchical representation like XML cannot directly support this type of lattice structure, but programs written in XSLT (Extensible Stylesheet Language Transformations) support automatic conversion between these two orderings.

MusicXML score files do not represent presentation concepts such as pages and systems. The details of formatting will change based on different paper and display sizes. In the XML environment, formatting is handled separately from structure and semantics. The same applies to detailed interpretive performance information.

One limitation to computer-based musical analysis and retrieval has been the tight coupling of representations to development tools (e.g. Humdrum requires Unix familiarity; MuseData tools require TenX). In contrast, XML programming tools are available for all major industry programming languages and platforms. This lets the user, rather than the representation language, choose the programming environment, making for simpler development of musical applications.

Say we want to investigate whether Bach’s pieces really have 90% of their notes in one of two durations (for example, quarters and eighths, or eighths and sixteenths). We can do this by plotting a distribution of note durations on a bar chart, displayed together with a simple spreadsheet. (The full poster includes a picture.) Writing this program in Visual Basic took only half a day, including learning to use the display controls. In the 2nd movement of Bach’s Cantata No. 6, for example, the top two note durations make up nearly 87% of the notes, a more uneven distribution than is often seen with other composers. For retrieval purposes, an extended program could then look for the works in a given corpus with the most uneven distribution of note durations.

Music information retrieval faces a tower-of-Babel problem.
There is no representation language in widespread use today that overcomes MIDI's limitations for music interchange. Past efforts suffered from the overall absence of popular standardized formats for complex structured data. XML provides the technical foundation for a more powerful and expressive music interchange language. Developing converters between existing formats and a single music XML language could greatly simplify the tasks of music information retrieval. MusicXML attempts to provide an interchange language that is well designed from musical, human, and computer perspectives.
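
The note-duration analysis described above can be sketched in a few lines. The original program was written in Visual Basic and its details are not given here, so this Python version is only an assumption about how such a tally might work: it counts notes by the text of MusicXML's type element, skips rests, and prints a text bar chart. The sample data and the names duration_distribution and top_two_share are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Synthetic sample data: ten pitched notes plus one rest (not a real score).
TYPES = ["eighth"] * 5 + ["quarter"] * 3 + ["half", "16th"]
NOTE = "<note><pitch><step>C</step><octave>4</octave></pitch><type>{}</type></note>"
SAMPLE = (
    "<part>"
    + "".join(NOTE.format(t) for t in TYPES)
    + "<note><rest/><type>quarter</type></note>"
    + "</part>"
)


def duration_distribution(xml_text):
    """Count pitched notes by their <type> text; rests are skipped (an assumption)."""
    counts = Counter()
    for note in ET.fromstring(xml_text).iter("note"):
        if note.find("rest") is not None:
            continue
        counts[note.findtext("type", default="unknown")] += 1
    return counts


def top_two_share(counts):
    """Fraction of all counted notes that fall in the two most common durations."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(2)) / total


counts = duration_distribution(SAMPLE)
for name, n in counts.most_common():
    print(f"{name:8s} {'#' * n} {n}")  # one bar per duration type
print(f"top two: {top_two_share(counts):.0%}")
```

On the synthetic sample, the two most common durations account for 8 of the 10 pitched notes. Ranking the works in a corpus by top_two_share would be one way to approach the retrieval extension mentioned above; tied notes and chords are not treated specially in this sketch.
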