Review of Arbil: Free Tool for Creating, Editing, and Searching Metadata

1. OVERVIEW.¹ Arbil² (Withers 2012) is a free Java-based program for creating and editing metadata. It was created by Peter Withers of The Language Archive at the Max Planck Institute for Psycholinguistics for use within the DoBeS³ (Dokumentation bedrohter Sprachen, ‘Documentation of Endangered Languages’) program. Arbil is now maintained by a team led by Peter Withers and Twan Goosen, who are developing and extending it for a wider user group.
The creation of metadata is an important part of any project that collects data. In this review, I focus on language description and documentation projects and the kinds of linguistic and other data that they tend to collect. Metadata is information that describes the content of other data. It usually describes the type of data, for instance whether it is a video recording, a written text, or a photo, and the format of the data, for instance MPEG-1 or MPEG-2. It should also describe when and where the data was collected, and it should give a good description of all the people involved. For instance, if you are describing a transcription, you need to know who is speaking in the recording, who made the recording, and who did the transcription. For more information about linguistic metadata, see Bowern (2008:56-59) and Thieberger & Berez (2012).
Metadata makes it possible for community members and other researchers to find out what kind of data you have and to find the material they are looking for. It is also necessary for you as a researcher: it keeps a lasting record of the details of your data and allows you to locate recordings and information for many years to come. I am very glad that I have good metadata for recordings I made five years ago. Without it I would be completely lost when looking for elicitation sessions on a particular topic or the name of the woman who told that story about the dog.
Arbil is not only a good tool for the initial creation of metadata; it can also be used later to search your metadata and open the associated files directly. Several different formats for linguistic metadata are in use. Arbil is intended for the creation of metadata in either the ISLE MetaData Initiative (IMDI)⁴ or the Component MetaData Infrastructure (CMDI)⁵ format. Both formats are XML-based templates for metadata. They provide a list of fields, some of which are obligatory and must be filled in to produce a complete metadata file, and others of which are optional. This both orders and constrains the type of metadata created. This constraint on the structure of the