Theoretical principles concerning segmentation, labelling strategies and levels of categorical annotation for spoken language database systems