Social Tagging and Dublin Core: a proposal of new metadata elements deriving from Folksonomies
Abstract: Web 2.0 takes to its fullest the Internet's concept of encouraging users to cooperate effectively in offering virtual services and organizing content. Among the possibilities opened by Web 2.0, folksonomies arise from users freely assigning tags to Web resources. Folksonomies describe Web resources; however, they are generally not integrated into metadata. For them to be intelligible to machines, and therefore usable in the Semantic Web context, they have to be automatically allocated to specific metadata elements. There are many metadata standards. This investigation focuses on Dublin Core (DC), a set of metadata elements for the description of electronic resources that Institutional Repositories have adopted for standardization and interoperability. We propose an investigation that aims to identify metadata elements derived from folksonomies and to integrate them into an extended DC ontology, so that the values carried by the tags can be conveniently gathered by a metadata harvesting protocol, specifically the Open Archives Initiative - Protocol for Metadata Harvesting (OAI-PMH). This paper presents the results of the pilot study carried out at the beginning of the investigation, as well as the metadata elements preliminarily defined.

Metadata may be defined as a group of elements for the description of resources [1]. There are many metadata standards; in the repository context, however, we can point out the Dublin Core Metadata Element Set (DCMES), or simply Dublin Core (DC), which is a metadata standard for the description of electronic resources.
This standard is widely disseminated and used globally on a broad scale for several reasons: a) it was created specifically for the description of electronic resources; b) it is backed by an initiative responsible for its development, maintenance and dissemination, the Dublin Core Metadata Initiative (DCMI); c) it is the metadata set used by the Open Archives Initiative - Protocol for Metadata Harvesting (OAI-PMH), a mechanism for data transfer between digital repositories. Metadata may be entered into repositories by the authors themselves, by the professionals who mediate the deposit, or by end users.

The more active participation of users in building and organizing Internet content results from the evolution of the technologies used on the Web, the so-called Web 2.0, defined as 'the network as platform, spanning all connected devices; Web 2.0 applications are those that make the most of the intrinsic advantages of that platform: delivering software as a continually-updated service that gets better the more people use it, consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others, creating network effects through an "architecture of participation," and going beyond the page metaphor of Web 1.0 to deliver rich user experiences.' [2]. Among the new possibilities of Web 2.0, folksonomy emerges as "the result of personal free tagging of information and objects (anything with an URL) for one's own retrieval. The tagging is done in a social environment (shared and open to others). The act of tagging is done by the person consuming the information" [3]. The tags that make up a folksonomy may be keywords, categories or metadata [4]. Even this brief definition shows that tags assigned by users can play different roles.
One study [5][6] identifies the following roles: Identifying What (or Who) it is About, Identifying What it Is, Identifying Who Owns It, Refining Categories, Identifying Qualities or Characteristics, Self Reference and Task Organizing. In another study, the Kinds of Tags (KoT) project, which compared tags with DC metadata elements, the authors observed that some tags cannot be placed in any of the existing elements and therefore concluded that other metadata elements may be defined to accommodate descriptions arising from folksonomies. Some probable new elements were identified: Action_Towards_Resource, To_Be_Used_In, Rate and Depth [7][8]. The KoT project is being developed in partnership with the following institutions: Universidade do Minho (Portugal), University of Bologna (Italy), UKOLN (United Kingdom), Universidad Carlos III (Spain), La Trobe University (Australia) and Université Libre de Bruxelles - Faculté de Philosophie et Lettres (Belgium). Its objective is to verify how tags derived from folksonomies can be normalized with a view to their interoperability with metadata standards, specifically DC.

In summary, metadata are groups of elements for the description of digital resources and exist in different standards, among them DC, which repositories adopt as the basis for the metadata harvesting protocol (OAI-PMH). In the Web 2.0 context, folksonomies arise as the result of Web resources being tagged by their own users. Tags are a complementary form of description that expresses the user's view of the resource being used. The preliminary results of the KoT project show that the description elements currently defined in the DCMI Metadata Terms do not cover all the descriptive elements that resource users assign by means of these tags.
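As a rough illustration of the gap described above, the following sketch builds a small XML record in which values that fit an existing DC element are placed in the Dublin Core namespace, while values assigned to one of the probable new elements are placed in a separate extension namespace. It is only a sketch under stated assumptions: the extension namespace `http://example.org/folk-ext/`, the function name `build_record`, the `record` wrapper element and the sample tag values are all hypothetical, and the assignment of each tag to an element is taken as given (made by a human analyst), not computed.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace for tag-derived elements (illustration only).
EXT_NS = "http://example.org/folk-ext/"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("ext", EXT_NS)

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_record(assignments):
    """Build a record from (element_name, value) pairs.

    Names found in DC_ELEMENTS go to the DC namespace; any other name
    is treated as a candidate new element and goes to EXT_NS.
    """
    record = ET.Element("record")
    for element, value in assignments:
        namespace = DC_NS if element in DC_ELEMENTS else EXT_NS
        child = ET.SubElement(record, f"{{{namespace}}}{element}")
        child.text = value
    return record


# Invented sample values; the element names come from the text above.
sample = [
    ("subject", "metadata"),
    ("To_Be_Used_In", "thesis"),
    ("Action_Towards_Resource", "toread"),
]
record = build_record(sample)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the candidate elements in their own namespace leaves the standard DC elements untouched, so a harvester that only understands DC could still read that part of the record.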
In this context, and continuing the analysis resulting from the KoT project, we propose an investigation that aims to identify metadata elements derived from folksonomies and to integrate them into an extended DC ontology, so that the values carried by the tags can be conveniently gathered by the metadata harvesting protocol. We therefore intend to conduct qualitative research and answer the following questions: Q1 - Which metadata elements are necessary to hold the folksonomy values? Q2 - Which new metadata elements should be created, and what is their relation to those already existing in Dublin Core? Q3 - Which encoding schemes should be used, and what is their relation to those already recommended by DCMI? Q4 - Which ontologies related to DC already enable access to the previously established conceptualizations? Q5 - In compliance with the DCMI Abstract Model (DCAM), what extension of the DC ontology should be made openly available?

The procedures are divided into four stages:

1) Analysing the tags contained in the KoT project dataset - at this stage we will analyse all tags in relation to the resources to which they have been assigned. In addition, to settle doubts, it will be necessary to turn to lexical resources (dictionaries, encyclopaedias, WordNet, Wikipedia, etc.) and to analyse tags in relation to their users, in order to understand the function of each assigned tag as a metadata element. At this stage a pilot study will be carried out to refine the proposed methodology and to verify whether the variants proposed for grouping and analysing tags are adequate to identify the probable "new metadata elements" that could be extracted from folksonomies.

2) Proposing metadata elements complementary to DC - establishing description elements originating from folksonomies, based on the DC standard, the DCAM model, and the ISO 15836-2003 and NISO Z39.85-2007 standards.
At this stage we intend to propose new elements and/or qualifiers complementary to DC.

3) Building an ontology - here we intend to integrate the DC ontologies with the new elements and/or qualifiers derived from folksonomies. The ontology will be created with the Protégé tool and encoded in OWL.

4) Validating the proposal - validation will be carried out by the scientific community, as the methodology and the results obtained will be presented at relevant scientific events and in journals, and by the DCMI Social Tagging community, through online questionnaires and workshops proposed to the community.

The results of the research are intended to support applications based on Artificial Intelligence in automating harvesting processes that include the descriptions provided by folksonomies. This paper will present the results of the pilot study (which is being finalized) together with the preliminary results of the first research stage: tag analysis. This stage comprises the following phases: a) analysis and grouping of tags in their variant forms; b) analysis of tags in relation to the DC metadata elements and their qualifiers. The preliminary results of KoT point to the possible proposal to DCMI of some new metadata elements or element refinements. Those terms would potentially accommodate tags that currently have no metadata element to hold them. The results of this research will therefore make it possible to determine whether, and to what extent, the KoT preliminary findings are confirmed. The final paper will conclude with this discussion.
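Phase a), the grouping of tags in their variant forms, can be sketched as follows. This is only an illustration under assumptions, not the procedure used in the KoT project: the normalisation rule (case folding and dropping spaces, punctuation and underscores), the function names `variant_key` and `group_variants`, and the sample tags are all hypothetical. The rule is deliberately naive and does no stemming, so singular and plural forms remain in separate groups and would still need manual review.

```python
import re
from collections import defaultdict


def variant_key(tag: str) -> str:
    """Return a crude grouping key for a tag.

    Assumption: variants differ only in letter case and in separators
    (spaces, punctuation, underscores), which are removed here.
    """
    return re.sub(r"[\W_]+", "", tag.casefold())


def group_variants(tags):
    """Group tags that share the same key, keeping the original spellings."""
    groups = defaultdict(list)
    for tag in tags:
        groups[variant_key(tag)].append(tag)
    return dict(groups)


# Invented sample tags, for illustration only.
tags = ["Web 2.0", "web_2.0", "web2.0", "Folksonomy", "folksonomies"]
groups = group_variants(tags)
for key, variants in groups.items():
    print(key, variants)
```

Keeping the original spellings inside each group lets the analyst inspect how a tag was actually written before deciding which metadata element, if any, it belongs to.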