Using iDocument for Document Categorization in Nepomuk Social Semantic Desktop

On the Semantic Desktop, users maintain their model of the world in a formal personal information model ontology. Concepts from this ontology are used to annotate documents on the desktop, allowing efficient navigation and browsing. However, the mental overhead required to classify new incoming documents correctly is substantial. We present the integration of the ontology-based information extraction system iDocument into the Nepomuk Semantic Desktop for classifying documents within the personal information model. We compare iDocument with the original classification system, Structure Recommender, using real models and documents from five Nepomuk users. The results provide evidence that iDocument’s categorization proposals achieve higher recall and precision, and that iDocument’s result ranking corresponds to user ratings.