Mixing Low-Level and Semantic Features for Image Interpretation - A Framework and a Simple Case Study

Semantic Content-Based Image Retrieval (SCBIR) allows users to retrieve images via complex expressions in an ontological language describing a domain of interest. SCBIR adds flexibility to state-of-the-art image retrieval methods, which support queries either by keywords or by image examples. The price of this additional flexibility is the need to generate a semantically rich description of the image content that reflects the ontology's constraints. Generating these semantic interpretations is an open research problem. This paper contributes to this line of research by proposing an approach to SCBIR based on the somewhat natural idea that the interpretation of a picture is an (onto)logical model of an ontology that describes the domain of the picture. We implement this idea in an unsupervised method that jointly exploits the ontological constraints and the low-level features of the image. The preliminary evaluation presented in the paper shows promising results.
