Bridging the semantic gap with computational media aesthetics

Content processing and analysis research in multimedia systems has one central objective: to develop technologies that help users sift through media data streams and easily access useful nuggets of information. A fundamental need exists to analyze, cull, and categorize information automatically and systematically from media data, and to manage and exploit it effectively, as digital media collections accumulate rapidly. However, despite nearly a decade of continued research, user expectations of such systems are far from being met. Currently, analysis yields only simple, generic, low-level content metadata. This metadata isn’t always useful because it deals primarily with representing the perceived content rather than its semantics.

In the last few years, the semantic gap problem in automatic content annotation systems has received much attention. The semantic gap is the gulf between the rich meaning and interpretation that users expect systems to associate with their queries for searching and browsing media and the shallow, low-level features (content descriptions) that the systems actually compute. For more information on this dilemma, see Smeulders et al.,1 who discuss the problem at length and lament that while “the user seeks semantic similarity, the database can only provide similarity on data processing.”

[1] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-Based Image Retrieval at the End of the Early Years,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2000.

[2] C. Dorai and S. Venkatesh, “Computational Media Aesthetics: Finding Meaning Beautiful,” IEEE MultiMedia, 2001.