Intelligent Systems for Video Analysis and Access Over the Internet

PREFACE

The explosion of on-line web information has given rise to many query-based text search engines (such as Alta Vista) and manually constructed topic hierarchies (such as Yahoo!). With the current growth rate of web information, especially broadband multimedia data, query data are growing incomprehensibly large, and manual classification in topic hierarchies is creating a major bottleneck. Consequently, the huge amount of multimedia data imposes on people a heavy burden of manipulating, searching, interpreting, skimming, and integrating information. Efficient multimedia content analysis tools are therefore needed to address these users' needs.

This book presents a solution to problems arising from the demand for fast information access and for sharing in real-time multimedia transmission over the Internet. The solution exploits software agents that are placed throughout the network environment. These hierarchical video analysis agents process multimedia streams in real time, and automatically decompose and understand the multimedia content so as to facilitate information access and sharing.

Multimedia content contains both perceptual content, such as color, motion, or acoustic features, and conceptual content, which is specified based on concepts or semantics that can be expressed by text descriptions. Both types of content are embedded simultaneously in multimedia streams, and they are usually complementary to each other. This book analyzes both kinds of video content adaptively by combining mixed media cues from audio, video, and text.

First, a high-performance module for on-line video segmentation based on scene-change detection is described. The module serves as the first step of any video stream construction and analysis. To meet the high computational demand of fast on-line video analysis, our proposed video scene-change detection algorithms are very efficient while maintaining high accuracy and recall rates.
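As a rough illustration of what scene-change detection involves, the sketch below uses a common baseline technique, histogram differencing between consecutive frames. It is not the algorithm developed in this book. The frame format (a flat sequence of 8-bit gray levels), the function names, the bin count, and the threshold are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional, Sequence

# Hypothetical frame format: a flat sequence of 8-bit gray levels (0..255).
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return a normalized histogram of the frame's gray levels."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame)
    return [count / total for count in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms (0.0 to 1.0)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_scene_changes(
    frames: Iterable[Frame], threshold: float = 0.5, bins: int = 16
) -> List[int]:
    """Return each index i at which frame i differs sharply from frame i - 1.

    Frames are consumed one at a time, so the function also works on a
    stream; only the previous histogram is kept in memory.
    """
    cuts: List[int] = []
    previous: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame, bins)
        if previous is not None and histogram_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts


if __name__ == "__main__":
    dark = [10] * 64     # toy 8x8 frame, uniformly dark
    bright = [240] * 64  # toy 8x8 frame, uniformly bright
    print(detect_scene_changes([dark, dark, bright, bright, dark]))  # [2, 4]
```

Because each frame is reduced to a small histogram and compared only with its predecessor, the cost per frame is linear in the number of pixels. That property is what makes this kind of baseline a plausible fit for on-line use. A fixed threshold is the simplest possible decision rule and would need tuning for real video.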
Second, the perceptual features of audio and video data are analyzed in a bottom-up manner and integrated so as to discriminate effectively among the different events in any video stream. An efficient decision-tree learning algorithm is used to induce a set of if-then rules which link perceptual features with the conceptual semantic content of the video. These rules not only serve as a video classifier, but also guide on-line real-time video/audio feature extraction and data redistribution. A novel knowledge-based system, where knowledge is stored as learned rules, is proposed and described in this book to serve as a video semantic inference/classification engine.

Third, we present our proposed hierarchical video categorization scheme based on machine learning of the text information contained in a video, a scheme which provides a good complement to the video/audio classification subsystem. The learned text features for each video category are also stored in the knowledge base. To fuse the text classifier and the audio/video classifier, a media cue optimizer that is trained by using the cue probability distribution based on the concept hierarchy is adopted to guide real-time media query and analysis.

The integration of hierarchical video analysis, clustering, and classification allows a large amount of multimedia data to be organized and presented to users in an individualized and comprehensible way. A general hierarchical concept tree scheme is used to organize a video into a table of contents for video applications and enables a comprehensive agent-based solution to real-time multimedia distribution and sharing over the Internet.