Knowledge Mining and Visualization on News Webpages and Large-Scale News Video Database

The traditional layout of news websites, which combines classified hierarchical browsing, headline recommendation, and keyword-based search, has been in use for many years. Keyword-based search is considered the most powerful of these tools for news browsing and retrieval. Unfortunately, keyword-based query formulation is very difficult for news audiences to use because of a mismatch between what it requires and what the audience is able to do. It requires that users be able to provide appropriate search keywords to represent their information needs. News audiences may have difficulty completing this seemingly simple task for several reasons: (1) They may have no experience with video retrieval and thus not know how to express their information needs as keywords. (2) They may have no clear idea beforehand of what they need, because news content is highly dynamic and can be unpredictable. (3) Many visual concepts are difficult to express as keywords, so the problem is more severe for video news databases. In this paper, a novel news browsing and retrieval framework is proposed to resolve this problem. Unlike traditional news websites, the proposed framework indexes news webpages and video news reports using information extracted from the database and presents that information through visualization techniques, which serve as the browsing and retrieval interface. The visualization-based interface presents valuable information directly to users. Users can browse the content of the news database efficiently and submit queries visually through the interface, even if they cannot provide appropriate search keywords.