Web mining for open source intelligence is the retrieval, extraction and analysis of information from Internet sites. This paper reviews two separate application areas: live news monitoring and targeted, topic-based data mining. Most newspapers and news agencies have Web sites with live updates on unfolding events, together with opinions and perspectives on world events. Most governments monitor news reports to gauge public opinion and to obtain early warning of emerging crises. The Joint Research Centre has developed significant experience in Internet content monitoring through its work on the Europe Media Monitor (EMM) for the European Commission. EMM forms the core of the Commission's daily press monitoring service. Intelligence services and law enforcement agencies also require monitoring of specific sites and topics, and EMM technology has been applied to the wider Internet for this purpose. The software downloads and extracts all the textual content from monitored sites and applies information extraction techniques to it. These tools help analysts process large numbers of documents and derive structured data from them. Finally, visualisation of the extracted data is important for helping analysts identify patterns and trends in both news reports and Web mining results.