WIA: a web inspection architecture

As the internet reaches ever further into everyday life, monitoring and controlling users' social behaviour has come to be seen as prudent for the proper management of society. However, most existing approaches to social behaviour analysis are static and fail to account for differences between cultures and localities. We propose a dynamic approach that extracts the favourite websites of localised users by logging the URLs accessed in places such as universities and government institutions. The content of the logged websites is then categorised, so that a categorised database of users' favourite websites is built dynamically. We evaluated the approach in a real setting by dynamically building such a database over a six-month period of operation at the ICT Ministry of Iran. Comparison with well-known static URL databases showed that our approach is better at catching newly published websites, which makes its performance more durable over time.
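The pipeline described above (log accessed URLs, categorise the content of the logged sites, and build the favourites database as the log grows) might be sketched as follows. This is a minimal illustration only, not the authors' implementation: the keyword-overlap categoriser, the `CATEGORY_KEYWORDS` table, and the function names are all assumptions made for the example, and the abstract does not say which categorisation method WIA uses.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit

# Hypothetical keyword lists; the abstract does not specify the categoriser.
CATEGORY_KEYWORDS = {
    "news": {"news", "headline", "report"},
    "sports": {"football", "match", "league"},
}


def host_of(url):
    """Return the host name of a URL, or None if the entry has no host."""
    return urlsplit(url).hostname


def count_hosts(logged_urls):
    """Count how often each host appears in the access log."""
    counts = Counter()
    for url in logged_urls:
        host = host_of(url)
        if host:  # skip malformed log entries
            counts[host] += 1
    return counts


def categorise(text):
    """Pick the category whose keywords overlap most with the page text."""
    words = set(text.lower().split())
    best, best_score = "uncategorised", 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = category, score
    return best


def build_favourites(logged_urls, page_text, top_n=10):
    """Group the top_n most visited hosts by the category of their content.

    page_text maps a host name to text fetched from that site (assumed to be
    collected elsewhere, for example by a crawler).
    """
    favourites = defaultdict(list)
    for host, hits in count_hosts(logged_urls).most_common(top_n):
        favourites[categorise(page_text.get(host, ""))].append((host, hits))
    return dict(favourites)
```

Re-running `build_favourites` on a growing log is what would make the database dynamic in this sketch: a newly published site appears as soon as users start visiting it, without waiting for a static list to be updated.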
