Multi-criteria Web Mining with DRSA

Abstract This study applies the Dominance principle to a particular case of World Wide Web content search under a multi-criteria approach: searching for “Rio de Janeiro” (the city and/or the state, in Brazil) followed by other attributes (or criteria). Depending on the query submitted to a search engine, the results may fall short of what is desired in both the quantity and the quality of the sites returned. Applied after the unstructured data collected from the Internet has been treated, the Dominance principle aims to reveal patterns (or logical rules) in a set of information, and it shows how a web content search can become more effective over a significant universe of information. Other techniques and tools have also been applied to Web content mining, as discussed in this study. The Dominance principle associated with Rough Set Theory was chosen as the multi-criteria decision technique because the data may be inaccurate (inconsistent), because these inaccuracies need to be treated from a mathematical perspective when processing an information system (data table), and because the approach does not require a history of the data. The use of Rough Set Theory and the Dominance principle, together with the probabilistic relationship between conditions and decisions in decision algorithms, is justified by the possibility that uncertain data can still yield an essential set of effectively consistent information.
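As an illustration of how the Dominance principle separates consistent from inconsistent objects in a data table, the following minimal Python sketch computes the lower and upper approximations of an upward union of classes, in the style of the Dominance-based Rough Set Approach (DRSA). The page identifiers, the two criteria, the scores, and the class labels are hypothetical values chosen for the example; they are not data from the study, and the function names are assumptions rather than the authors' implementation.

```python
from typing import Dict, Set, Tuple

# Hypothetical data table: each page is scored on two gain-type criteria
# (higher is better) and assigned an ordinal quality class (0 < 1 < 2).
criteria: Dict[str, Tuple[int, int]] = {
    "a": (3, 3),
    "b": (2, 3),
    "c": (3, 2),
    "d": (2, 2),
    "e": (1, 1),
}
labels: Dict[str, int] = {"a": 2, "b": 1, "c": 2, "d": 2, "e": 0}


def dominates(x: Tuple[int, ...], y: Tuple[int, ...]) -> bool:
    """Return True if x is at least as good as y on every criterion."""
    return all(xi >= yi for xi, yi in zip(x, y))


def upward_approximations(
    criteria: Dict[str, Tuple[int, ...]], labels: Dict[str, int], t: int
) -> Tuple[Set[str], Set[str]]:
    """Lower and upper approximations of the union 'class at least t'."""
    upward = {obj for obj, cls in labels.items() if cls >= t}
    lower: Set[str] = set()
    upper: Set[str] = set()
    for obj, vec in criteria.items():
        # Objects that dominate obj, and objects that obj dominates.
        dominating = {p for p, w in criteria.items() if dominates(w, vec)}
        dominated = {p for p, w in criteria.items() if dominates(vec, w)}
        # Certain member: everything at least as good is also in the union.
        if dominating <= upward:
            lower.add(obj)
        # Possible member: obj is at least as good as some union member.
        if dominated & upward:
            upper.add(obj)
    return lower, upper


lower, upper = upward_approximations(criteria, labels, t=2)
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
```

In this toy table, page "b" scores at least as well as page "d" on both criteria yet is assigned a worse class, which violates the Dominance principle. Both pages therefore fall into the boundary region, while only the consistently classified pages remain in the lower approximation.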
