Categories of Direlations and Rough Set Approximation Operators

The RSCTC’2010 Discovery Challenge was a special event of the Rough Sets and Current Trends in Computing conference. The challenge was organized as an interactive on-line competition on the TunedIT.org platform and ran from Dec 1, 2009 to Feb 28, 2010. The task concerned feature selection in the analysis of DNA microarray data and the classification of samples for the purpose of medical diagnosis or treatment. Prizes were awarded to the best solutions. This paper describes the organization of the competition and the winning solutions.

TunedIT.org: System for Automated Evaluation of Algorithms in Repeatable Experiments

Marcin Wojnarski, Sebastian Stawicki, and Piotr Wojnarowski
1 TunedIT Solutions, Zwirki i Wigury 93 lok. 3049, 02-089 Warszawa, Poland
2 Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warszawa, Poland

Abstract. In this paper we present the TunedIT system, which facilitates the evaluation and comparison of machine-learning algorithms. TunedIT is composed of three complementary and interconnected components: TunedTester, Repository and Knowledge Base. TunedTester is a stand-alone Java application that runs automated tests (experiments) of algorithms. Repository is a database of algorithms, datasets and evaluation procedures that TunedTester uses to set up a test. Knowledge Base is a database of test results. Repository and Knowledge Base are accessible through the TunedIT website. TunedIT is open and free for use by any researcher. Every registered user can upload new resources to Repository, run experiments with TunedTester, send results to Knowledge Base and browse all collected results, whether generated by that user or by others. As a special functionality, built upon the framework of automated tests, TunedIT provides a platform for organizing on-line interactive competitions for machine-learning problems.
This functionality may be used, for instance, by teachers to launch contests for their students in place of traditional assignments, or by organizers of machine-learning and data-mining conferences to launch competitions for the scientific community in association with the conference.

Consensus Multiobjective Differential Crisp Clustering for Categorical Data Analysis

Indrajit Saha, Dariusz Plewczyński, Ujjwal Maulik, Sanghamitra Bandyopadhyay
1 Interdisciplinary Centre for Mathematical and Computational Modeling (ICM), University of Warsaw, 02-089 Warsaw, Poland.
Email: (indra,darman)@icm.edu.pl
2 Department of Computer Science and Engineering, Jadavpur University, Kolkata-700032, West Bengal, India. Email: drumaulik@cse.jdvu.ac.in
3 Machine Intelligence Unit, Indian Statistical Institute, Kolkata-700108, West Bengal, India. Email: sanghami@isical.ac.in

Abstract. In this article, an evolutionary crisp clustering technique is described that uses a new consensus multiobjective differential evolution. The algorithm is therefore able to optimize two conflicting cluster validity measures simultaneously and provides a resultant Pareto-optimal set of non-dominated solutions. The problem of choosing the best solution from this Pareto-optimal set is then resolved by creating consensus clusters through a voting procedure. The proposed method is used for analyzing categorical data, where no natural ordering can be found among the elements of a categorical domain. Hence no inherent distance measure, such as the Euclidean distance, can be used to compute the distance between two categorical objects. Index-coded encoding of the cluster medoids (centres) is used for this purpose. The effectiveness of the proposed technique is demonstrated on artificial and real-life categorical data sets. A statistical significance test has also been carried out to establish the significance of the clustering results. A Matlab version of the software is available at http://bio.icm.edu.pl/~darman/CMODECC.
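
To make three ideas from the clustering abstract concrete (a distance for unordered categorical values, index-coded medoids, and a Pareto set of non-dominated solutions), the following is a minimal Python sketch. It is not the authors' Matlab implementation. The attribute-mismatch (Hamming) distance is an assumed choice, since the abstract does not name the measure used. The function names, the toy data and the objective values are hypothetical, and both objectives are assumed to be minimized. The differential evolution search and the consensus voting step are not shown.

```python
from typing import List, Sequence, Tuple


def mismatch_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of attributes on which two categorical objects differ.

    Assumption: a simple Hamming-style measure, used here only because
    categorical values have no natural ordering.
    """
    return sum(x != y for x, y in zip(a, b))


def assign_to_medoids(data: List[Sequence[str]],
                      medoid_indices: Sequence[int]) -> List[int]:
    """Decode an index-coded solution into cluster labels.

    Each entry of medoid_indices is the position in `data` of the object
    acting as a cluster medoid; every object joins its nearest medoid.
    """
    labels = []
    for obj in data:
        dists = [mismatch_distance(obj, data[m]) for m in medoid_indices]
        labels.append(dists.index(min(dists)))
    return labels


def non_dominated(points: List[Tuple[float, float]]) -> List[int]:
    """Indices of points not dominated by any other point (minimization)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical toy data: four objects described by three categorical attributes.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]

# Index-coded solution: objects 0 and 2 serve as the two medoids.
labels = assign_to_medoids(data, [0, 2])
print(labels)  # [0, 0, 1, 1]

# Hypothetical objective pairs for four candidate solutions.
objectives = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(non_dominated(objectives))  # [0, 1, 3]
```

In this sketch a candidate solution is just a list of data indices, so any medoid is always a real object from the data set and no averaging of categorical values is needed.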