A Novel Crossing Minimization Ranking Method

The ranking problem consists of comparing a collection of observations and deciding which one is “better.” An observation can consist of multiple attributes, multiple data types, and different orders of preference. Because of its diverse practical applications, such as webpage ranking, gene ranking, pesticide risk assessment, and credit-risk screening, the ranking problem has been receiving attention in machine learning and statistics. In this article, we present and discuss a novel, fast, clustering-based algorithmic ranking technique and provide the necessary theoretical groundwork. The proposed technique uses the interrelationships among the observations to perform ranking and is based on the crossing minimization paradigm from the domain of VLSI chip design. Using laboratory ranking results as a reference, we compare the algorithmic ranking of the proposed technique with that of two traditional ranking techniques: the Hasse Diagram Technique (HDT) and the Hierarchical Clustering (HC) technique. The results demonstrate that our technique generates better rankings than the traditional techniques and closely matches the laboratory results, which took days of work to obtain.
