Finding modes with equality comparisons

Abstract. We consider the comparison complexity of finding modes (the most frequently occurring elements) in a list of elements that are not necessarily from a totally ordered set. Here, the relation between elements is determined by equality comparisons, whose outcome is = when the two elements being compared are equal and ≠ otherwise. The problem generalizes the classical majority problem, which has been studied in this same equality-comparison model. We show that $n^2/(2m) - n/2$ comparisons are necessary and $n^2/m + n$ comparisons are sufficient to find an element that appears at least $m$ times. This is in sharp contrast to the $\Theta(n \log(n/m))$ bound in the model where comparisons return $<, =, >$ or $\le, >$. We give three algorithms for finding a mode, including one that generalizes a classical majority-finding algorithm due to Fischer and Salzberg (1982) [9]. We also discuss upper and lower bounds for sorting (i.e., finding the frequency of every element) and for finding the least frequent element. The sorting problem under equality comparisons, also known as equivalence class sorting, has applications in several scenarios where a total order on the elements either does not exist or cannot be revealed for security reasons.
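To make the equality-comparison model concrete, here is a minimal Python sketch. It is not one of the three algorithms of the paper, and it does not attain the bounds stated in the abstract. The function names `mode_by_equality` and `majority_by_equality` and the callback `eq` are hypothetical, introduced only for illustration. The first function groups elements into equivalence classes by testing each element against one representative per class, and counts the equality tests it uses. The second is the well-known voting scheme for the majority problem (in the style of Boyer and Moore's MJRTY, not Fischer and Salzberg's refinement), followed by a verification pass.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def mode_by_equality(
    items: Sequence[T], eq: Callable[[T, T], bool]
) -> Tuple[Optional[T], int, int]:
    """Return (a mode, its frequency, number of equality tests used).

    Naive baseline: each element is tested against one representative
    of every class found so far, so up to n * k tests for k classes.
    """
    reps: List[T] = []      # one representative per equivalence class
    counts: List[int] = []  # counts[i] = size of the class of reps[i]
    comparisons = 0
    for x in items:
        for i, r in enumerate(reps):
            comparisons += 1
            if eq(x, r):
                counts[i] += 1
                break
        else:
            # x matched no known class, so it starts a new one
            reps.append(x)
            counts.append(1)
    if not reps:
        return None, 0, comparisons
    best = max(range(len(reps)), key=counts.__getitem__)
    return reps[best], counts[best], comparisons


def majority_by_equality(
    items: Sequence[T], eq: Callable[[T, T], bool]
) -> Optional[T]:
    """Return the element occurring more than n/2 times, or None."""
    candidate: Optional[T] = None
    votes = 0
    for x in items:
        if votes == 0:
            candidate, votes = x, 1  # adopt a new candidate, no test needed
        elif eq(x, candidate):
            votes += 1
        else:
            votes -= 1
    # Verification pass: the surviving candidate may not be a majority.
    occurrences = sum(1 for x in items if eq(x, candidate))
    return candidate if 2 * occurrences > len(items) else None


if __name__ == "__main__":
    same = lambda a, b: a == b  # the only operation the model allows
    print(mode_by_equality(["b", "a", "b", "c", "b", "a"], same))
    print(majority_by_equality([1, 2, 1, 1, 3, 1], same))
```

Both functions touch the input only through `eq`, so they would also work for objects that have no meaningful ordering. The comparison counter in the first function gives a simple way to compare a run against the $n^2/m + n$ upper bound quoted in the abstract.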

[1] Jayadev Misra et al. Finding Repeated Elements, 1982, Sci. Comput. Program.

[2] J. Ian Munro et al. Sorting and Searching in Multisets, 1976, SIAM J. Comput.

[3] S. Srinivasa Rao et al. Sorting and Selection with Equality Comparisons, 2015, WADS.

[4] P. Erdös. On an extremal problem in graph theory, 1970.

[5] G. Dirac. Some Theorems on Abstract Graphs, 1952.

[6] B. Bollobás et al. Extremal Graph Theory, 2013.

[7] Michael T. Goodrich et al. Parallel Equivalence Class Sorting: Algorithms, Lower Bounds, and Distribution-Based Analysis, 2016, SPAA.

[8] S. Srinivasa Rao et al. Finding Mode Using Equality Comparisons, 2016, WALCOM.

[9] Robert S. Boyer et al. MJRTY: A Fast Majority Vote Algorithm, 1991, Automated Reasoning: Essays in Honor of Woody Bledsoe.

[10] Erik D. Demaine et al. Frequency Estimation of Internet Packet Streams with Limited Space, 2002, ESA.

[11] J. Ian Munro et al. Compressed Data Structures for Dynamic Sequences, 2015, ESA.

[12] David P. Dobkin et al. Determining the Mode, 1980, Theor. Comput. Sci.

[13] René Schott et al. The Average-Case Complexity of Determining the Majority, 1997, SIAM J. Comput.

[14] Edward M. Reingold et al. Determining the Majority, 1993, Inf. Process. Lett.

[15] Michael Werman et al. On computing majority by comparisons, 1991, Comb.

[16] Edward M. Reingold et al. On the Optimality of Some Set Algorithms, 1972, JACM.

[17] Edward M. Reingold et al. Analysis of Boyer and Moore's MJRTY algorithm, 2013, Inf. Process. Lett.