A scalable system for grouping large databases