Sketching Algorithms for Big Data Fall 2017 Lecture 2 — September 5

In the last lecture, we surveyed the topics for this course, reviewed some probability theory, and considered Morris’s algorithm for approximate counting with small registers. In this lecture, we focus on another streaming problem: counting the distinct elements in a stream. We first consider an idealized solution to this problem, then outline a non-idealized solution that relies on k-wise independent hash functions.