Order Statistics in Random-Order Data Stream

10 Apr
Tuesday, 04/10/2012 12:00pm to 1:00pm
Theory Seminar

Dan Stubbs, a UMass CMPSCI undergraduate

Computer Science Building, Room 140

Faculty Host: David Mix Barrington

The problem of finding characteristic elements of a data set naturally emerges in the context of massive data streams. Algorithms for exact or approximate order statistics using space sublinear in the input length are of great interest. We present two algorithms for finding order statistics in random order streams. The first extends a median-finding algorithm from 1980 due to Munro and Paterson and exactly finds elements of rank k in O(sqrt(klogn)) space with high probability. The second, from 2012, due to McGregor and Valiant, finds an additive n^(1/3+o(1)) approximation to the median storing only a constant number of counters.