This is not the obvious solution to the problem? I used to use this method to compute test averages in my head when I was in elementary/middle school. Teachers were often willing to read off the (anonymized) scores of a test, but weren't willing to do the math and figure out the average. Also used it in college to figure out what grades I needed to stay above a 3.0 GPA, and on the SATs and AHSME to figure out my margin of safety.
IMHO, a more interesting interview question is:
"How would you efficiently compute a sliding-window average, where each number has an associated time and each time a number comes in, you want to quickly compute the average of all numbers whose timestamps fall within the last 5 minutes?"
This type of problem comes up all the time in the financial industry - I had to implement it for my last employer. The best algorithm I can think of takes amortized constant time and linear (proportional to window size) space. Anyone able to think of a better one? Wall Street wants you. ;-)
The best algorithm I can think of takes amortized constant time and linear (proportional to window size) space. Anyone able to think of a better one? Wall Street wants you. ;-)
It's easy to prove that linear space is optimal -- you can't store N bits of information using less than N bits of space.
Well, here's a naive one written in Lua. it keeps a fifo list of {timestamp, number} pairs. The fifo is implemented as a double-ended queue in a closure. The averaging function takes in the current time and a new number. Pop off the pairs that are outside of our sliding window and subtract their value from a tally. Push on our new value and add its value to the tally. Return the tally divided by the size of the queue. This algorithm takes linear time with respect to the rate of transactions (i.e. how many do we expire each call) and linear space with respect to window size (how many entries fit in our sliding window).
function create_sliding_averager (window_span, debug)
local window_span = window_span -- seconds
local tally = 0
local fifo = { tail = 1, head = 0 }
local debug = debug
return function (amount, current_time)
local head = fifo.head + 1
local tail = fifo.tail
fifo[head] = { timestamp = current_time, amount = amount }
tally = tally + amount
local cutoff, old_time = current_time - window_span, nil
if fifo[tail] then
old_time = fifo[tail].timestamp
end
if debug then
print ('cutoff: ', cutoff, 'old time:',
old_time, 'current:', current_time)
end
while old_time ~= nil and old_time < cutoff do
tally = tally - fifo[tail].amount
if debug then
print ('expiring: ', fifo[tail].timestamp, fifo[tail].amount)
end
fifo[tail] = nil
tail = tail + 1
old_time = fifo[tail].timestamp
end
fifo.tail = tail
fifo.head = head
if debug then print(tail, head) end
return 'avg: '..(tally / (head - tail + 1)), 'size: '..(head - tail + 1)
end
end
averager = create_sliding_averager(10) -- 10 second average
while true do
local rand = math.random()
local time = os.time()
print (averager(rand, time))
end
I would usually use a small library to implement (push(), pop(), peek()) on the fifo but this example will run in a stock interpreter. On my system the average for rand() hovers around .5 so it passes my initial smell test. It's hovering around 80k entries in a busyloop with a 10 second window.
This is a pretty brute-force solution. I'll see if I can think of something more elegant.
That was the algorithm I used, though of course it was in Java. It's actually amortized constant time (assuming a sensible dequeue implementation), because each number you push on can be popped off only once. If you end up popping lots of entries at once, it means that you went a long time without popping anything as the dequeue was populated.
IMHO, a more interesting interview question is:
"How would you efficiently compute a sliding-window average, where each number has an associated time and each time a number comes in, you want to quickly compute the average of all numbers whose timestamps fall within the last 5 minutes?"
This type of problem comes up all the time in the financial industry - I had to implement it for my last employer. The best algorithm I can think of takes amortized constant time and linear (proportional to window size) space. Anyone able to think of a better one? Wall Street wants you. ;-)