Given a server that has reques

Google Interview Question for Software Engineer / Developers

4

of 6 votes

42
Answers
Given a server that has requests coming in. Design a data structure such that you can fetch the count of the number requests in the last second, minute and hour.
- AVK June 15, 2013 in United States | Report Duplicate | Flag | PURGE
Google Software Engineer / Developer Data Structures

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: Phone Interview

More Questions from This Interview

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 5 vote

- Use a queue implemented as a resizable array to store the timestamps of all new requests
- maintain head/tail pointers as usual
- Also maintain three pointers: ptr_last_1hr_from_tail, ptr_last_1min_from_tail, ptr_last_1sec_from_tail

Updates to the Queue:
everytime a new timestamp is enqueued (added to tail):
- continue dequeueing while(timstamp[head]-timestamp[tail] > 1hour) . this ensures that we dont continue to keep storing timestamps we dont need.
- set ptr_last_1hr_from_tail = head
- decrement ptr_last_1min_from_tail until (timestamp[ptr_last_1min_from_tail] -timestamp[tail] > 1min)
- decrement ptr_last_1sec_from_tail until (timestamp[ptr_last_1sec_from_tail]-timestamp[tail] > 1sec)

Fetch counts from Queue:
On every new fetch request:
- for last 1 hour count: set fetch_1hr_ptr to ptr_last_1hr_from_tail and decrement fetch_1hr_ptr wile timestamp[fetch_1hr_ptr]-fetch_timestamp > 1hr ;
- for last 1 min count: set fetch_1min_ptr to ptr_last_1min_from_tail and decrement fetch_1min_ptr wile timestamp[fetch_1min_ptr]-fetch_timestamp > 1min;
- for last 1 sec count: set fetch_1sec_ptr to ptr_last_1sec_from_tail and decrement fetch_1sec_ptr wile timestamp[fetch_1sec_ptr]-fetch_timestamp > 1sec ;
- req count in last 1hr = (fetch_1hr_ptr-tail)
- req count in last 1min = (fetch_1min_ptr-tail)
- req count in last 1sec = (fetch_1sec_ptr-tail)

Depending on the nature of the request traffic, we could also use binary search when updating the various pointers instead of linear decrement.

- whatevva' June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

what if there is no requests for more than a second duration? This method would render ptr_last_1_sec_from_tail invalid!!

- useless_nail February 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

well, if both fetching and queueing are in the same world( time framework), use same set of pointers as opposed to using last_1_houir_from_tail and fetch_last_1_hour ptrs.

- useful_nail February 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

fetching last_1{hour,min,sec}_counts is O(N). Is this possible in constant time ?

- useless_nail February 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Replace last_1_{hour,min,sec}_ptr with a ptr structure which would maintain a count of timestamps till tail. If you gonna add nodes to tail, +1 all the pointers structures. If you dequeue from head, just update last_1_hour_ptr unless you this runs over last_1_{min,sec}_ptrs, in which case update them.

- useful_nail February 27, 2014 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

Problem statement says that query can be for last second, last minute or last hour. We have to maintain list for each window.

My Solution:
--> Maintain a list of seconds, minutes, hour of size 60, 60 & 1 respectively.

Now, basically we have to run two operations:
1) Update all three array when user logs in.
2) Update all three array after each second. (All three array need not to be updated per second)

So array will be like:
second[60] = {0,0,0,0...0}
minute[60] = {0,0,0,0...0}
hour[24] = {0}

Now 1st operation: (Simple)
Whenever any user logs in, increment each arrays' 0th index. 0th index will always point to the last second, minute, hour.

Now 2nd operation:
Keep updating newest & keep outdating oldest index values. This can be done by timeout operation. Timeout value is 1 sec:

1) At each second:
    i) 60th node of array is removed & 0th node is added to second's array. (Not for initial 60 second when only insertion operation required) 
    ii) value of 60th node of second array is subtracted from 0th node of minutes array.(Not for initial 60 second)
2) At each 60th second, a new node with value of current node is inserted at the start of minute array. Now, minute array has 2 node
    i)  1st node will have value of 0th minute. (initial value of this node is same as old 0th node)
    ii) 2nd node will have value of 1st minute.
3) At each 60th min,
     i ) 60th node of minute array is removed & 0th node is inserted
     ii) hour index will be subtracted with 60th minute index value.

In above solution, I have taken array but all operation has been described as link link operation (insert & remove) for sake of simplicity.
So either we can take array of link list or these operation can be done like rotating array. We need to keep start & end index for each array.

- SK June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

at every second, hour array has to subtract the 60th element of second's array

- Anonymous June 23, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

The server requests is like a continuous input stream.
For the time consider a moving window with 1 second or 1 minute or 1 hour as the size of the window.
Consider last 1 second.
Maintain a HashMap<String, Integer> with key as the url and value as the count of the url requested.
Begin time is the time from when we want last 1 second url requests.
So from the log consider keep scanning till (time stamp for the request - begin time) <= 1 second.
For every such request satisfying the time criteria, if new request add the url to a hashmap with count as 1 ; else increment the count for the existing matching url.
Once a url is reached such that (time stamp for the request - begin time) > 1 second flush all the url and the counts onto some file and create a new HashMap.

- Subhajit June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Did you take into account that you may have millions of requests for one hour ?

- gevorgk June 18, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

A balanced binary tree such as AVL or red block and arrange nodes as per arrival time of request. Every node should also have a counter of no.of nodes rooted at the left child of that node.

Here all operations takes O(logn)

- Raj June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Use a max-heap and put the events as them come in. The latest event will be at the top. Time complexity is O(n log n) for inserts.
Then you can query the heap with max() in O(1) time and extract-max() in O(log n) time and get the events in last sec, min, or hour.

- oOZz June 15, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Isn't the time complexity for inserting into a max-heap O(log n)?

- niki June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@niki Yes, O(log n) for single insert, but O(n log n) for n requests

- oOZz June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Can you elaborate on a bit on how to get the number of events?
Heap returns the min or max in O(1) time. Also, one problem I see is removal of elements from the heap when the time window slides.

- arun_lisieux June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 3 votes

Max-heap, binary trees - all these add no value - the data are already created in a sorted order (by timestamp). The problem is with the concurrency.

- galli August 10, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Add request listener to track request.

long startTime = new Date().getTime();

int lastSecondRequests = 0;
long lastMinuteRequests = 0;
long lastHourRequests = 0;

long whichSecond = 0;
long whichMinute = 0;

void requestHandler(HttpRequest request) {
	long currentTime = new Date().getTime();
	
	long secondSpan = Math.round((currentTime - startTime) / 1000 + 0.5);
	long minuteSpan = Math.round((currentTime - startTime) / 60 * 1000 + 0.5);
	long hourSpan = Math.round((currentTime - startTime) / 3600 * 1000 + 0.5);

	if(hourSpan > whichHour)  {
		whichHour = HourSpan;
		lastHourRequests = 1;
	} else {
		lastHourRequests++;
	}

	if(minuteSpan > whichMinute)  {
		whichMinute = minuteSpan;
		lastMinuteRequests = 1;
	} else {
		lastMinuteRequests++;
	}
	
	if(secondSpan > whichSecond)  {
		whichSecond = secondSpan;
		lastSecondRequests = 1;
	} else {
		lastSecondRequests++;
	}
	
}

- Debasis Jana June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

I forgot to add whichHour ;)

Just you to need to retrieve last information not history, so I think there is o need to organize whole history data, is it?

- Debasis Jana June 16, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

not mutual exclusion at all?

- elbek June 17, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

If a request comes in 4.75 hours after start-time and another comes in 5.25 hours after start-time, then when you do the counts at 5.50 hours after start-time, are you going to count the 4.75 one as well?

Not sure if I understand your code correctly, but looks like when the 5.25 one comes in you would reset your lastHourRequests to 1?

- Sunny June 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Isn't it simpler to just store the timestamps of all requests from current time - hour up to current time in a queue?

You can create the a queue which contains all the timestamps as requests come in.
Then every time interval say every 1millisecond remove all the timestamps in the head which is 1 hour old.

All you have to do to get the count is is peek from the head of the queue up to last the time stamp which is current queue timestamp > current time - hour/minute/seconds.

Or alternatively just create a score array say int score[3] and update every millisecond by reading the queue as described above and update the counters.

- pedropenduko June 19, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Use a RingBuffer of Size 3600, each element of the buffer is an int (or maybe an AtomicInt) and represents a counter for one second.

We'll have a variable for the current postion in the buffer, this will be used by readers to figure out where to start reading. And by the writer to know when it's advanced and needs to set the current position's counter to 1.

On each update calculate the current postion: requests_time/1000 % 3600

class RequestCounter{
	int[] ring = new int[3600]; // this could be an AtmoicInt array to be thread safe.
        long currentPos = 0;
        public void increment(long requestTs){
            writePos = requestTs/1000 % 3600;
            if(writePos == currentPos){
                  ring[writePos]++;
            }else{
		 ring[writePos] = 1;
                 currentPos = writePos;
            }
        }
        public int  getCountForLast(int seconds){
            if(seconds > 3600) //Error?
  		 seconds = 3600;
            int count = 0;
            int stopAt = currentPos-seconds;
            for(pos = currentPos; pos > stopAt ; pos--){
                  count += ring[pos%3600] ;
            }
            return count;
       }
       //below write methods that call getCountForLast with 1, 60, 3600
  }

Added benefit : RingBuffers will better utilize cache lines and avoid False Sharing.

- esniff June 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

How are the counters reset if there are no request coming in for that specific second (assuming previously that specific second has a value but the counter value was 1 hour old)? This means there is a possibility that the counter values are incorrect depending on the timing of the requests coming in.

- pedropenduko June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

@slbb Your right. And it could be solved by recalculating the current position at read time by having the reader pass in a timestamp or taking one in the read method. Then resetting each counter between the old current_position and the new current_position. In fact you'll need to do that on writes too, when the counter resets.

Also I failed to mention that if you need to go the thread safe route with atomic ints, you'll also need to add double checked locks around the reseting of the counters. Personally this should be avoided, maybe by having an Actors own this, so only one thread is updating the ring.

- esniff June 20, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

One straight forward way may be:
Maintain a LinkedList of nodes with <timeStamp, request> pairs.
As requests come, keep adding to this linked list (sorted by timestamps).

Maintain 3 pairs of pointers.
1st pair of pointers = lastSecondStart, lastSecondEnd
2nd pair=lastMinuteStart, lastMinuteEnd
3rd pair=lastHourStart, lastHourEnd

After every second do below:
lastSecondStart = lastSecondEnd->next;
lastSecondEnd=move right to node by a 'second' using currentTime & node.timeStamp;

lastMinuteStart=move right to node by a 'second' using currentTime & node.timeStamp;
lastMinuteEnd=lastSecondEnd;

lastHourStart=move right to node by a 'second' using currentTime & node.timeStamp;
lastHourEnd=lastMinuteEnd;

Essentially 3 traversals to right. Num of nodes depends on num of requests in that second.

Caveats: 1.If there are not too many requests this works.
2. Concurrent requests can be handled by making add() method synchronized. But too many of these can tamper performance.

- X June 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I already answered to this question.With respect to that answer some1 asked me that I missed synchronization and told if I use lock then it may degrade the performance that means all request can't be tracked per second.

- Debasis Jana June 22, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

we need a data structure that can handle this.

I would go with already synchronized data structures like hashtable. With keys as "previous_hour", "pevious_minute", previous_second", "current_hour", "current_minute", current_second". I will also have variables for current_hour, current_minute, current_second, new_hour, new_minute,new_second.

Every time, a request comes in check if new_second is different than previous, then update the map. Or increase the current counts.

- akkkaa June 25, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I am going to handle this in a controversial way, which is to use a HashMap to store the #requests for each second. Then to calculate the #requests for the past second, minute and hour, I only need 3,600 checks to the HashMap, which is O(1). The downside is that the HashMap can get big really fast, and so we might also want to throw in some code to remove any obsolete keys from it.

class CountRequests {
	private long startTime = new Date().getTime();

	private HashMap<Long, Integer> counts = new HashMap<Long, Integer>();

	void handleRequest() {
		long currentTime = new Date().getTime();
		long offset = (currentTime - startTime) / 1000;
		synchronized(counts) {
			if(!counts.containsKey(offset))
				counts.put(offset, 0);
			counts.put(offset, 1 + counts.get(offset));
		}
	}

	// return the #requests in past second, minute, hour as an array
	int[] getCounts() {
		int[] freq = new int[3];
		long currentTime = new Date().getTime();
		long offset = (currentTime - startTime) / 1000;
		int total = 0;
		synchronized(counts) {
			for(int i=0; i<3600; i++) {
				long index = offset-i;
				if(counts.containsKey(index))
					total += counts.get(index);
				if(i == 0)
					freq[0] = total;
				else if(i == 59)
					freq[1] = total;
				else if(i == 3599)
					freq[2] = total;
			}
		}
		return freq;
	}
}

- Sunny June 27, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

The question is ill-formed. Not clear if they are asking for a continuous timeframe, or discrete time intervals. Discrete time interval problem is solvable... continuous is not (on a real machine executing real machine instructions each cycle).

- Jeff July 20, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I am thinking about a solution which behaves better in concurrent environment.

To lock less often and to save memory, I suggest that we keep the resolution say about 1%. This means we update the count each 10ms for the 1s number, each 0.6 s for the 1min and each 0.6 min for the 1 hour. The 1 min and 1 hour counts are just simpler variant of the same, so let's think only about the 1 sec timer.

We can have a AtomicLong count, to which it adds. When it realizes that 10ms has passed, it takes this count and sends it to a list with the timestamp, similarly to what others described. We can lock on sendToList, but usually there will be no concurrency because it happens just once in 10 ms. The "optimistic locking" in AtomicLong.compareAndSet takes care about the rest of the concurrency.

AtomicLong count = new AtomicLong(0);
// last timestamp
AtomicLong lt = new AtomicLong(System.currentTimeMillis()); 

void newRequest() {
	long c = count.getAndIncrement();
    long t = System.currentTimeMillis();
	long lt2 = lt.get();
	// each 10 ms
	if(t / 10 != lt2 / 10) {
		// try to write the new timestamp - on failure, assume other thread is just doing the same and do nothing
		if(lt.compareAndSet(lt2, t)) {
			sendToList(lt2, c);
		}
	}
}

- galli August 10, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

There is a very simple elegant solution to this using circular queue. Idea is to have a circular queue of size 3600*SamplesPerSec
At every sample interval, collect count till last sample and add it to circular queue.
- To get count of last 1 second, return sum of last SamplesPerSec elements
- To get count of last 1 minute, return sum of last SamplesPerSec*60 elements
- To get count of last 1 hr, return sum of last SamplesPerSec*3600 elements

All operation like returning sum and purging old entries are O(1)

- mithya September 14, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

interface RequestHandler {
    long numOfRequestsInLastSecond();

    long numOfRequestsInLastMinute();

    long numOfRequestsInLastHour();

    void count(LocalDateTime timestamp);
}

class RequestCounter implements RequestHandler {

    private AtomicInteger secondCounter = new AtomicInteger();
    private AtomicInteger minuteCounter = new AtomicInteger();
    private AtomicInteger hourCounter = new AtomicInteger();

    private static final DateTimeFormatter secondFormatter = DateTimeFormatter.ofPattern("MMddyyyyHHmmss");
    private static final DateTimeFormatter minuteFormatter = DateTimeFormatter.ofPattern("MMddyyyyHHmm");
    private static final DateTimeFormatter hourFormatter = DateTimeFormatter.ofPattern("MMddyyyyHH");

    private String currentSecondKey;
    private String currentMinuteKey;
    private String currentHourKey;


    public long numOfRequestsInLastSecond() {
        return secondCounter.longValue();
    }

    public long numOfRequestsInLastMinute() {
        return minuteCounter.longValue();
    }

    public long numOfRequestsInLastHour() {
        return hourCounter.longValue();
    }

    private void countSeconds(String secondKey) {
        secondCounter.getAndUpdate(existingValue -> {
            if (secondKey.equals(currentSecondKey)) {
                return existingValue + 1;
            } else {
                currentSecondKey = secondKey;
                return 1;
            }
        });
    }

    private void countMinutes(String minuteKey) {
        minuteCounter.getAndUpdate(existingValue -> {
            if (minuteKey.equals(currentMinuteKey)) {
                return existingValue + 1;
            } else {
                currentMinuteKey = minuteKey;
                return 1;
            }
        });
    }

    private void countHours(String hourKey) {
        hourCounter.getAndUpdate(existingValue -> {
            if (hourKey.equals(currentHourKey)) {
                return existingValue + 1;
            } else {
                currentHourKey = hourKey;
                return 1;
            }
        });
    }

    public void count(LocalDateTime timestamp) {
        countSeconds(secondFormatter.format(timestamp));
        countMinutes(minuteFormatter.format(timestamp));
        countHours(hourFormatter.format(timestamp));
    }
}

- Anand Mani June 30, 2016 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

... Just a long long circular queue of 3600 elements will suffice and all operations will take constant time.

- glassrose October 29, 2017 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 3 vote

Sorry for misspell ..
I am not sure Google really asks so easy question? :P

- Debasis Jana June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 1 vote

That is not easy question dude. Your answer does not take care of concurrency, so in a second there will be 1000 hits and 1000 threads trying to update the same lastSecondRequests, u might end up getting 1 or any value less than 1000 since no locking, if u lock it, u are loosing multithreading for server and billions of requests will be handled sequentially. This task is not that simple. We should not increment or decrement something, we should write changes in different cells maybe to avoid concurrency. Not sure.

- That is not easy question dude. June 17, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Mr. Jana - Can you answer that please?

- fuubar June 18, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Oops!, in excitement I forget to put them in synchronization ;)
Thanks a lot .... how I missed x-(

- Debasis Jana June 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I prefer to use locking on the numeric increment and decrement operation.
Does it really degrade the performance because numeric operation is so fast (few set of machine instructions)

- Debasis Jana June 19, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

don't say it's easy so easily.

Let's say startTime is 0:00 and now we're at 2:15, I say give me requests for the past hour, I expect you to return requests from 1:15 to 2:15

- lol July 28, 2013 | Flag

Comment hidden because of low score. Click to expand.

-1

of 1 vote

import java.util.Date;

class RequestHandler {
    
    long counter = 0;
    
    synchronized void incrementCounter() {
        counter++;
    }
    
    long getCounter() {
        return counter;
    }
    
}



class Request implements Runnable {
    
    RequestHandler handler = null;
    
    Request(RequestHandler handler) {
        this.handler = handler;
    }

    @Override
    public void run() {
        this.handler.incrementCounter();
    }
    
}


/**
 * 
 */
public class ThreadTest {
    
    static final long MAX_REQUEST = 18000;
    
    public static void main(String ... args) throws InterruptedException {
        RequestHandler handler = new RequestHandler();
        
        long start_time = new Date().getTime();
        System.out.println("Before request thread(s) creation .....");
        //System.out.println("Start time : " + start_time);
        
        for(long i = 0; i < MAX_REQUEST ; i++) {
            new Thread(new Request(handler)).start();
        }
        
        long end_time = new Date().getTime();
        System.out.println("Immediate after request thread(s) creation .....");
        System.out.println("Duration : " + (end_time - start_time));
        start_time = end_time;
        System.out.println("Request Counter : " + handler.getCounter());
        
        Thread.sleep(1000);
        
        end_time = new Date().getTime();
        //System.out.println("After one minute of request thread(s) creation.....");
        //System.out.println("Duration : " + (end_time - start_time));
        //System.out.println("Request Counter : " + handler.getCounter());
    }
    
}

I tried this code in my 8GB RAM, 64bit environment and 3.40 GHZ in windows environment with 18000 request.

and I got the result ..
Before request thread(s) creation .....
Immediate after request thread(s) creation .....
Duration : 967
Request Counter : 18000

- Debasis Jana June 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Here instead of using any complex DS or object, I think using simple synchronized increment operation it's possible to track much more request per second.

If I don't put the increment operation in synchronization then duration is not so less but just 10 15 ms less (though it's too much considerable in large data computing)

NB: Google has much more highly configured server ;)

- Debasis Jana June 21, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I believe you did not solve the problem. How are you going to get the count on the last second/minute/hour?

- pedropenduko June 21, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

- Debasis Jana June 22, 2013 | Flag

CareerCup

Google Interview Question for Software Engineer / Developers

Books

Videos

Resume Review

Mock Interviews