Your job is to build a data st

Microsoft Interview Question

0

of 0 votes

20
Answers
Your job is to build a data structure to maintain a set of photographs. Your photograph database should allow you to insert and search for photographs, as well as to designate some of the photographs as favourites by marking them. In more detail, your data structure should support the following operations:

Insert(x, t, m): inserts photograph x that was taken at time t. If m = 1, then the photograph is marked as a favourite; if m = 0, then it is not a favourite.

Search(t): find the next photograph taken after time t.

NextFavorite(t): find the next photograph taken after time t that is a favourite.

Give a data structure for solving this problem, and explain how it works. For more efficiency, your data structure should be both time and space efficient: more points will be given for more efficient solutions. (Remember that the photographs themselves may be quite large.) Give the performance (i.e., running time) of each operation and the space usage of the data structure. You may assume that at any given time t, your camera has taken at most one photograph x. Describe your solution in words, or very high-level pseudocode. You do not need to provide complete details. You must provide enough detail, however, that your solution is clear.
- suneric May 28, 2015 in United States | Report Duplicate | Flag | PURGE
Microsoft Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: United States
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 1 vote

Use 2 B-Trees for storing the photo Node indexes based on their timestamp, one tree for favorites and other for non-favorites so search will be
O (Log N), you can search for first immediate node after Timestamp T, this will still be O (log N), insert and remove O(log N) each node of the tree would store a key value which would point to a HashMap<key, photo> which would store the actual photo data (or else retrieve it from a persistent storage).

- guilhebl May 29, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Correct.
You can reduce some key space if you hold images only in the full B-Tree and add links from the favorites B-Tree nodes to the proper node in the full B-Tree to retrieve the image. Thus avoiding the HashMap all together.

But image space is usually going to be much bigger then key space.

- Gil June 01, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Use a binary tree of some type (pick your favorite flavor). Each node of the tree would preserve a single photo and a flag indicating if the photo is a favorite. The tree is sorted on the time that the photographs are taken.

Insert(x, t, m): inserts a new tree node at a location 't' containing photo 'x' and flagged favorite = 'm'==1. This operation is O(log n) complexity and takes O(1) memory where n is the number of stored photos

Search(t): do a search for the first node that occurs immediately after 't'. This operation is O(log n) in complexity and O(1) memory where n is the number of stored photos

NextFavorite(t): do a search for the first node with the flag == true and time after 't'. This operation is also O(log n) in complexity and O(1) memory where n is the number of stored photos

For the most part, this is the 90% immediate solution. There are naturally other things that could be done like bucketing chunks of time or maintaining separate trees for the favorites vs all the photos.

- zortlord May 28, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Use multiple hash maps
like hashmap : (key date/value hashmap : (key hour/value hashmap : (key minutes/value tree storing data in order of time)) )...

The granularity can be a design decision, but should be flexible to change and add more HashMaps or remove them , may be by some intellegent layer to retreive the photos efficiently..

- argho May 29, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 1 vote

NextFavorite(t) would not be logarithmic time under that structure. You'd have to start an in-order traversal from the first node greater than t. Finding that first node is logarithmic but the traversal could touch every remaining node in the tree.

For instance, NextFavorite(0) in a tree where no nodes are marked favorite.

This could be addressed by using two trees, one of which only holds favorited pictures.

- Jon May 29, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

In your approach NextFavorite(t) will take O(n).

- Sergey May 29, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

I think a bigger concern should be the order of insertion as an input with ascending order as in reality could make all operations ( search , insert , delete) O(N),, hence better to go for red black trees

- monica_shankar June 01, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

B+ tree is a data structure that can be used here.

Leaf nodes contain (pointer to photograph stored in memory, time and favorite). and all leaf nodes are connected.

Now for inserting new record it will take O(logn base b) b - order of b+ tree
Search also it takes O(logn base b) Time complexity.

- srikiran May 28, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

- kiran007 May 28, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

- argho May 29, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

how would you perform the search in this case? can you detail your search operation and the time complexity for it?

- guilhebl May 29, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

guilhebi, correct, it is difficult to imagine separate chaining for searching and nextfavsearch.

- monica_shankar June 01, 2015 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

- argho May 29, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

In the game of Jack Straws, a number of plastic or wooden "straws" are dumped on the table and players try to remove them one-by-one without disturbing the other straws. In the question here, we are concerned with the following, related problem: First n straws are dumped on the table. Next, for every pair of straws, we want to determine if the straws are connected by a path of touching straws. In other words, given a list of the endpoints for n > 1 straws (as if they were dumped on a large piece of graph paper), determine all the pairs of straws that are connected. Note that touching is connecting, but also two straws can be connected indirectly via other connected straws. Give an algorithm to compute all the pairs of straws that are connected.
Here is some additional information that you can use:

i.As a basic step you need to determine whether two straws directly touch (intersect), i.e., whether straw ab with end points a and b intersects with straw cd (with end points c and d).
ii. To get full credit also explain how the elementary step i) can be done. If you cannot find an algorithm for this step, assume an O(n 2 ) algorithm and explain how to find all pairs of straws that are connected.
Analyse and explain the worst-case computational complexity of your algorithm in terms of n.

- Anonymous May 29, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Hi, can somebody provide the proper answer to this question please

- Poonam May 30, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Hi, Can somebody provide the proper solution to this question please

- Codie May 30, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

1. Have a linked list of timestamps that have at least 1 photo; ordered by time
2. Have a hash that maps from t to:
    a) the corresponding timestamp in the linked list above
    b) a set of photos taken at t that are not favourite
    c) a set of photos taken at t that are marked as favourite

The linked list of timestamps is to quickly find the next available timestamp in O(1) time.
The hash allows for finding a timestamp along with the next available timestamp and photos taken at t, all in O(1) time.

insert(x,t,m): add timestamp t to LL and hash if not exists. Add x to set of photos taken at t (which set depends on m).

search(t) and nextFavourite(t): use hash to find t in linked list, then call next() to get the next timestamp t' if exists. Can then use hash to find t' and the next photo after time t that has a favourite or non-favourite.

Overall, every type of operation (add, remove, search) will be O(1) time.
Space is O(n) where n is the number of photos.

- Jason May 30, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I have come up with BST approach

it would be a binary search tree indexed by time t

1. Insert operation: same as the standard put operation. Avg case O(lg N) worst case O(N) ,, memory O(1)

2. Search operation: this is like a ceiling operation in BST where we find the node that is immediately greater than the current key value
   Avg case: O(lg N) , worst case : O(N)  memory: O(1)
   
   public Node search(int t)
   {
      return search(root,t);  // Assume that we have a pointer to the root.
	  }
	  
	  private Node search(Node x, int t)
{
    if (x==null) return null;
	
	else if(compareTo(x.t,t)<0)   // x.t less than t , focus on right
       return search(x.right,t);
	   
	else if(compareTo(x.t,t)>=0)
{
    Node t=min(x.left);  // check to ensure that the we don't have an element which is smaller than the current element x
	if(t!=null)
	return t;
	else
	return x;
	}
	}
	
3. NextFavourite(t): similar operation as above but replace min(x.left) with minfav(x.left) 
   avg case : O(lg N) ,, worst case O(N) , memory O(1)
   
   minfav(Node x)
   {
   
   Node y;
   
   if(x==null) return null;
   
   y=minfav(x.left);
   
   if(y==null)
   {
   if(x.fav==1);
   return x
   else
   y=minfav(x.right);
   }
   return y;
   }

- monica_shankar June 01, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

Here is the complete solution for the problem
Please note that, the author has emphasized on the size of data, so we will store the photographs in external memory only.

here is the background and assumptions behind the algorithm

We will use B+ tree to implement this, where all the internal nodes only store the indexes and all the terminal node stores the real data(the photographs)
The indexes will stay in the main memory only where the assumption the indexes can stay in the memory(primary memory), if not, we have to save the indexes also into the secondary memory, which will increase few I/O operations.
we will keep two trees, one to store normal images and another to store favourites
Here is the algorithm for Search(t)

Search(t)
	Figure out the memory block where the photographs are stored which are inserted after t both from normal tree and favourite-tree.
		Node = T.Root;Directive = Next_Comparison_Directive(null); // return year 
		while(true)
			If Node.IsDevisionPresent
				Node = Node.Child_a ? Node.Child_a.Year = t.Year(can use BST to figure out the child, as not Year have the child)
				Directive = Next_Comparison_Directive(Directive); // return in following order Year > Month > Day > Hour > Minute > Second > MillSecond as so on
			else
				read Node.Photograph block from secondary memory(we may have to read two blocks)
				break;
	After that, you can do the BST to find out an images whose insert time is just next to given t
	return the one having minimum time

Here is the algorithm for NextFavourite(t)

NextFavourite(t)
	Figure out the memory block where the photographs are stored which are inserted after t from favourite-tree.
		Node = T.Root;Directive = Next_Comparison_Directive(null); // return year 
		while(true)
			If Node.IsDevisionPresent
				Node = Node.Child_a ? Node.Child_a.Year = t.Year(can use BST to figure out the child, as not Year have the child)
				Directive = Next_Comparison_Directive(Directive); // return in following order Year > Month > Day > Hour > Minute > Second > MillSecond as so on
			else
				read Node.Photograph block from secondary memory(we may have to read two blocks)
				break;
	After that, you can do the BST to find out an images whose insert time is just next to given t
	return the image.

Here is the algorithm for Insert(x,t,m)

Insert(x,t,m)
	Based upon m decide the tree;
	Figure out the memory block where the photographs can be stored with t time.
		Node = T.Root;Directive = Next_Comparison_Directive(null); // return year 
		while(true)
			If Node.IsDevisionPresent
				Node = Node.Child_a ? Node.Child_a.Year = t.Year(can use BST to figure out the child, as not Year have the child)
				if(Node is null or no node exists, add a new child to parent node with directive value, and insert this new data into the block of this new node).
				Directive = Next_Comparison_Directive(Directive); // return in following order Year > Month > Day > Hour > Minute > Second > MillSecond as so on
			else
				read Node.Photograph block from secondary memory(we may have to read two blocks)
				break;
	insert the entry at proper location in bst.
	while(node is full) // means the block size is full
		set Node.IsDevisionPresent = true;
		figure out distinct count of Directive time units from the block;
		Create one child for each, and set there IsDevisionPresent = false;
		set the list of photographs into each of the child based upon the Directive.
		If any of the node is full(actually there will be one only)
			set node = node.child which is full.
		else
			save all the nodes block to secondary memory.
			break;

Complexities

Search 
	Time : Constant(which is the search time) + 4.Log(M), where M is the a memory block size + 4 I/O
NextFavourite
	Time : Constant(which is the search time) + 2.Log(M), where M is the a memory block size + 2 I/O
Insert
	Time : 
		Best case : Constant(which is the search time) + Log(M), where M is the a memory block size + 2 I/O, which occur most of the time,
		Node spiting case : Constant(which is the search time) + Log(M), where M is the a memory block size +  M.I/O
		With little modification, we can reduce Node spiting case Complexities to the best case, we can maintain Min and Max insert time of photos which are residing in a block at last internal node(which are just above the terminal node)
Space : O(N) : size of the input in secondary memory + constant size of internal memory.

- sonesh June 05, 2015 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

In the game of Jack Straws, a number of plastic or wooden "straws" are dumped on the table and players try to remove them one-by-one without disturbing the other straws. In the question here, we are concerned with the following, related problem: First n straws are dumped on the table. Next, for every pair of straws, we want to determine if the straws are connected by a path of touching straws. In other words, given a list of the endpoints for n > 1 straws (as if they were dumped on a large piece of graph paper), determine all the pairs of straws that are connected. Note that touching is connecting, but also two straws can be connected indirectly via other connected straws. Give an algorithm to compute all the pairs of straws that are connected.
 Here is some additional information that you can use: 

i.As a basic step you need to determine whether two straws directly touch (intersect), i.e., whether straw ab with end points a and b intersects with straw cd (with end points c and d).
ii. To get full credit also explain how the elementary step i) can be done. If you cannot find an algorithm for this step, assume an O(n 2 ) algorithm and explain how to find all pairs of straws that are connected.
Analyse and explain the worst-case computational complexity of your algorithm in terms of n.

- Anonymous May 29, 2015 | Flag Reply

CareerCup

Microsoft Interview Question

Books

Videos

Resume Review

Mock Interviews