anantpushkar009
BAN USER3 words : Unweighted Interval Scheduling. The twist here is that you can only select an interval[j] after having selected interval[i] if it starts after interval[i].end + w[i][j] and is not in the same week as i. Here is the algorithm.
1. Convert all time values to hour for consistency
2. Each site represents an interval. If not given in that format already, create an array containing start , end and index of site form which this interval was taken. Add an interval with start==end with index=start site.
3. Sort that array based on end time
4. Initialise count =0 and start with first interval and do the following till all intervals are exhausted:
4.1 Let i = current interval
4.2 Iteratively find the first j such that interval[j].start>=interval[i].end + w[i][j] and is not in the same week as i
4.3 i = j ; ++count
5. return count
Above is a greedy solution to Unweighted Interval Scheduling. To see the proof why it works: [www] .cse.psu.edu/~asmith/courses/cmpsc465/S12/lec-notes/CMPSC465-S12-Lec-17-greedy-algs.pptx.pdf , slide 9
@szilard, though the chances of that happening are less, but that is a valid point. I missed that one out.
If the median is not on the master server, then we will detect that when the working set becomes empty on the master server, in which case we will have to reassign the master (a server with an IP addr just greater than that) and start over again.
However, if we could possibly find which one server has the median, we could get over this road block, but as in any other distributed system I think it would be as difficult as finding the median itself.
The first part is straight from Crack the Coding Interview book. The idea is to maintain 2 more stacks to record the info about the min and max in the stack before pushing an element. Here is an implementation with a few test cases:
#include <iostream>
#include <stack>
#define val first
#define id second
using namespace std;
// Stack with O(1) top/min/max.  Each push is tagged with a unique,
// monotonically increasing id; pushes that set a new strict minimum or
// maximum are shadowed onto auxiliary stacks, and pop() retires the
// auxiliary entries whose tag matches the popped value.
//
// Fixes over the original: no reliance on the file-global
// `#define val first` / `#define id second` macros (which silently
// renamed the private member `id` to `second`), and pop() no longer
// `return`s a void expression.
//
// Precondition for top()/min()/max()/pop(): the stack is non-empty.
template<class T>
class MinMaxStack{
    int next_tag;                                  // tag assigned to the next push
    std::stack<std::pair<T,int> > value_stack;     // (value, tag) in push order
    std::stack<std::pair<T,int> > min_stack;       // entries that were a new strict minimum
    std::stack<std::pair<T,int> > max_stack;       // entries that were a new strict maximum
public:
    MinMaxStack() : next_tag(1) {}

    // Push obj; record it in the min/max stacks when it is a new extreme.
    // Strict comparisons are deliberate: a duplicate of the current extreme
    // is not shadowed, and popping it leaves the earlier entry in place.
    void push(T obj){
        std::pair<T,int> entry = std::make_pair(obj, ++next_tag);
        value_stack.push(entry);
        if(min_stack.empty() || obj < min_stack.top().first){
            min_stack.push(entry);
        }
        if(max_stack.empty() || obj > max_stack.top().first){
            max_stack.push(entry);
        }
    }

    // Most recently pushed value.
    T top(){
        return value_stack.top().first;
    }

    // Largest value currently on the stack.
    T max(){
        return max_stack.top().first;
    }

    // Smallest value currently on the stack.
    T min(){
        return min_stack.top().first;
    }

    // Remove the top value and retire matching extreme entries by tag.
    void pop(){
        std::pair<T,int> entry = value_stack.top();
        value_stack.pop();
        if(!max_stack.empty() && entry.second == max_stack.top().second){
            max_stack.pop();
        }
        if(!min_stack.empty() && entry.second == min_stack.top().second){
            min_stack.pop();
        }
    }
};
// Demo driver: push five values, then pop four, printing
// "top max min" after every operation.
int main() {
    MinMaxStack<int> s;
    const int values[5] = {1, 4, 2, 3, 10};
    for (int i = 0; i < 5; ++i) {
        s.push(values[i]);
        cout << s.top() << " " << s.max() << " " << s.min() << endl;
    }
    for (int i = 0; i < 4; ++i) {
        s.pop();
        cout << s.top() << " " << s.max() << " " << s.min() << endl;
    }
    return 0;
}
The second part though is unclear about k. Is it a constant? or is it a variable provided with the query. If it is a constant then satisfying the query in O(1) time is possible using BST (set in STL). push and pop become O(logn). Here is an implementation
#include <iostream>
#include <stack>
#include <set>
using namespace std;
// Stack that additionally reports the k-th smallest of its current
// contents in O(1), with O(log n) push/pop.
//
// Invariant: `prev` holds the k smallest elements currently on the stack
// (all of them while fewer than k are present) and `next` holds the rest,
// so the k-th smallest is always *prev.rbegin().
//
// Fix over the original: pop() dereferenced next.begin() without checking
// that `next` was non-empty, which is undefined behavior whenever the
// stack holds exactly k elements (e.g. k=2: push 1, push 2, pop).
// Also avoids the signed/unsigned comparison prev.size() < k.
template <class T>
class KMedianStack{
    std::stack<T> val_s;            // values in stack (push) order
    std::multiset<T> prev, next;    // k smallest / remaining elements
    int k;
public:
    KMedianStack(int x) : k(x) {}

    void push(T obj){
        val_s.push(obj);
        if((int)prev.size() < k){
            // Fewer than k elements so far: everything belongs in prev.
            prev.insert(obj);
            return;
        }
        if(obj > *prev.rbegin()){
            next.insert(obj);
        }else{
            // obj displaces the current k-th smallest into next.
            next.insert(*prev.rbegin());
            prev.erase(prev.find(*prev.rbegin()));
            prev.insert(obj);
        }
    }

    // Top of the stack (precondition: non-empty).
    T top(){
        return val_s.top();
    }

    // k-th smallest element (precondition: at least k elements present).
    T get_kth(){
        return *prev.rbegin();
    }

    // Remove the top value while maintaining the prev/next invariant.
    void pop(){
        T obj = val_s.top();
        val_s.pop();
        if(!next.empty() && obj > *prev.rbegin()){
            // obj lives in next; the k smallest are untouched.
            next.erase(next.find(obj));
        }else{
            // obj is one of the k smallest (or the stack held <= k values).
            // Remove it and, if possible, promote the smallest element of
            // next so prev keeps exactly k elements.
            prev.erase(prev.find(obj));
            if(!next.empty()){
                prev.insert(*next.begin());
                next.erase(next.begin());
            }
        }
    }
};
// Demo driver for KMedianStack with k = 2: after the first two pushes,
// every push/pop is followed by a "top kth" status line.
int main() {
    KMedianStack<int> s(2);
    s.push(3);
    s.push(2);
    cout << s.top() << " " << s.get_kth() << endl;
    const int more[2] = {6, 1};
    for (int i = 0; i < 2; ++i) {
        s.push(more[i]);
        cout << s.top() << " " << s.get_kth() << endl;
    }
    for (int i = 0; i < 2; ++i) {
        s.pop();
        cout << s.top() << " " << s.get_kth() << endl;
    }
    return 0;
}
If however, k is given as a variable with the query, I cannot really think of a way to satisfy the query in O(1) time. However , doing that in O(k) time is possible by using BST (set in STL). Here is an implementation:
#include <iostream>
#include <stack>
#include <set>
using namespace std;
// Stack that can report the k-th smallest of its current contents.
// push/pop are O(log n); get_kth is O(k), walking the ordered multiset
// from its minimum.
//
// Fixes over the original: removed the member `int id`, which was
// initialized to 1 and never read, and replaced the hand-rolled iterator
// loop in get_kth with std::advance.
template<class T>
class MedianStack{
    std::stack<T> val_s;       // values in stack (push) order
    std::multiset<T> order;    // the same values, kept sorted
public:
    MedianStack(){}

    void push(T obj){
        val_s.push(obj);
        order.insert(obj);
    }

    // Most recently pushed value (precondition: non-empty).
    T top(){
        return val_s.top();
    }

    // Remove the top value from both structures (precondition: non-empty).
    void pop(){
        order.erase(order.find(val_s.top()));
        val_s.pop();
    }

    // 1-based k-th smallest element (precondition: 1 <= k <= size).
    T get_kth(int k){
        typename std::multiset<T>::iterator it = order.begin();
        std::advance(it, k - 1);
        return *it;
    }
};
int main() {
MedianStack<int> s;
s.push(3);
cout<<s.top()<<" "<<s.get_kth(1)<<endl;
s.push(1);
cout<<s.top()<<" "<<s.get_kth(1)<<" "<<s.get_kth(2)<<endl;
s.push(4);
cout<<s.top()<<" "<<s.get_kth(1)<<" "<<s.get_kth(2)<<" "<<s.get_kth(3)<<endl;
s.pop();
cout<<s.top()<<" "<<s.get_kth(1)<<" "<<s.get_kth(2)<<endl;
s.pop();
cout<<s.top()<<" "<<s.get_kth(1)<<endl;
return 0;
}
This is very similar to the DP proposed above but implemented using memoization
class Solver:
    # Memoized DP: split pos[0..n) into k contiguous groups, where a
    # group's cost is the minimum total distance to gather its members at
    # one of their positions; minimize the total cost.
    # NOTE(review): Python 2 code (xrange, print statement).
    def getmindist(self , start , end):
        # Minimum total distance to collect pos[start:end] at a single
        # gathering point, memoized in distlookup.
        #print "mindist",start,end
        if end-start<=1:
            return 0
        if self.distlookup[start][end] != -1:
            return self.distlookup[start][end]
        # Cost of gathering everything at pos[start] ...
        target = self.pos[start]
        dist = 0
        for x in self.pos[start:end]:
            dist += abs(target-x)
        mindist = dist
        # ... then incrementally adjust the cost as the gathering point is
        # shifted to each later candidate position.
        for i in xrange(start+1 , end):
            delta = abs(self.pos[i]-target)
            dist += (i-start)*delta
            dist -= (end-i)*delta
            mindist = min(dist , mindist)
        #print "mindist",start,end,":",mindist
        self.distlookup[start][end] = mindist
        return mindist
    def evalcost(self,index,k ,total):
        # Best total cost of splitting pos[0..index] into k groups, given
        # `total` already accumulated for groups to the right.  The memo
        # entry lookup[index][k] stores the total-independent part; the
        # global best is tracked in self.mincost at the k==1 base case.
        #print "eval",index,k,total
        if index<0:
            return 0
        if k==1:
            cost = self.getmindist(0,index+1) + total
            self.mincost = min(self.mincost , cost)
            return cost
        if self.lookup[index][k]!=-1:
            return self.lookup[index][k]+total
        mincost = 100000000
        for i in xrange(index+1):
            # Try pos[i..index] as the last (rightmost) group.
            dist = self.getmindist(i,index+1)
            cost = self.evalcost(i-1 , k-1 , total+dist)
            mincost = min(cost-total , mincost)
        self.lookup[index][k] = mincost
        return mincost+total
    def getmincost(self , pos , k):
        # Entry point: echoes pos (Py2 print), then returns the best cost.
        print pos,
        self.pos = pos
        self.mincost = 100000000
        n = len(pos)
        self.distlookup = [[-1 for j in xrange(n+1)] for i in xrange(n+1)]
        self.lookup = [[-1 for i in xrange(k+1)] for j in xrange(n+1)]
        self.evalcost(n-1 , k , 0)
        return self.mincost
# Ad-hoc test cases.
s = Solver()
print s.getmincost([1,1,3] , 3)
print s.getmincost([1,2,4] , 2)
print s.getmincost([1,2,5,7] , 2)
Idea very similar to the others, but a more precise implementation in Python:
import sys
class Solver:
    # Memoized DP over an expression string; the exact problem statement
    # is not visible here.  NOTE(review): Python 2 code (xrange,
    # sys.maxint); base case below suggests the alphabet is {'x', '*'}.
    def get_min(self , start , end):
        # Best combined cost over all binary splits of the span [start, end).
        min_dist = sys.maxint
        for i in xrange(start+1 , end):
            dist = self.eval_exp(start , i) + self.eval_exp(i,end)
            min_dist = min(min_dist , dist)
        return min_dist
    def eval_exp(self , start , end):
        # Memoized cost for the span [start, end); the three candidates
        # differ by whether the trailing character is a '*' (cost 1 vs 2).
        if self.lookup[start][end]!=-1:
            return self.lookup[start][end]
        if self.exp[end-1] == "*":
            self.lookup[start][end] = min( self.get_min(start,end-1) , self.get_min(start , end)+1 , self.eval_exp(start,end-1)+1)
        else:
            self.lookup[start][end] = min( self.get_min(start,end-1)+1 , self.get_min(start , end)+1 , self.eval_exp(start,end-1)+2)
        return self.lookup[start][end]
    def get_dist(self , s):
        # Entry point.  Single characters cost 0 for 'x' and 1 otherwise.
        self.exp = s
        n = len(s)
        self.lookup = [[-1 for i in xrange(n+1)] for j in xrange(n+1)]
        for i in xrange(n):
            self.lookup[i][i+1] = 0 if s[i]=='x' else 1
        return self.eval_exp(0,n)
# Ad-hoc test case.
s = Solver()
print s.get_dist("*x*")
Use dp as follows:
DP1[i][j] = Number of ways to evaluate exp(i,j) as true
DP2[i][j] = Number of ways to evaluate exp(i,j) as false
Now use memoization to evaluate and return DP[0][n].
Following is an implementation in python:
class Solver:
    # Boolean parenthesization, memoized: count the ways to fully
    # parenthesize a boolean expression so that it evaluates to True.
    # `exp` is a flat list with operand booleans at even indices and
    # operator strings '&', '|', '^' at odd indices.
    # NOTE(review): Python 2 code (xrange, print statement, integer /).
    def count_ways(self , exp):
        print exp,
        self.exp = exp
        n = (len(exp)+1)/2        # number of operands (Py2 integer division)
        self.true_lookup = [[-1 for j in xrange(n+1)] for i in xrange(n+1)]
        self.false_lookup = [[-1 for j in xrange(n+1)] for i in xrange(n+1)]
        for i in xrange(n):
            # Base case: a single operand.
            self.true_lookup[i][i+1] = 1 if exp[2*i] else 0
            self.false_lookup[i][i+1] = 0 if exp[2*i] else 1
        return self.eval_true(0,n)
    def eval_true(self , start , end):
        # Ways to parenthesize operands [start, end) so the result is True.
        if self.true_lookup[start][end]!=-1:
            return self.true_lookup[start][end]
        count = 0
        for i in xrange(start+1 , end):
            # Split at operator exp[2*i-1]: left = [start, i), right = [i, end).
            if self.exp[2*i-1] == '&':
                count += self.eval_true(start,i)*self.eval_true(i,end)
            elif self.exp[2*i-1] == '|':
                count += self.eval_true(start,i)*self.eval_true(i,end)
                count += self.eval_true(start,i)*self.eval_false(i,end)
                count += self.eval_false(start,i)*self.eval_true(i,end)
            elif self.exp[2*i-1] == '^':
                count += self.eval_true(start,i)*self.eval_false(i,end)
                count += self.eval_false(start,i)*self.eval_true(i,end)
        self.true_lookup[start][end] = count
        return count
    def eval_false(self , start , end):
        # Ways to parenthesize operands [start, end) so the result is False.
        if self.false_lookup[start][end]!=-1:
            return self.false_lookup[start][end]
        count = 0
        for i in xrange(start+1 , end):
            if self.exp[2*i-1] == '&':
                count += self.eval_true(start,i)*self.eval_false(i,end)
                count += self.eval_false(start,i)*self.eval_false(i,end)
                count += self.eval_false(start,i)*self.eval_true(i,end)
            elif self.exp[2*i-1] == '|':
                count += self.eval_false(start,i)*self.eval_false(i,end)
            elif self.exp[2*i-1] == '^':
                count += self.eval_true(start,i)*self.eval_true(i,end)
                count += self.eval_false(start,i)*self.eval_false(i,end)
        self.false_lookup[start][end] = count
        return count
# Ad-hoc test cases.
s = Solver()
print s.count_ways([True , "&" , True , "|" , False])
print s.count_ways([True , "&" , False , "|" , False])
print s.count_ways([True , "^" , False , "|" , False])
print s.count_ways([True , "^" , False , "&" , False])
You are right, I cannot think of a way to eliminate the need to remove every line and run the algorithm separately .As in worst case we may not have any angle that intersects less than O(n) lines ,the worst case complexity will be O(n^2 * logn)
- anantpushkar009 November 02, 2014. Use DP. Maintain an n×m matrix parent where parent[i][j] = least x such that all entries vertically between (x,j) and (i,j) are 1's. Now for a given i, iterate through all contiguous series of 1's, updating the area each time. The max area is then returned. You may check your solution at oj.leetcode.com/problems/maximal-rectangle/. Here is my solution that got accepted. Turn debug to true to see debug messages for more clarity.
// Maximal all-'1' rectangle in a binary matrix of chars '0'/'1'.
// parent[i][j] = topmost row x such that column j holds '1' in every row
// from x down to i.  For each cell, scan leftward over a contiguous run
// of '1's, tracking the deepest common parent row, and update the best
// area seen.  O(n * m^2) time overall.
// Accepted on oj.leetcode.com/problems/maximal-rectangle/; set `debug`
// to true for trace output.
class Solution {
public:
    bool debug;
    int maximalRectangle(vector<vector<char> > &matrix) {
        debug = false;
        int n = matrix.size();
        if(n==0){
            return 0;
        }
        int m = matrix[0].size();
        vector<vector<int> > parent(n , vector<int>(m , 0));
        // Build `parent` row by row; row 0 keeps the default value 0.
        for(int i=1;i<n;++i){
            for(int j=0;j<m;++j)if(matrix[i][j]=='1'){
                if(matrix[i-1][j]=='1'){
                    parent[i][j] = parent[i-1][j];
                }else{
                    parent[i][j] = i;     // run of '1's starts at this row
                }
            }
        }
        if(debug){
            cout<<"Grid : "<<endl;
            for(int i=0;i<n;++i){
                for(int j=0;j<m;++j){
                    cout<<matrix[i][j]<<" ";
                }
                cout<<endl;
            }
            cout<<"=============="<<endl<<"Parent Mat : "<<endl;
            for(int i=0;i<n;++i){
                for(int j=0;j<m;++j){
                    cout<<parent[i][j]<<" ";
                }
                cout<<endl;
            }
            cout<<"================="<<endl;
        }
        int max_a=0 , a , index , p;
        for(int i=0;i<n;++i){
            for(int j=0;j<m;++j)if(matrix[i][j]=='1'){
                if(debug)cout<<"At "<<i<<","<<j<<endl;
                index = j;
                p = parent[i][j];
                // Grow the rectangle leftward from (i, j): its height is
                // limited by the deepest (largest) parent row seen so far.
                while(index>-1 && matrix[i][index]=='1'){
                    p = parent[i][index]<=p ? p : parent[i][index];
                    a = (j-index+1)*(i-p+1);      // width * height
                    max_a = max_a>a ? max_a : a;
                    if(debug)cout<<"\tindex:"<<index<<" p:"<<p<<" a:"<<a<<" max_a:"<<max_a<<endl;
                    --index;
                }
                if(debug)cout<<"================="<<endl;
            }
        }
        return max_a;
    }
};
def handle_removal(s1 , s2):
    """Return True iff deleting exactly one character from s1 yields s2.

    Precondition (guaranteed by one_edit_distance): len(s1) == len(s2) + 1.

    Fix: the original's early exit `if i==n-1: return not flag` produced
    false negatives, e.g. ("cat", "at") and ("xyz", "xz"), which are one
    deletion apart but were reported as False.
    """
    m = len(s2)
    skipped = False          # have we already skipped one character of s1?
    i = 0
    j = 0
    while j < m:
        if s1[i] == s2[j]:
            i += 1
            j += 1
        elif skipped:
            # Second mismatch: more than one deletion would be needed.
            return False
        else:
            # Skip the single extra character of s1.
            skipped = True
            i += 1
    # Either one character was skipped above, or the extra character is
    # s1's trailing one; both cases are exactly one deletion.
    return True
def handle_replacement(s1 , s2):
    """Return True iff s1 and s2 (equal length) differ in exactly one position."""
    length = len(s1)
    seen_diff = False
    for pos in range(length):
        if s1[pos] != s2[pos]:
            if seen_diff:
                # A second differing position: more than one replacement.
                return False
            seen_diff = True
    # True only when exactly one position differed (identical strings are
    # zero edits apart, hence False).
    return seen_diff
def one_edit_distance(s1 , s2):
    """Return True iff s1 and s2 are exactly one edit apart.

    An edit is a single insertion, deletion, or replacement; dispatch on
    the length difference.
    """
    diff = len(s1) - len(s2)
    if diff == 1:
        return handle_removal(s1 , s2)
    if diff == -1:
        return handle_removal(s2 , s1)
    if diff == 0:
        return handle_replacement(s1 , s2)
    # Lengths differ by 2 or more: no single edit can reconcile them.
    return False
# Ad-hoc test cases (Python 2 print statements): each line shows the two
# strings followed by whether they are exactly one edit apart.
print "cat" , "dog",one_edit_distance("cat" , "dog")
print "cat" , "cats",one_edit_distance("cat" , "cats")
print "cat", "cut",one_edit_distance("cat", "cut")
print "cat", "cast",one_edit_distance("cat", "cast")
print "cat", "at",one_edit_distance("cat", "at")
print "cat", "acts",one_edit_distance("cat", "acts")
print "xyz" , "xz" ,one_edit_distance("xyz" , "xz")
print "xyz" , "xz" ,one_edit_distance("xyz" , "xz")
print "xyz" , "xyk" ,one_edit_distance("xyz" , "xyk")
print "xy" , "xyz" ,one_edit_distance("xy" , "xyz")
print "xyz" , "xyz" ,one_edit_distance("xyz" , "xyz")
print "xyz" , "xzy" ,one_edit_distance("xyz" , "xzy")
print "x" , "xyz" ,one_edit_distance("x" , "xyz")
This idea builds up from the solution given by yoavzimmerman (by reducing the given problem to unweighted interval scheduling problem by using angles instead of points). The problem with that was, it failed because of cyclic nature of intervals. We can get through that in the following way:
1. If there exists an angle at which no line exists, ie, if there exists theta such that a ray from origin at theta wrt x-axis passes through no line, we rotate x-axis by angle theta. Now we no more have any cyclic nature in the intervals.
2. If the above does not exist, we choose a pair of lines that intersect, say l1 and l2. Both of these can not be together in a solution. So, we apply interval scheduling on L-l1 and L-l2 (both of these set will satisfy the condition#1) and return the max of these two.
To check existence of theta, run a line sweep algorithm[O(n)]. The rest can be done in nlogn time. So, total complexity nlogn.
EDIT : As in the worst case, every ray from origin might intersect O(n) number of lines, the worst case complexity will be O(n^2 * logn)
Though the idea seems nice but I see 2 problems:
1. Minor problem: Interval scheduling works because start<end for all interval. The question no where states that the lines will be given such that start point will subtend smaller angle than the end point wrt x-axis. You seem to have made no effort to ensure that start angle < end angle for all intervals (consider a line segment starting at 0.9,-0.436 and ending at 0.9,0.436: we would rather prefer the other way round for interval scheduling to work here, otherwise, we get stuck in an infinite loop). This can be easily fixed by sorting new_point before appending it to transformed.
2.Major problem: Consider the following example:
#angles in degree
transform(L) = [
[30 , 150],
[120,200],
[190,300],
[220,310],
[60,320]
]
If I understand you correctly, you will end up returning: [ [30,150] , [190,300]], even if the answer is: [[60,320] , [120,200] , [220,310]].
- anantpushkar009 November 02, 2014. Every possible candidate for the staff can be recognized by the tuple (mass , <i>.mass), where <i>.mass is the dot product of the index vector and the mass vector, just as stated in the question. Let's call this tuple the profile of the candidate.
Now, given a candidate, its other half will either have the same profile, or the profile with same mass, but the dot product as <i>.reverse(mass). So, iterate through all such candidates, store their profile in a list of dictionaries(dicts based on profile tuple) and return the pair with max length. Here is a simple implementation in python.
def com(mass):
    # "Centre of mass" profile of a digit string: returns (total mass,
    # first moment), i.e. (sum of digits, sum of digit * index).
    # NOTE(review): Python 2 code (xrange; integer division below).
    pos = 0
    m = 0
    n = len(mass)
    for i in xrange(n):
        x = int(mass[i])
        m += x
        pos += x*i
    return (m,pos)
def get_staff(mass):
    # Find the longest pair of equal-length substrings of `mass` whose
    # profiles match either directly or via the reversed substring, and
    # which do not overlap; returns the pair (or None).
    n = len(mass)
    lookup = [{} for x in xrange(n/2+1)]   # per length l: profile -> (i, j)
    max_len = 0
    max_val = None
    for i in xrange(n):
        for j in xrange(i+1 , n+1):
            l = j-i
            if l>n/2 : break               # each half can be at most n/2 long
            cg = com(mass[i:j])
            # Profile of the reversed substring.
            # NOTE(review): mass[j-1:i+1:-1] stops before index i+1, so it
            # is NOT the full reversal of mass[i:j] (that would be
            # mass[i:j][::-1]) — verify this slice is intended.
            comp_cg = com(mass[j-1:i+1:-1])
            if cg in lookup[l]:
                x = lookup[l][cg][0]
                y = lookup[l][cg][1]
                if max_len<l and (x<i or x>j) and (y<i or y>j):
                    max_val = (mass[i:j] , mass[x:y])
                    max_len = l
            if comp_cg in lookup[l]:
                x = lookup[l][comp_cg][0]
                y = lookup[l][comp_cg][1]
                if max_len<l and (x<i or x>j) and (y<i or y>j):
                    max_val = (mass[i:j] , mass[x:y])
                    max_len = l
            lookup[l][cg] = (i,j)
    return max_val
# Ad-hoc test case from the question.
print get_staff("9913125114123199")
Inplace, O(1) memory , O(n) time code in C:
#include <stdio.h>
#include <string.h>
#include <ctype.h>   /* isspace — was missing; the code relied on an implicit declaration */
/*
 * Reverse each whitespace-separated word of str in place, leaving every
 * whitespace character exactly where it was.
 *
 * Fixes over the original: isspace() is now given an unsigned char
 * (passing a negative plain char is undefined behavior), and the
 * function no longer depends on an implicit declaration of isspace
 * (see the added <ctype.h> include).
 *
 * O(n) time, O(1) extra space.
 */
void internal_reverse(char *str){
    char c;
    int n = strlen(str) , i=0 , j , next;
    while(i<n){
        if(isspace((unsigned char)str[i])){
            ++i;                      /* skip separator characters */
        }else{
            /* [i, j) is the current word */
            j = i;
            while(j<n && !isspace((unsigned char)str[j])){
                ++j;
            }
            next = j--;               /* remember where to resume; j = last char */
            /* reverse str[i .. j] with two converging indices */
            while(i<j){
                c = str[i];
                str[i] = str[j];
                str[j] = c;
                ++i;
                --j;
            }
            i = next;
        }
    }
}
/* Demo driver: reverse each word of a sample string in place and print
 * the string before and after. */
int main(void) {
    char str[100]="the \tboy ran";
    printf("Original String : %s\n",str);
    internal_reverse(str);
    printf("Processed string : %s\n",str);
    return 0;
}
Another approach would be to use regex to split and reverse string.Code in python:
import re
def internal_reverse(s):
    # Reverse each whitespace-delimited word of s, preserving the
    # whitespace runs between the words.
    # NOTE(review): Python 2 code (xrange, raw_input/print below).  The
    # patterns "\s*" and "\S*" can match the empty string; this relies on
    # Python 2's re.split ignoring zero-width matches — Python 3.7+
    # splits on them and produces different results.
    words = re.split("\s*" , s)
    whitespace = re.split("\S*" , s)
    out_s = [ words[0][::-1] ]
    n = len(words)-1
    # Interleave the original whitespace runs with the reversed words.
    for i in xrange(n):
        out_s.append(whitespace[i+1])
        out_s.append(words[i+1][::-1])
    return "".join(out_s)
# Read one line from stdin and print it with each word reversed.
print internal_reverse(str(raw_input()))
This can be done in linear time using memoization. Here is a code in python:
class Solver:
    """Memoized word-break: decide whether a text can be segmented
    entirely into words from a given dictionary."""

    def __init__(self , d):
        # d: iterable of dictionary words.
        self.dict = d

    def __check__(self , index):
        # True iff txt[index:] can be fully segmented.  Results are
        # memoized in self.lookup: 1 = yes, 0 = no, -1 = unknown.
        if index >= self.size:
            # Exactly consuming the text is success; overshooting is not.
            return index==self.size
        if self.lookup[index]!=-1:
            return self.lookup[index]==1
        for word in self.dict:
            if self.txt[index:].startswith(word) and self.__check__(index+len(word)):
                self.lookup[index] = 1
                return True
        # BUG FIX: this was `self.lookup[index]==0`, a no-op comparison,
        # so failures were never cached and could be recomputed
        # exponentially many times.
        self.lookup[index] = 0
        return False

    def is_concatenation(self,s):
        """Return True iff s is a concatenation of dictionary words."""
        self.size = len(s)
        self.txt = s
        # range (not xrange) is behavior-identical here and keeps this
        # class runnable on both Python 2 and 3.
        self.lookup = [-1 for _ in range(self.size)]
        return self.__check__(0)
Every time __check__ is called on an index, it computes the value only if it was not computed earlier. This means that any computation happens at most once for every index, giving a linear time complexity.
- anantpushkar009 October 29, 2014. Clearly, there are two kinds of costs involved here: 1. cost of computation happening within a server, 2. cost of communicating between servers.
Also worth noting is that the second cost is going to be the bottleneck here, so we better focus on saving bandwidth, and simultaneously, try not to over strain each server individually. Inspired by the classic quick select algorithm, I propose something that we could call distributed quick select algorithm:
1. Select a control/master server. Probably the one with the least IP addr. Standard protocol, not a big deal. [Something similar to the spanning tree protocol would do.]
2. Control/master server selects a pivot from its own array and broadcasts the value to the rest.
3. Each slave server partitions its own array based on this pivot value and sends the left , central and right count to the master server.
4. Based on the total number of elements in each slave server and those that the master itself has, master selects one of the left, or right partitions to work on . Each of the slave servers will now be working on this partition.
5. if total_left_count>=n/2: work on left partitions;repeat 2 to 4
else If total_left_count+total_central_count<=n/2: pivot is the median;
Else work on right partition; repeat 2 to 4
Total internal computation :Linear
Total number of network communications: nlogn [we will have to run an average logn cycles, each cycle taking linear number of communications]
The second cost would improve if we used a multicast network.
Quick select seems to be the appropriate algorithm for the first part.
And for the follow up, we can scale up quick select just the way we do with merge sort to get External Merge Sort. Lets call this External Quick Select. Following is an implementation in C++. I have taken the stream from file named "numlist". Also I have assumed that the numbers can be accommodated in long long unsigned int data type.
#include<iostream>
#include<cstring>
#include<cstdio>
#include<deque>
#include<queue>
#include<utility>
#include<vector>
#include<climits>
#include<algorithm>
#include<stack>
#include<fstream>
#include<ctime>
#include<sstream>
using namespace std;
bool debug=false;                     // set to true by the "DEBUG" command-line flag
typedef long long int lld;            // signed 64-bit shorthand (unused in this snippet)
typedef unsigned long long int llu;   // element/count type used throughout
// Produce a pseudo-random temporary file name: the decimal form of a
// value in [0, 99999].  Collisions with existing files are possible.
std::string get_filename(){
    const int token = rand() % 100000;
    std::ostringstream name;
    name << token;
    return name.str();
}
// Recursive external quickselect: returns the k-th smallest (1-based)
// number in the already-open stream `fptr`, whose file name is `fname`.
// The first number read serves as the pivot; the remaining numbers are
// partitioned into two freshly-created temporary files which the
// recursion then consumes.  Temporary files are removed as soon as they
// are no longer needed; the original input file "numlist" is never
// removed.
// NOTE(review): get_filename() can collide with an existing file name —
// verify this is acceptable for the intended use.
llu external_quick_select(fstream &fptr , llu k , string fname){
    llu pivot , num;
    fptr>>pivot;   // first number in the stream is the pivot
    llu left_count=0 , right_count=0 , center_count=0;
    fstream left;
    string left_name = get_filename();
    left.open(left_name , fstream::in | fstream::out | fstream::trunc);
    fstream right;
    string right_name = get_filename();
    right.open(right_name , fstream::in | fstream::out | fstream::trunc);
    // Partition everything after the pivot.  center_count counts
    // duplicates of the pivot value (excluding the pivot itself).
    while(fptr>>num){
        if(num<pivot){
            ++left_count;
            left<<num<<endl;
        }else if(num>pivot){
            ++right_count;
            right<<num<<endl;
        }else{
            ++center_count;
        }
    }
    if(k<=left_count){
        // Answer lies among the smaller elements: recurse on the left file.
        fptr.close();
        if(fname!="numlist")remove(fname.c_str());
        right.close();
        remove(right_name.c_str());
        left.close();
        left.open(left_name);   // reopen so reading restarts at the beginning
        return external_quick_select(left , k , left_name);
    }else if(k>left_count+center_count+1){
        // Answer lies among the larger elements; shift k past the pivot
        // block before recursing on the right file.
        fptr.close();
        if(fname!="numlist")remove(fname.c_str());
        left.close();
        remove(left_name.c_str());
        right.close();
        right.open(right_name);
        return external_quick_select(right , k-left_count-center_count-1,right_name);
    }else{
        // The pivot (or one of its duplicates) is the k-th smallest.
        fptr.close();
        if(fname!="numlist")remove(fname.c_str());
        left.close();
        remove(left_name.c_str());
        right.close();
        remove(right_name.c_str());
        return pivot;
    }
}
// Driver: reads k from stdin and prints the k-th smallest (1-based)
// number stored in the file "numlist".
int main(int argc , char **argv)
{
    if(argc>1 && strcmp(argv[1],"DEBUG")==0) debug=true;
    srand(time(NULL));   // seed the temp-file-name generator
    fstream fptr;
    fptr.open("numlist");
    llu k,num;           // NOTE(review): num is unused here
    cin>>k;
    cout<<external_quick_select(fptr,k,"numlist")<<endl;
    return 0;
}
There is a different version of KMP that is a little space inefficient, but does the trick in O(n) time anyways. It involves creating a DFA which can be run on the input text in linear time to identify if the given pattern in a substring. I could code it in 5 min...so hopefully doing so in an interview in 20 min should not be a problem.
I have not checked the code very exhaustively, so please let me know if there are any bugs.
#include<iostream>
#include<vector>
using namespace std;
// Substring search via a KMP automaton: build a DFA for `pat` over the
// full 8-bit alphabet, then scan `text` once; prints "found" iff pat
// occurs in text.  O(|pat| * 256 + |text|) time.
//
// Fixes over the original: characters are cast through unsigned char
// before indexing (plain char may be signed, and a negative index into a
// DFA row is out of bounds for non-ASCII input), and an empty pattern —
// possible when the read fails — no longer indexes dfa[0] (UB).
int main()
{
    std::string text, pat;
    std::cin >> text >> pat;
    if (pat.empty()) {
        // An empty pattern occurs trivially at position 0.
        std::cout << "found" << std::endl;
        return 0;
    }
    // dfa[state][c] = automaton state after reading byte c in `state`.
    std::vector<std::vector<int> > dfa(pat.size(), std::vector<int>(1 << 8, 0));
    dfa[0][(unsigned char)pat[0]] = 1;
    int i = 0, j = 1;   // i = fallback (border) state while filling state j
    while (j < (int)pat.size()) {
        for (int q = 0; q < (int)dfa[j].size(); ++q) {
            if ((unsigned char)pat[j] == q) dfa[j][q] = j + 1;  // match edge
            else dfa[j][q] = dfa[i][q];                         // copy the fallback row
        }
        i = dfa[i][(unsigned char)pat[j]];
        ++j;
    }
    // Run the automaton over the text; reaching state |pat| is a match.
    for (i = 0, j = 0; i < (int)text.size(); ++i) {
        j = dfa[j][(unsigned char)text[i]];
        if (j == (int)pat.size()) {
            std::cout << "found" << std::endl;
            break;
        }
    }
    return 0;
}
input format :
[text]
[pattern]
eg:
bottle
tl
Firstly, the elements in the array can belong to a finite set {0-9}. So, instead of sorting the array, just a class that can handle such frequency distribution will do. Creation of an instance and population of data will take place in linear time
Secondly, allowing 0 in the input array makes sense only if leading zeros were not allowed, otherwise, a linear time pre-processing can get rid of all the zeros. So, I have made a separate function to handle such a case which is invoked only if leading zeros are not allowed
Input format :
t //number of test cases
{//each test case
n//number of elements in array
za//=1 if leading zeros are not allowedother wise !=1
<ai> ith element of array
}
test cases that I have tested upon :
6
5 0
1 2 7 8 9
5 1
0 2 7 8 9
5 1
1 8 2 9 7
5 1
0 0 0 6 7
5 0
0 0 0 6 7
10 1
9 2 7 5 7 2 1 5 4 6
Expected output:
207//as given
287//208+79 : leading zero not allowed
207//just another permutation of given test case
670//600+70 : leading zero not allowed
13//006+07 : leading zero allowed
37146//12567+ 24579
Code :
#include<iostream>
#include<cstring>
#include<cstdio>
#include<vector>
using namespace std;
bool debug=false;
// Multiset of decimal digits (read from stdin) consumed in decreasing
// order via getMax().  When the leading-zero correction is enabled, zeros
// are emitted early so that neither of the two final (most significant)
// digits is 0.
//
// Fixes over the original: swapDigit was left uninitialized, and the
// member-initializer list did not match the declaration order (members
// are initialized in declaration order regardless of the list order).
class Set{
    vector<int> freq;    // freq[d] = multiplicity of digit d
    int maxDigit;        // largest digit that may still have freq > 0
    int swapDigit;       // digit temporarily displaced by a 0 (see manage_zero)
    int swap;            // 0 = no swap yet, 1 = swap active, 2 = swap undone
    bool zero_allowed;   // NOTE(review): despite the name, true ENABLES the
                         // avoid-leading-zero correction (input za == 1)
    // When exactly two non-zero digits remain and zeros are still present,
    // emit the zeros first by pretending 0 is the current maximum; restore
    // the displaced digit once the zeros run out.
    void manage_zero(){
        if(debug)cout<<"max digit : "<<maxDigit<<endl;
        if(zero_allowed){
            int sum_freq=0;
            for(int i=1;i<10;++i)sum_freq+=freq[i];
            if(swap==0 && sum_freq==2 && freq[0]!=0){
                if(debug)cout<<"Swapping in 0 from "<<maxDigit<<endl;
                swap=1;
                swapDigit=maxDigit;
                maxDigit=0;
            }
            if(swap==1 && freq[0]==0){
                if(debug)cout<<"Swapping back "<<swapDigit<<endl;
                maxDigit=swapDigit;
                swap=2;
            }
        }
    }
public:
    // Reads n digits from stdin (scanf) into the frequency table.
    Set(int n,bool za):
        freq(vector<int>(10,0)),
        maxDigit(9),
        swapDigit(-1),       // FIX: was left uninitialized
        swap(0),
        zero_allowed(za){    // FIX: list reordered to match declaration order
        int num;
        while(n--){
            scanf("%d",&num);
            ++freq[num];
        }
        while(maxDigit>=0 && freq[maxDigit]==0)--maxDigit;
    }
    // Remove and return the largest remaining digit (or -1 when empty),
    // honoring the leading-zero correction.
    int getMax(){
        while(maxDigit>=0 && freq[maxDigit]==0)--maxDigit;
        manage_zero();
        if(maxDigit==-1){
            if(debug)cout<<"Returning due to empty set"<<endl;
            return -1;
        }
        --freq[maxDigit];if(debug)cout<<"freq["<<maxDigit<<"] being reduced to "<<freq[maxDigit]<<endl;
        return maxDigit;
    }
    // True when no digits remain (also advances maxDigit / the swap state).
    bool isEmpty(){
        while(maxDigit>=0 && freq[maxDigit]==0)--maxDigit;
        manage_zero();
        if(maxDigit==-1){
            if(debug)cout<<"Returning due to empty set"<<endl;
            return true;
        }
        return false;
    }
};
// Driver: for each test case read n, za and then n digits; greedily build
// two numbers digit-by-digit (taking the two largest remaining digits for
// each place value) and print the maximum possible sum.
int main(int argc , char **argv)
{
    if(argc>1 && strcmp(argv[1],"DEBUG")==0) debug=true;
    int t,n,za;scanf("%d",&t);
    long long int sum,place_value;
    while(t--){
        scanf("%d %d",&n,&za);
        Set s(n,za==1);      // za == 1 means leading zeros are NOT allowed
        place_value=1;
        sum=0;
        // Assign the two largest remaining digits to the current place value.
        while(!s.isEmpty()){
            sum+=s.getMax()*place_value;if(debug)cout<<sum<<endl;
            if(s.isEmpty())break;
            sum+=s.getMax()*place_value;if(debug)cout<<sum<<endl;
            place_value*=10;
        }
        printf("%lld\n",sum);
    }
    return 0;
}
I think disturbing the struct Node least would be better. Also, I would avoid making helper functions until extremely necessary. So, first make a hash map(unordered_set in this case) that maps id to index in the list (vector in this case) and identify the root while doing this (O(n))
Then create a data structure to store set of children of each parent (I have used vector<stack<int> > child for this purpose) . Again O(n).
Now iteratively run DFS on this . Recursion can be used here, but I avoid making unnecessary helper functions. Again O(n)
Total O(n)
#include<iostream>
#include<cstring>
#include<cstdio>
#include<vector>
#include<stack>
#include<unordered_map>
using namespace std;
bool debug=false;
// One tree node as given in the input.
struct Node{
    int id;       // unique node identifier (referenced by children's `parent`)
    int parent;   // id of the parent node; 0 marks the root
    int weight;   // weight of this node alone
};
// For every node (in input order) print the total weight of the subtree
// rooted at it.  Iterative DFS with a parallel stack of running subtree
// sums; O(n) time and space.
void printSubTreeWeight(vector<Node> nodes){
    int size=nodes.size(),root,index,temp;
    vector<int>val(size,0);               // val[i] = subtree weight of node i
    vector<stack<int> > child(size);      // children (by index) of each node
    unordered_map<int,int> indexOf;       // node id -> index in `nodes`
    vector<bool> lookup(size,false);      // visited flags
    for(int i=0;i<size;++i){
        indexOf[nodes[i].id]=i;
        // NOTE(review): `root` stays uninitialized if no node has parent==0.
        if(nodes[i].parent==0)root=i;
    }
    for(int i=0;i<size;++i)if(nodes[i].parent!=0){
        child[indexOf[nodes[i].parent]].push(i);
    }
    // s: DFS stack of node indices; sum: parallel stack holding the
    // accumulated subtree weight for each node currently on s.
    stack<int> s,sum;
    s.push(root);sum.push(nodes[root].weight);
    lookup[root]=true;
    while(!s.empty()){
        index=s.top();
        if(!child[index].empty()){
            // Descend into the next unvisited child, if any.
            int c_index=child[index].top();
            if(!lookup[c_index]){
                s.push(c_index);
                sum.push(nodes[c_index].weight);
                lookup[c_index]=true;
            }
            child[index].pop();
        }else{
            // All children processed: record this node's subtree weight
            // and fold it into the parent's running total.
            val[index]=sum.top();
            temp=sum.top();sum.pop();
            if(!sum.empty()){
                temp+=sum.top();sum.pop();
                sum.push(temp);
            }
            s.pop();
        }
    }
    for(int i=0;i<size;++i)cout<<val[i]<<" ";
    cout<<endl;
}
// Driver: reads t test cases; each is n followed by n (id, parent, weight)
// triples, then prints every node's subtree weight.
int main(int argc , char **argv)
{
    if(argc>1 && strcmp(argv[1],"DEBUG")==0) debug=true;
    vector<Node> v;
    int t,n;cin>>t;
    while(t--){
        cin>>n;
        v=vector<Node>(n);
        for(int i=0;i<n;++i)cin>>v[i].id>>v[i].parent>>v[i].weight;
        printSubTreeWeight(v);
    }
    return 0;
}
I have tested it on the following test cases :
Input :
2
5 10 30 1 30 0 10 20 30 2 50 40 3 40 30 4
7 100 0 2 21 100 5 26 100 7 29 100 8 3 100 12 1 26 5 4 26 6
Output:
1 20 2 3 7
45 5 18 8 12 5 6
Use stack to simulate inorder traversal of a tree. The rest is simply the linear time 2 sum algorithm. This however, means that you will have to use some extra space to store stack.
Another idea is to first in-place convert BST to Doubly Linked List (DLL), then find pair in sorted DLL in O(n) time. This solution takes O(n) time and O(Logn) extra space, but it modifies the given BST. Refernce : [www] .geeksforgeeks.org/find-if-there-is-a-triplet-in-bst-that-adds-to-0/
- anantpushkar009 November 14, 2014