Cerberuz
BAN USER1) Use RSync protocol for file synchronization between clients and client-server.
2) Store file blocks of equal size instead of file. Have a file block server which keeps track of file blocks and SHA-256 hash of each block stored.
3) Use Amazon S3 to store file blocks.
4) Each client should be identified by it's unique root namespace.
5) Store client metadata in a DB (MySQL) Table. Each row in DB table will represent a particular file. Table attributes should be list of file blocks, root namespace of client who owns this file, a relative path of file in namespace to get the location of file in client's dropbox installation folder. Store other user metadata like settings, account configuration, access level etc. also in this DB in some other table.
6) Have a metedata server to fetch result from DB.
7) Have multiple instances of metadata and file block server to handle large number of requests. Amazon S3 will handle your file blocks.
8) Use memcache and load balancer with metadata server for efficiency.
To upload a file, client will split the file into blocks of equal size (4MB) and client will talk to metadata server to send the information (hash of file blocks) about the file to upload. If any file block is not already found in Amazon S3 then block server will tell metadata server about it. Metadata server will tell client to send those file blocks to block server. Client will send those file blocks to block server. Once block server stores file blocks in Amazon S3, metadata server adds the entry for that file in MySQL DB representing an update in client namespace.
To sync file to other clients of a user, metadata server will inform the clients about the update in client namespace. Clients will ask metadata server about newly added file. Metadata server will send the list of hashes of file blocks of the new file.Then the clients will talk to block server, give the list of file block hashes, retrieve the corresponding file blocks and combine them to generate the file.
AND THIS IS HOW EXACTLY DROPBOX WORKS.
(root,7,1) will return 5
(root,7,2) will return 4
Let the parameters are root_node, target_node, K (distance)
Traverse the tree - recursive calls (inorder traversal)
If a node is the target node then print all the child nodes at distance K from target node.
While returning from a recursive call, return the distance of node from target node. If target is not yet found, return -1.
Use the distance returned to print the nodes at (k-returned_distance-2) in the other subtree via another recursive call.
Note:
If returned value to current node is X (X != -1) then target node is at distance x+1 in that subtree.
In other subtree, required nodes will be at distance (K-X-1) from current node or we can say
required nodes will be at distance (K-X-2) from root of other subtree as root of other subtree is
child of current node.
/**
* returns distance of node from target if target has been traversed, else -1
*/
int printNodesAtDistanceKSolve(node *root, node *target, int k)
{
if (root == NULL) {
return -1;
}
if (root == target) {
// target node is found, print all nodes at depth k in subtree rooted at target
printNodesAtDepthKInTree(target, k);
return 0;
}
int left_dist = printNodesAtDistanceKSolve(root->left, target, k);
if (left_dist != -1) {
// target was found in left subtree
if (left_dist+1 == k) {
// current node is at distance k from target node
printf("%d\n", root->data);
} else if (left_dist < k) {
/*
* Nodes at depth (k-left_dist-2) in right subtree will be at distance k from target node,
* print them
*/
printNodesAtDepthKInTree(root->right, k-left_dist-2);
}
return left_dist + 1;
}
int right_dist = printNodesAtDistanceKSolve(root->right, target, k);
if (right_dist != -1) {
// target was found in right subtree
if (right_dist+1 == k) {
// current node is at distance k from target node
printf("%d\n", root->data);
} else if (right_dist < k) {
/*
* Nodes at depth (k-right_dist-2) in left subtree will be at distance k from target node,
* print them
*/
printNodesAtDepthKInTree(root->left, k-right_dist-2);
}
return right_dist + 1;
}
/*
* target node is not found in this subtree, it will be traversed again later in other call from it's
* ancestor node whose subtree has target node
*/
return -1;
}
This can be done using a top-down recursive approach.
Just iterate over the string.
Keep track of the length of expanded string.
If expanded string's length >= k then reduce the problem into a subproblem X with smaller k.
Solve subproblem X.
e.g.
s = ab2c3
k = 10
a -> total_len = 1
ab -> total_len = 2
ab2 -> total_len = 4
ab2c -> total_len = 5
ab2c3 -> total_len = 15
=> total_len > k, kth character will be in string "ababcababcababc"
=> "ababc" is repeating, so we can reduce the problem to s = "ababc" and k = k % 5 where 5
is length of s = "ababc".
If modulus operation returns 0 e.g. in this case 10 % 5 == 0, then we have to pass length of s i.e. 5 to subproblem. This is due to 0 based index of string.
Note:
In code we have to take care of using long long for total_len.
We don't need to create new substring in recursive call.
In each subproblem, k will reduce along with total_len according to modulo operation. So problem will get solved in few recursive calls.
char findKthChar( string &s, int k )
{
int sl = s.length();
int i = 0;
long long total_len = 0;
while (i < sl) {
if (isalpha(s[i])) {
// got alphabet
total_len++;
if (total_len == k) {
return s[i];
}
i++;
} else {
// got number (parse complete number)
int n = 0;
while (i < sl && !isalpha(s[i])) {
n = n*10 + (s[i]-'0');
i++;
}
long long next_total_len = total_len*n;
if (k <= next_total_len) {
int pos = k % total_len;
if (!pos) {
pos = total_len;
}
return findKthChar(s, pos);
} else {
total_len = next_total_len;
}
}
}
// invalid k
return -1;
}
Let A[0..n-1] and B[0..m-1] are two arrays.
Segments of A or B = set of subarrays between two intersection points including subarray before first intersection point and subarray after last intersection point.
So if there are K intersection points then we have (K+1) segments each of A and B.
Iterate over all segments, if segment of A has greater sum then choose it else choose segment of B.
We can iterate over segments in O(m+n) time.
sum_subarray_a = 0;
sum_subarray_b = 0;
max_sum = 0;
i = 0;
j = 0;
while ( i < n && j < m ) {
if (a[i] > b[j]) {
while( j < m && a[i] > b[j] ) {
sum_subarray_b += b[j];
j++;
}
} else if ( a[i] < b[j] ) {
while ( i < n && a[i] < b[j] ) {
sum_subarray_a += a[i];
i++;
}
} else {
max_sum += max(sum_subarray_a, sum_subarray_b) + a[i];
// Why adding a[i] in last statement ? Because we need to add intersection point also in max_sum.
sum_subarray_a = sum_subarray_b = 0;
i++;
j++;
}
}
max_sum += max( sum_subarray_a + (sum of a[i] to a[n-1]),
sum_subarray_b + (sum of b[j] to b[m-1]));
//max_sum is the answer
We cannot solve this by using four equations as 4 equations in 4 variables are solvable iff none of the equation is linear combination of others.
Alternative logical approach :
Send the horse up the hill ( now horses can be trained easily )
Go down and call/whistle the horse.
Note the time taken by horse to come down i.e HT.
Now send the horse up again.
When the horse is at the top, note the time of bottom clock i.e. T1. Now, start climbing the hill. When you reach at the top, ride on the horse down the hill.
Note the time at the bottom clock again i.e. T2
=> Time taken by you to climb up the hill and come down on horse = T2-T1
=> time taken by you to climb the hill = T2-T1-time taken to come down on horse = X = T2-T1-HT
Now start from bottom of hill. Note down the time of clock there i.e T. Climb up and set the clock at
the top to T+X.
"interviewer stressed on not processing all the words in the dictionary."
That's the key point in this problem. I read algo for this problem long back somewhere, it was like this :
Map each alphabet with a prime number and product of prime values of letters in a word will act as hash. Now search in the given dictionary this hash value. If dictionary has been implemented as trie then search in the trie and break from a particular word if its prime product exceeds the corresponding value for the given word.
You need to take care of oveflow by carefully chosing prime values for letters, so higher frequency letters should be given lower prime value.
Randomization can be done like this :
d = random number => tree depth
So we will traverse till depth d, at each node we will generate a random boolean value :
true : move left
false : move right
At last we'll end up on a random node. But the challenge here is uniform distribution of random node selection over all the nodes which looks quiet difficult to me.
c1 = first unique character
c2 = second unique character
sp1 = first position of c1
ep1 = last position of c1
sp2 = first position of c2
ep2 = last position of c2
Initialize c1=s[0]; sp1=ep1=0
c2=-1
For each s[i]:
if s[i] != c1 AND c2 is not initialized i.e. c2 == -1 :
//initialize c2,sp2,ep2
else if s[i] == c2:
//update end position of c2
else if s[i] == c1:
//we define second unique character as the last character found
//in the substring of 2 unique characters so far
//so we have to swap c1 and c2 and their start and end positions
//s[i] i.e. c1 becomes c2 and c2 becomes c1
//Do required variable swappings
else if s[i] != c1 and s[i] != c2 i.e. s[i] is a new character:
//c2 will become c1 while s[i] will become c2
//Now here is the corner case due to which we need both start and
//end positions of c1 and c2
//Our new substring containing 2 unique characters must start from
//the point after which there is no occurence of c1 till i i.e
//after the last position of c1
//So make required updates in sp1, sp2, ep1, ep2
In this algo, keep track of the start and end positions of current substringof 2 unique characters +
keep track of max length such substring found so far.
Complexity : O(n)
//Code
char * func( char *s, int len ) {
char c1, c2;
int sp1, sp2, i;
int ep1, ep2;
sp2 = ep2 = c2 = -1;
c1 = s[0];
sp1 = ep1 = 0;
int ans_sp, ans_ep; //starting and ending point of overall answer
int cur_ans_sp, cur_ans_ep;//starting and ending point of current required substring at any instant
ans_sp = ans_ep = 0;
cur_ans_sp = cur_ans_ep = 0;
for( i = 1; i < len; i++ ) {
if( s[i] != c1 && c2 == -1 ) {
c2 = s[i];
sp2 = ep2 = i;
cur_ans_ep++;
} else if( s[i] == c1 ) {
c1 = c2;
sp1 = sp2;
ep1 = ep2;
c2 = s[i];
sp2 = ep1+1;
ep2 = i;
cur_ans_ep++;
} else if( s[i] == c2 ) {
ep2 = i;
cur_ans_ep++;
} else if( s[i] != c1 && s[i] != c2 ) {
c1 = c2;
sp1 = ep1+1;
ep1 = ep2;
c2 = s[i];
sp2 = ep2 = i;
cur_ans_sp = sp1;
cur_ans_ep = ep2;
}
if( cur_ans_ep-cur_ans_sp+1 > ans_ep-ans_sp+1 ) {
ans_sp = cur_ans_sp;
ans_ep = cur_ans_ep;
}
}
for( i = 0; i < ans_ep-ans_sp+1; i++ ) {
s[i] = s[ans_sp+i];
}
s[i] = '\0';
return s;
}
- Cerberuz April 14, 2013This explains it better, can be used for swapping block of bits :
graphics.stanford.edu/~seander/bithacks.html#SwappingBitsXOR
Swapping just two bits would be too simple as this can be done simply by checking if they are same or not. If not same then set 1 to 0 and 0 to 1 simply by xoring with ( 1<<position ).
Suppose positions are p and q and integer is n:
! (((n & (1 << p)) >> p) ^ ((n & (1 << q)) >> q)):
bits are same, no need to swap
else
n ^= ( 1<<p )
n ^= ( 1<<q )
Java has bitwise operators.
As already stated set_hash() function just set the flag ( which is a bit ) to true if a character has been seen in the string.
Suppose *ch = 'a', ascii value is 97 so base = 64 and hash = hash2
=> *ch-base = 'a'-64 = 97-64 = 43
=> we need to check/set the 43rd bit from right in hash2 variable to true
=> 1ULL << (*ch-base) = 1ULL << 43 that is multiple 1 by 2^43, as this will overflow normal 32 bit integer so we take 1ULL instead of simple 1.
=> hash2 & ( 1ULL << 43 ) will take bitwise AND of the two values and will be true only if 43rd bit from right in hash2 is 1 as 43rd bit from right in ( 1ULL<<43 ) is already 1.
=> If the bit is not 1, set it to be 1 using hash2 = hash2 | ( 1ULL << 43 )
Assuming characters are in ascii range : [0,127], we can use a 64bit unsigned integer with each of its bit representing the flag for ascii character. So instead of using a boolean array of size 128 we can use two 64 bit unsigned integers.
For each character in string, if its corresponding hash-bit is not set -> character has not been seen yet and is unique till that point, so set that hash-bit( mark the character as seen ).
For each character, if its corresponding hash-bit is set -> its a duplicate, set the character as null.
Finally return the string considering only non-null characters.
unsigned long long set_hash( char *ch, int base, unsigned long long hash ) {
if( ( hash & ( 1ULL << (*ch-base) ) ) == 0 ) {
hash |= ( 1ULL << (*ch-base) );
} else {
*ch = 0;
}
return hash;
}
char * func( char *s, int len ) {
unsigned long long hash1 = 0;
unsigned long long hash2 = 0;
int i;
for( i = 0; i < len; i++ ) {
if( s[i] >= 0 && s[i] <= 63 ) {
hash1 = set_hash( &s[i], 0, hash1 );
} else if( s[i] >= 64 && s[i] <= 127 ) {
hash2 = set_hash( &s[i], 64, hash2 );
}
}
int new_len = 0;
for( i = 0; i < len; i++ ) {
if( s[i] != 0 ) {
s[new_len++] = s[i];
}
}
s[new_len] = '\0';
return s;
}
Maintain a map user_map[key]
key = userid
user_map[key] = page sequence queue of max size 3
Additionally maintain a map visit_count_map[key]
key = page sequence queue of size 3
visit_count_map[key] = visit count for page sequence queue denoted by key
For each line in log file:
user = userid
page = pageid
if user_map contains user:
if user_map[user].size() < 3:
user_map[user].push(page)
else:
user_map(user).pop()
user_map(user).push(page)
if( user_map[user].size() == 3:
visit_count_map[ user_map[user] ]++;
else
user_map[user].push(page)
print the key with maximum value in visit_count_map[]
Hash table can be used instead of map.
Time Complexity : O[ (no of lines in log file)*log(n)*log(3*k^3) + k^3 ] for map
O[ (no of lines in log file)*log(3*k^3) + 3*k^3 ] for perfect hash table
Space Complexity : O(3*k^3) + O(3*n)
If next day price > today's price and no stock in possession => buy
else if next day price < today's price and stock in possession => sell
void maxProfit(int prices[], int len) {
int i;
int buy = -1;
for( i = 0; i < len-1; i++ ) {
if( buy == -1 && prices[i] < prices[i+1] ) {
buy = i;
printf( "Buy on day %d\n", i );
} else if( buy != -1 && prices[i] > prices[i+1] ) {
profit += prices[i]-prices[buy];
buy = -1;
printf("Sell on day %d\n", i );
}
}
if( buy != -1 ) {
profit += prices[len-1]-prices[buy];
printf("Sell on day %d\n", i );
}
// profit = max profit acquired
}
Alternatively, it can solved by finding all disjoint longest increasing subarrays. This can be done in O(n). Buy on first element of subarray and sell on last element of subarray for each of these subarrays.
e.g. 2 1 3 6 8 2 4 1 9
Longest increasing disjoint subarrays are { 1, 3, 6, 8 }, { 2,4 }, { 1,9 }
Buy at 1 - sell at 8
Buy at 2 - sell at 4
Buy at 1 - sell at 9
Condition for digit combination to appear same when inverted :
a) for every 6 at distance x from left there must be a 9 at distance x from right and vice versa. [ x < 4 ]
b) rest of the digits should be 0,1 or 8.
Now solve it case by case i.e using 0 time 6 and 9, 1 time 6 and 9, two times 6 and 9 ....
//Following is INCORRECT Approach
Number must be formed by 0,1,8 to look same on both sides.
=> Total number of 8 digit combinations possible using 0,1,8 = 3^8
=> Total number of 8 digit combinations = 10^8
Answer = (3^8)/ (10^8)
For player A : he can pick up coin in first place or last place, for each of these case player B can further pick from first or last place from remaining coins. But for A to win/maximize coins, coins collected by B should be minimum while that of A should be maximized.
If A selects coin[i], then B can choose coin[i+1] or coin[array_size]
If A selects coin[array_last], then B can choose coin[i] or coin[array_last-1]
We will simulate such that every function call will start from A's turn. This will give us the given recursive function.
This is a recursive version, alternatively it can be done using DP.
function max_coin( int *coin, int start, int end ):
if start > end:
return 0
int a = coin[start] + min( max_coin( coin, start+2,end ), max_coin( coin, start+1,end-1 ) )
int b = coin[end] + min( max_coin( coin, start+1,end-1 ), max_coin( coin, start,end-2 ) )
return max(a,b)
@vik, its fine as i am doing dp[0] = 1 ( true ), though there is a bug in the code pointed out by CodeBit.
@CodeBit, I usually give a pseudo code based on my ideas and improve it on the basis of comments and corner test cases suggested by others. Just think of an interview scenario were you have to give the solution/code within 10-20 minutes. You will usually put your idea/code before the interviewer with basic testing and he will ask you to improve it. Giving a perfect solution in single shot means either you are genius ( which i am not ) or you already knew the exact solution beforehand ( sometimes it occurs in my case, sometimes it doesn't ).
I tried to update my current solution but getting some server side exception, so will do it later.
Calculate hash of each of the string using a hash functions that does not take the position of character into account. Answer is number of distinct hash values obtained.
- Cerberuz June 30, 2015Assuming the string has only small case alphabets you can use calculate charCount[26] for each of the string i.e. count of each alphabet considering 'a' is mapped to position 0 in charCount. Answer will be number of distinct charCount arrays generated. Complexity will be O(n*m*26) where n is number of strings, m is string length.