## Google Interview Question for Software Engineer / Developers


Country: India
Interview Type: In-Person

16 of 20 votes

You can solve this in O(n) using a combination of a trie and a linked list. The leaf node of the trie maintains a flag recording whether the URL is duplicated, plus a pointer to a node in the linked list. If you encounter a new URL, add a node to the head of the linked list and set the pointer in the trie. Whenever you encounter a URL that is already in the trie: if the flag is not set, set it, delete the node from the linked list, and set the pointer to null; if the flag is already set, ignore the URL and read the next one.
After processing all the URLs, the linked list contains only the unique URLs, and the node at the tail is the first unique URL in the list. For n URLs, inserting into the trie is O(n) overall, and the linked-list operations are all constant time. Each list node can just keep the index of the URL in the input, so we don't have to store the URLs in the linked list as well.
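A compact sketch of this scheme (Python for brevity; a dict-of-dicts stands in for the trie, a `'$'` key marks a terminal node, and all names are mine, not the poster's):

```python
class ListNode:
    """Doubly linked list node holding the index of a URL in the input."""
    def __init__(self, index):
        self.index = index
        self.prev = self.next = None

def first_unique_trie(urls):
    trie = {}           # dict-of-dicts trie; '$' marks "a url ends here"
    head = tail = None  # head = most recently added unique, tail = first unique

    for i, url in enumerate(urls):
        node = trie
        for ch in url:
            node = node.setdefault(ch, {})
        term = node.setdefault('$', {'dup': False, 'ref': None})

        if not term['dup'] and term['ref'] is None:
            # first sighting: prepend a node to the list
            ln = ListNode(i)
            ln.next = head
            if head is not None:
                head.prev = ln
            head = ln
            if tail is None:
                tail = ln
            term['ref'] = ln
        elif not term['dup']:
            # second sighting: unlink its list node in O(1) and set the flag
            ln = term['ref']
            if ln.prev is not None:
                ln.prev.next = ln.next
            else:
                head = ln.next
            if ln.next is not None:
                ln.next.prev = ln.prev
            else:
                tail = ln.prev
            term['dup'] = True
            term['ref'] = None
        # third and later sightings: flag already set, nothing to do

    return urls[tail.index] if tail is not None else None
```

Each URL costs O(length) for the trie walk and O(1) for the list bookkeeping, matching the claimed bounds.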

-1 of 1 vote

I thought tries have O(n log n) insertion time for n elements.
Are you sure it's O(n)? It's a tree structure...

1 of 1 vote

The trie insertion time is O(k), where k is the length of the string being inserted, so the total time is O(nk). Since the length of the URLs doesn't depend on n and can be expected to be relatively constant, this is O(n).

1 of 1 vote

Why not just use a hashtable if your storage is going to be O(n) anyway though?

0

Storage is going to be O(n) either way, but a hash table is not as space efficient: it has to store the entire string to resolve collisions, whereas in a trie all shared prefixes are collapsed together, no?

0

Incorrect: the solution will not be O(n), since you would be deleting nodes from the linked list, and walking to the node to be deleted takes extra time.

1 of 1 vote

To answer the question of the above user: there is no extra time needed to "walk to the node to be deleted". The following code is my implementation of the same idea as the original poster's. The linked list is combined with the trie: once you have finished walking to the end of a URL in the trie, you already hold the linked-list node to be deleted.

``````/* The fields 'parent' and 'child' are for the trie.
 * The fields 'prev' and 'next' are for the linked list.
 * I presume that the character set of a URL is {'a'-'z', '.', '/', ':'},
 * hence the 29 children per node.
 * In a real interview, discuss this assumption with your interviewer.
 */
class Node
{
    public Node parent = null;
    public Node[] child = new Node[29];
    public Node prev = null;
    public Node next = null;
    public char data = ' ';
    public int count = 0;
}

class G105
{
    public static String firstUnique(String[] array)
    {
        Node root = new Node();
        Node first = null;
        Node last = null;
        for (String s : array)
        {
            Node current = root;
            /* Walk (and extend) the trie */
            for (char c : s.toCharArray())
            {
                int index = 0;
                if (':' == c) { index = 28; }
                else if ('/' == c) { index = 27; }
                else if ('.' == c) { index = 26; }
                else { index = c - 'a'; }

                if (null == current.child[index])
                {
                    Node next = new Node();
                    next.parent = current;
                    next.data = c;
                    current.child[index] = next;
                }
                current = current.child[index];
            }

            /* Modify the linked list */
            current.count++;
            if (1 == current.count)
            {
                /* First sighting: append to the tail */
                if (null == first) { first = last = current; }
                else
                {
                    current.prev = last;
                    last.next = current;
                    last = current;
                }
            }
            else if (2 == current.count)
            {
                /* Second sighting: unlink in O(1). Keep 'last' in sync,
                 * otherwise a removed tail node corrupts later appends. */
                if (current == last) { last = current.prev; }
                if (current == first)
                {
                    first = current.next;
                    if (null != first) { first.prev = null; }
                }
                else
                {
                    Node prev = current.prev;
                    prev.next = current.next;
                    if (null != current.next)
                    {
                        current.next.prev = prev;
                    }
                }
                current.prev = current.next = null;
            }
            /* count > 2: node already unlinked, nothing to do */
        }

        if (null == first) { return null; } /* no unique url */

        /* Reconstruct the url by walking from the leaf up to the root */
        StringBuilder sb = new StringBuilder();
        Node current = first;
        while (current != root)
        {
            sb.append(current.data);
            current = current.parent;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args)
    {
        String[] urls = {
            "abc.amazon.com",
            "abc.yahoo.com",
            "abc.yahoo.com",
        };
        System.out.println(firstUnique(urls));
    }
}``````

0

This has to be a doubly linked list, for the removal of nodes to be O(1), right?

0

Even though the delete operation on a singly or doubly linked list is O(1) once we have access to the current and previous pointers, finding the element to delete is still O(n). Therefore I don't think the overall delete is O(1) when finding that element for removal is necessary.

9 of 9 votes

I believe you used a hash (or trie, or something similar) to store the counts of each URL.
To make it one traversal, you only need a doubly linked list that links all the URLs unique so far as you traverse (either link the items in the hash table or do it separately).
Remove a URL from the list if its count goes over 1,
and of course connect its predecessor and successor (which is actually implied when I say "remove").
After one traversal, the first one in your linked list is the desired URL.
Extra space for two pointers per URL is required for this solution.
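This bookkeeping can be sketched in Python, where an OrderedDict plays the role of the doubly linked list (it preserves first-insertion order, and deletion from it is O(1)); the helper name is mine:

```python
from collections import OrderedDict

def first_unique_one_pass(urls):
    counts = {}              # url -> number of sightings so far
    pending = OrderedDict()  # urls unique so far, in first-seen order

    for url in urls:
        counts[url] = counts.get(url, 0) + 1
        if counts[url] == 1:
            pending[url] = True   # first sighting: append
        elif counts[url] == 2:
            del pending[url]      # second sighting: O(1) unlink
        # three or more sightings: already removed, nothing to do

    # the front of the ordered structure is the first unique url
    return next(iter(pending), None)
```

OrderedDict is itself backed by a hash map plus a doubly linked list, which is exactly the structure described above.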

0

Suppose you are in the middle of traversing the input and have a half-prepared linked list, and now you get another URL. What do you do? You traverse the linked list until you find a match. But how are you going to do this traversal for each element within overall O(n) complexity?

0

Whether to remove or to insert a node in this list is decided by the count of the URL in the hash table.

0

If we have a hash table then why do we require a linked list. Can't we just traverse the array and keep on checking the count for each one??

0

You may consider an array a special type of hash table that (partly) integrates the doubly linked list. But in practice, I believe a real hash table is required, with each of its items pointing into a doubly linked list. You can tell the difference between these two models.

0 of 0 votes

What does unique mean? appearing exactly once?

Hash the url and use the hash to check for dupes, should be faster than plain old string matching.

0

With millions of URLs, I don't think a hash is a good idea.

0 of 0 votes

I think this might work in linear time.
Assign a prime number to each character (letters, digits and special characters), e.g. 'a' → 2, 'b' → 3, '1' → 17, '#' → 29. Compute the hash code of the URL from each character's prime number and its position in the URL.
After computing the hash code, insert it into a HashSet (in Java). Do this for every URL. In case the hash code of a URL matches the hash code of some previous URL inserted in the HashSet, it is the first duplicate.

0

If the URLs are stored in an XML file, a quick way would be to use the Muenchian grouping technique to retrieve the unique URLs.

0 of 0 votes

bedalrocks,

Using prime numbers is usually a very bad idea, practically speaking.

0 of 0 votes

Sort the URL list using quicksort in O(n log n). Scan through the array to find the unique URL in O(n) time. Total time: O(n log n).

PS: Here sorting and scanning involve string comparison. If each string has length k, we can treat k as a constant, with N = nk total characters.
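A sketch of this sort-and-scan approach (Python; the original index must be carried along, since sorting destroys the input order that "first" refers to — the function name is mine):

```python
def first_unique_by_sorting(urls):
    # pair each url with its original position, then sort by url: O(n log n)
    pairs = sorted((u, i) for i, u in enumerate(urls))
    best = None
    j = 0
    while j < len(pairs):
        k = j
        while k < len(pairs) and pairs[k][0] == pairs[j][0]:
            k += 1            # advance past the run of equal urls
        if k - j == 1:        # this url occurs exactly once
            idx = pairs[j][1]
            if best is None or idx < best:
                best = idx    # remember the earliest original position
        j = k
    return urls[best] if best is not None else None
```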

0

The list of url's is very big, and hence sorting does not seem to be a valid option here.

0 of 0 votes

What about an MD5 checksum?

0 of 0 votes

It seems like quite a few here have missed part of the question. The question is to find the first unique URL, not any unique URL or the first duplicate.

A solution would be:
* Go through the URLs, storing in a hash table, for key = URL, a boolean value that records whether the URL has been seen once or more than once. That is the first pass.
* Go through the URLs again, and for each URL check in the hash table how many times it has occurred. The first one that occurred only once is the URL you want.
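The two passes above can be sketched with Python's Counter (this stores full URLs rather than the hash-code-plus-boolean discussed below, trading space for simplicity):

```python
from collections import Counter

def first_unique_two_pass(urls):
    counts = Counter(urls)  # pass 1: count every url
    for url in urls:        # pass 2: return the first url whose count is 1
        if counts[url] == 1:
            return url
    return None
```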

Now, there are some space constraints that I haven't considered. Millions of URLs. Let's say 10 million URLs. Each URL has let's say 50 characters on average. That's 500 million characters. That takes up a lot of space even though it is doable. However if I am using a hash table, I am not storing the entire URL, but just the hash code and a boolean. That would be let's say 4 bytes for hash code and a bit for boolean. ~40 MB for hash table. That's not bad at all.

One way to optimize the second pass would be .. if I could create another array that contains the URLs in the order in which they appear in the original list, however without any duplicates. I can construct this during the first pass. So during the second pass, I could go through this array, that way reducing the number of URLs I am processing. However this needs additional memory.

Now if I could make this more challenging, what if the number of URLs is not millions but billions. In that case the hash table cannot be stored in memory. It would be in the order of 40GB. How would we solve the problem then?

One solution I can think of is to use distributed computing. Have multiple server machines store the hash table pieces in their local memory. Since network communication could be faster than disk reads, this could work well. Otherwise the hash table could be stored on disk. This would be slow because you don't know where the hash code of next URL will be in the hash table memory.

0

How will you check for duplicates if you don't store the entire URL? After all, if the hash is just 4 bytes, there is a non-zero chance of a collision.

Also, you have perhaps missed an important step: canonicalize the URLs. There can be multiple versions of the same URL which hash differently but are actually the same.

Also, your comment about hash tables and hard disk access is incorrect if you use chained linear hashing. In fact, the original intent behind inventing the hash table was fast disk access: an array gives quick access, followed by a linked list walk, which is fast if the size of the linked list (the number of collisions for that hash) is small.

And you could just store the least index of the URL (whether it was first, second, nth, etc.) in the hash table itself and walk the hash table ignoring dupes. For this, just a bit array would be sufficient as a supporting structure to figure out the first unique.

Also, if the URLs are valid, what about using IP addresses? With a (Patricia) trie of IP addresses plus a lookup structure (say a hash or tree) per IP address, you could perhaps cut down on space and make the lookups faster.

0

The linked list/trie combo approach above successfully captures the first unique URL.

0 of 0 votes

One way would be to maintain a Bloom filter for the URL's that you have seen before. They are great for negative membership queries.

0

The question is to find the first unique URL. If you take the hash of all the URLs and then traverse each hash bucket to check the frequency, you may get all the unique URLs, but certainly not the one that comes first.
One way to come up with the first unique URL is to go through the list in the order the URLs are presented to us, and then look at the set of unique URLs (found by Rajesh's method) to determine which is first.

0

I like the Bloom filter idea indeed. Since the list is very long but finite (i.e. not a stream), we can iterate it backwards, keeping track of the last URL that we don't find in the Bloom filter.

``````def first_unique_url(alist):
    latest_unique = None
    bloom = BloomFilter()  # assuming some Bloom filter class here
    for url in reversed(alist):
        if not bloom.contains(url):
            # note: the last occurrence of a duplicated url is also not yet
            # in the filter at this point, so a confirming pass (or a count)
            # is still needed before trusting latest_unique; Bloom filter
            # false positives can likewise skip a true unique
            latest_unique = url
        bloom.add(url)  # record every url we have passed
    return latest_unique``````

0 of 0 votes

You can have a trie-like structure with O(n) time complexity; each node can have a flag saying whether a word finishes there and a count of the number of occurrences.

0 of 0 votes

HashMap: md5(url) → reference to a node in a doubly linked list

while (lines) {
    if (md5(url) exists in the map)
        delete its node from the doubly linked list
    else
        append a new node to the tail (queue end) and store its reference in the map
}

Return the first node of the queue.

I think this can be done in O(n).

0 of 0 votes

A trie would be better, but instead of keeping a count of URLs we should keep the index of the string in the list. Whenever we encounter a URL again, set the index to -1. The trie will then hold, at each terminal node, either null (never set), -1 (set more than once), or the positive index of the URL in the list (set only once). Collecting all the non-null, non-negative indexes and taking the minimum gives the first unique URL.
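A sketch of this index-sentinel idea (Python; a plain dict stands in for the trie's terminal nodes, and -1 marks "seen more than once" — the names are mine):

```python
def first_unique_index_sentinel(urls):
    index_of = {}  # url -> index of its only sighting, or -1 if repeated
    for i, url in enumerate(urls):
        index_of[url] = -1 if url in index_of else i
    # the smallest surviving index belongs to the first unique url
    live = [i for i in index_of.values() if i >= 0]
    return urls[min(live)] if live else None
```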

0 of 2 votes

uniq filename.txt > f2.txt

using unix commands? not sure about the time complexity tho :|

0

Or we can use std::sort + std::unique for a C++ position, if Unix commands are allowed.

0

uniq requires that its input be sorted, and sort takes O(n log(n))

0 of 0 votes

This is well explained in CC book

0

what is the CC book?

0

Cracking the Coding Interview. You'll find it on the home page of this site.

0 of 0 votes

As fentoyal says, put them into a hash table. Add to each actual bucket entry an iterator into a doubly-linked list of those items that have been seen once. When you walk down the original list of urls, if you don't find it in the hash table, add it, and add its entry (hash table iterator, or raw url, either way) to the back of the linked list. If you do find the entry in the hash table, look at the iterator stored there. If it's null, you've seen it 2+ times already, do nothing. If the iterator there is non-null, use it to pull the item out of the linked list (O(1) time), and then set the iterator in the hash table to null.

When you've gone through the list of urls, pick the front-most (which is the first added, that was not subsequently removed) url from the linked list.

0

I don't think this will give O(n) time.

Consider the case when an element was previously considered unique, and as we iterate through the sequence we find that it has a double. We have to remove it from the hash, which is fine in O(1), but doesn't removing the item from the linked list take O(n)?

Doesn't this push the overall running time to O(n^2)?

0

No, no - you never remove the url from the hash table (otherwise if you see it again, you'd think it was the first time). There are 3 states - 1) it's a new url (add to hash table and to back of list, add that list iterator to the url's hashtable entry), 2) you've seen the url and the hash table still has a valid iterator to the list (you've seen it once - so we remove the url from the list, and remove the iterator from the url's entry in the hashtable) 3) you've seen the url, and its hash table entry no longer has the iterator -> you've seen the url 2+ times. It's not in the list, so it's in no danger of being thought unique - do nothing).

0

... to answer your objection more specifically, you're talking about case 2). We don't have to iterate through the doubly-linked list to find our candidate, because removing an element from a linked list is O(1), if you know exactly where it is. (I don't know how it is in Java, I'm sure it's the same, but in C++ you can have an iterator (acts like a pointer) to a list element. Being doubly-linked, removal is constant-time, and you don't mess up any other iterators.) And, again, you never take the url's entry out of the hash table.

0

I think the space complexity here is way too high:
1. Keeping a LONG list of URLs as a DLL.
2. Maintaining a HashMap.

A slightly better solution is to use 2 hashmaps:

1. unique_url
2. all_url

Traverse through the URLs:
    if the URL is present in all_url:
        remove the URL from unique_url
    else:
        add the URL to both all_url and unique_url

Items left in unique_url will be unique.

0

> Items left in unique_url will be unique

... but how will you know which was first? And, my doubly-linked list contains only unique items, it should take less space than another hash table.

0

In unique_url you can store the URL as key and the order of occurrence (1, 2, 3, 4...) as value, or use a Map which keeps track of insertion order, though I guess that uses a DLL internally.

I did not see that you only store unique items in the DLL.

Both solutions have the same complexity; I guess the choice would come down to which is easier to read and maintain.

0

if you change the unique map to a vector (C++) or arraylist (Java), you could have the position in the unique_url list as the value of the all_url map and just remove it from the list right away. This solves the knowing which was first problem since lists preserve order.

0 of 0 votes

I guess you can first create a hashmap keyed by URL whose value contains:
i) the number of occurrences of the URL, and
ii) the corresponding linked list node containing the URL.

Whenever a new URL is met, check whether it is contained in the hashmap:
i) If it is contained and the number of occurrences is 1, increase the number of occurrences by 1 and delete the corresponding linked list node.
ii) If it is not contained in the hashmap, create a new node, connect it to the tail of the linked list, and update the hashmap accordingly.
iii) If it is contained in the hashmap and the number of occurrences is more than 1, the corresponding node has already been deleted and we do not need to do anything.

0

This should work, but you need to use a doubly linked list, and you need to remember a pointer to the node in the hash map as well. Once a URL is found to be a duplicate, you increment the counter, delete the node, and null the pointer in the hash map. The rest should work as you mentioned.

0 of 0 votes

Use a mixed combination of trie and hashing on the reversed URL.
Basically, by reversing the URL and treating its domain components as compound keys, you reduce the number of collisions (host.dom1.dom2 → dom2 (trie) . dom1 (trie) . host (hash)).

0 of 0 votes

``````Dictionary<string, int> urlCount = new Dictionary<string, int>();
Queue<string> uniqueUrls = new Queue<string>();

foreach (string url in urls)
{
    if (!urlCount.ContainsKey(url))
    {
        urlCount[url] = 1;
        uniqueUrls.Enqueue(url);
    }
    else
    {
        // count > 1
        urlCount[url]++;
    }
}

// discard queued urls that turned out to be duplicates
while (uniqueUrls.Count > 0 && urlCount[uniqueUrls.Peek()] > 1)
{
    uniqueUrls.Dequeue();
}

return uniqueUrls.Count > 0 ? uniqueUrls.Dequeue() : null;``````

0 of 0 votes

``````package career.cup.google;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.NoSuchElementException;

/**
 * Given a very long list of URLs, find the first URL which is
 * unique (occurred exactly once). I gave an O(n) extra space and O(2n)
 * time solution, but he was expecting O(n) time, one traversal.
 */
public class G94FirstUniqueURL {

    /**
     * Traverses the list once. A hash map records, per URL, the indexes
     * at which it has been seen; a LinkedHashSet (fIndex) keeps, in
     * insertion order, the indexes still assumed unique. When a URL
     * repeats, its recorded indexes are removed from fIndex. At the end,
     * the first element of fIndex is the index of the first unique URL;
     * if fIndex is empty, a NoSuchElementException is thrown.
     */
    public static String findFirstUniqueURL(ArrayList<String> url) {
        HashMap<String, HashSet<Integer>> hm = new HashMap<String, HashSet<Integer>>();
        LinkedHashSet<Integer> fIndex = new LinkedHashSet<Integer>();
        int listIndex = -1;
        for (String u : url) {
            listIndex++;
            if (!hm.containsKey(u)) {
                HashSet<Integer> idx = new HashSet<Integer>();
                idx.add(listIndex);
                fIndex.add(listIndex);
                hm.put(u, idx);
            } else {
                deleteDuplicateIndexes(hm.get(u), fIndex);
            }
        }
        if (fIndex.isEmpty())
            throw new NoSuchElementException("No Unique URL");
        return url.get(fIndex.iterator().next());
    }

    /**
     * @param idx duplicate indexes
     * @param in  indexes still assumed unique
     */
    private static void deleteDuplicateIndexes(HashSet<Integer> idx,
                                               LinkedHashSet<Integer> in) {
        for (Integer i : idx) {
            in.remove(i);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> url = new ArrayList<String>();
        url.add("abc.amazon.com");
        url.add("abc.yahoo.com");
        url.add("abc.yahoo.com");
        System.out.println(findFirstUniqueURL(url));
    }
}``````

0 of 0 votes

Wouldn't the algo below work?

``````public static String findUnique(String[] arr) {
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    String candidate = null; // first url currently believed unique
    for (int i = 0; i < arr.length; ++i) {
        String url = arr[i];
        if (map.containsKey(url)) {
            int count = map.get(url);
            count++;
            map.put(url, count);
            if (url.equals(candidate)) {
                candidate = null;
            }
        } else {
            if (candidate == null) {
                candidate = url;
            }
            map.put(url, 1);
        }
    }

    return candidate;
}``````

0 of 0 votes

``````import java.util.ArrayList;
import java.util.HashMap;

public class UniqueUrl {

    HashMap<String, Boolean> urlKeyMap;
    ArrayList<String> urlList;

    public static void main(String[] args) {
        UniqueUrl uniqueUrl = new UniqueUrl();
        String[] stringArr = {"aab", "abc", "abc", "abd", "abd"};
        String nonRepeatedUrl = uniqueUrl.getUniqueUrl(stringArr);
        if (nonRepeatedUrl == null) {
            System.out.println("No such string");
        } else {
            System.out.println(nonRepeatedUrl);
        }
    }

    private String getUniqueUrl(String[] arr) {
        urlKeyMap = new HashMap<String, Boolean>();
        urlList = new ArrayList<String>();
        for (String str : arr) {
            if (urlKeyMap.containsKey(str)) {
                if (urlList.contains(str)) {
                    urlList.remove(str);
                }
            } else {
                urlKeyMap.put(str, true);
                urlList.add(str); // track first sightings in order
            }
        }
        if (urlList.size() == 0) {
            return null;
        }
        return urlList.get(0);
    }
}``````

0 of 0 votes

``````public static String findFirstUnique(String[] list) {
    Set<String> seen = new HashSet<String>();
    // LinkedHashSet preserves insertion order, so its first element
    // is the first unique url
    Set<String> notSeen = new LinkedHashSet<String>();

    for (String s : list) {

        if (!seen.contains(s)) {
            seen.add(s);
            notSeen.add(s);
        } else {
            notSeen.remove(s);
        }
    }

    return notSeen.isEmpty() ? null : notSeen.iterator().next();
}``````

0 of 0 votes

I think you are on to something. This can all be done with Linux/Unix commands.

If the master list of URLs is in a text file (master.txt), then

sort master.txt | uniq -c > occurrences.txt

will give you a file of the occurrence counts for each URL. Obviously, the URL(s) with an associated count of one are what you want.

If for some reason the URLs are scattered across different files, you can use grep to pipe them into one file with something like

grep -r "http:*" *.* > master.txt

I don't know the running time of Unix sort, but why would anyone write a C++ program when Unix/Linux can tell you with no effort?

0 of 0 votes

You can traverse the list from the end to the beginning; then the last unique one found this way is the first one, and the problem becomes easier.

0 of 0 votes

OK, so create two hash tables: one stores duplicate URLs, the other stores unique URLs.

1. For each URL, check the duplicate-URL table; if it is already there, ignore it and move to the next URL.
2. Otherwise insert it into the unique-URL table. If it already exists there, remove it from the unique-URL table and insert it into the duplicate-URL table, then move to the next URL.
3. After the pass, the unique-URL hash table contains all the URLs that are unique.

0 of 0 votes

I think all of you missed that the guy wanted a real string algo... pretty hard core:

``````#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;

class node {
public:
    map<char, node*> children;
    char c;
    bool is_end;  // a url terminates at this node
    int index;    // index of the url that first terminated here
    node() : c('*'), is_end(false), index(-1) {}
};

int find_first_unique(vector<string> &urls)
{
    if (urls.empty())
        return -1;

    node *root = new node;
    vector<bool> repeat(urls.size(), false);

    for (size_t i = 0; i < urls.size(); i++) {
        string &url = urls[i];
        node *n = root;
        for (size_t j = 0; j < url.length(); j++) {
            node *&child = n->children[url[j]];  // inserts NULL if absent
            if (child == NULL) {
                child = new node;
                child->c = url[j];
            }
            n = child;
        }
        // the end-of-word flag prevents a url that is merely a prefix of
        // another from being treated as a duplicate
        if (n->is_end && n->index != (int)i) {
            repeat[i] = true;
            repeat[n->index] = true;
        } else {
            n->is_end = true;
            n->index = i;
        }
    }

    for (size_t i = 0; i < urls.size(); i++) {
        if (!repeat[i])
            return i;
    }
    return -2;
}

int main(int argc, char **a)
{
    vector<string> v;
    v.push_back("AAAA");
    v.push_back("AAAA");
    v.push_back("AAAAC");
    v.push_back("BBBB");
    cout << find_first_unique(v) << endl;
    return 0;
}``````

0 of 0 votes

This does the job in a cleaner, shorter, faster way with little memory:

``````static String findUniqueUrl(List<String> urls) {

    /* A LinkedHashSet preserves insertion order, so you can eventually
     * tell the first unique url. It provides constant-time performance
     * for the basic operations (add, contains and remove), assuming the
     * hash function disperses elements properly among the buckets. */
    Set<String> unique = new LinkedHashSet<String>();
    /* A second set is needed so that a url seen three or more times is
     * not re-added after being removed. */
    Set<String> seen = new HashSet<String>();

    for (String url : urls) {
        if (seen.contains(url)) {
            /* removes a url if duplicated,
               only requires space O(unique urls) */
            unique.remove(url);
        } else {
            /* add the url the first time it is found */
            seen.add(url);
            unique.add(url);
        }
    }

    /* get the first found unique url */
    Iterator<String> it = unique.iterator();
    return it.hasNext() ? it.next() : null;
}``````

0 of 0 votes

Can try:
1) Adding all values to a hash table: O(n)
2) Performing a linear search over the hash table: O(n)

0

Are you serious? The interviewer explicitly mentioned that it's a long list. The hash table is rejected in the question itself, indirectly.

-1 of 1 vote

Use a min-heap with the URL as the heap's data and url_frequency as the heap comparator.
The top of the heap gives you a unique URL if its frequency is exactly 1.

O(n log n) time complexity, O(n) space complexity.

0

Sorry, this doesn't work in O(n log n). When we see a duplicated URL, finding it in the heap to increase its url_frequency count is O(n) (searching a heap is O(n)).

Hence the overall time complexity is O(n^2).

-1 of 1 vote

I think the following code will work.

``````package psan;

import java.util.*;

public class UniqueUrlSearch
{
    public static void main(String... s)
    {
        // sample input; originally read from the console after "Enter the list"
        String[] list = { "a.com", "b.com", "a.com" };
        ArrayList<String> ans = new ArrayList<String>(); // urls still assumed unique, in order
        // value format: "<index assigned at first sighting> <count>"
        HashMap<String, String> hs = new HashMap<String, String>();
        int itr = -1, v = 0; // v = number of entries removed from ans so far

        for (int i = 0; i < list.length; i++)
        {
            if (!hs.containsKey(list[i]))
            {
                ++itr;
                hs.put(list[i], itr + " " + 1);
                ans.add(list[i]);
            }
            else
            {
                String[] x = hs.get(list[i]).split(" ");
                int index = Integer.parseInt(x[0]);
                int count = Integer.parseInt(x[1]) + 1;
                hs.put(list[i], index + " " + count);
                // remove the entry only on the second sighting
                if (count == 2 && (index - v) >= 0 && (index - v) < ans.size())
                    ans.remove(index - v);
                ++v;
            }
        }
        System.out.println("ans=" + ans);
    }
}``````

-2 of 2 votes

I was asked this question by an interviewer. I felt that no algorithm could be faster than O(n^2), but then I saw the methods using a hash table. Though I get the idea of using a hash table, the question simply shifts from finding the first unique URL to finding an optimal hash function: there can still be collisions, which can push the search in the unique list beyond O(1).
