Facebook Interview Question
Software Engineer / DevelopersCountry: United States
Interview Type: In-Person
When we create hashtable of info -> Node. the info code be same phone number belonging to two diff entries or email ids belonging to two different entries.
The previous hashtable entry will be overritten. In that case how will DFS reach that node if we are using Hashtable to store the graph.?
Create a graph of contact Nodes.
Connect two node if they share a common email, phone or name.
The contacts to be merged will form a connected component.
Number of connected components is total unique contacts
Just to make sure, the complexity time is O(n), right?
Iterate over the list and create node for each entry + add each contact info to hash - O(n).
Run BFS and perform merge - O(n) - we visit each node maximum two times.
Don't we require O(n^2) just to create the graph? Is there any better solution with union-find?
public static void main(String[] args) {
// TODO Auto-generated method stub
HashMap<String, Integer> h = new HashMap();
String[][] A = { { "John", "john@gmail.com", "john@fb.com"},
{ "Dan", "dan@gmail.com", "+1234567"},
{"john123", "+5412312", "john123@skype.com"},
{ "john1985", "+5412312", "john@fb.com"} };
int code = 0;
for(int i = 0; i < A.length; i++) {
int c = -1;
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
if(h.containsKey(s)) {
c = h.get(s);
break;
}
}
if(c == -1) {
c = code++;
}
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
h.put(s, c);
}
}
for(int i = 0; i < A.length; i++) {
System.out.println(h.get(A[i][0]));
}
}
There is a small issue in the code i.e it connects elements which have first common element. In above case it will connect john123 and john1985 but it will not connect John to it.
Rewriting code to merge all connected contacts:
public static void main(String[] args) {
// TODO Auto-generated method stub
HashMap<String, Integer> h = new HashMap();
String[][] A = { { "John", "john@gmail.com", "john@fb.com"},
{ "Dan", "dan@gmail.com", "+1234567"},
{"john123", "+5412312", "john123@skype.com"},
{ "john1985", "+5412312", "john@fb.com"} };
int code = 0;
for(int i = 0; i < A.length; i++) {
int c = -1;
Set<Integer> groupings = new HashSet<Integer>();
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
if(h.containsKey(s)) {
int value=h.get(s);
grouping.add(value);
c = Math.min(c,value);
}
}
if(c == -1) {
c = code++;
}
else
{
groupings.remove(c);
for(Integer k : groupings )
{
for(int m=0;m<A[0].length;m++)
{
h.put(A[k][m],c);
}
}
}
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
h.put(s, c);
}
}
for(int i = 0; i < A.length; i++) {
System.out.println(h.get(A[i][0]));
}
}
Above code has bug that it merge contact based on first common findings, while not considering other matchings as well.
Corrected code:
public static void main(String[] args) {
// TODO Auto-generated method stub
HashMap<String, Integer> h = new HashMap();
String[][] A = { { "John", "john@gmail.com", "john@fb.com"},
{ "Dan", "dan@gmail.com", "+1234567"},
{"john123", "+5412312", "john123@skype.com"},
{ "john1985", "+5412312", "john@fb.com"} };
int code = 0;
for(int i = 0; i < A.length; i++) {
int c = -1;
Set<Integer> groupings = new HashSet<Integer>();
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
if(h.containsKey(s)) {
int value=h.get(s);
grouping.add(value);
c = Math.min(c,value);
}
}
if(c == -1) {
c = code++;
}
else
{
groupings.remove(c);
for(Integer k : groupings )
{
for(int m=0;m<A[0].length;m++)
{
h.put(A[k][m],c);
}
}
}
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
h.put(s, c);
}
}
for(int i = 0; i < A.length; i++) {
System.out.println(h.get(A[i][0]));
}
}
SK has given an overview about the solution, here's the details and code
1.) Convert contact array to list of Contact Object
2.) Create a map with key all the contact details(email, phone) and value be list of Contact(Person1, person2).
3.) Go through the map and for each contact in the list get the contact details(email, phone) and go over the map recursively and find out what other contact are mapped, while doing so mark contact as visited true.
4.) Add the person to a set.
5.) Add set to result list
import java.util.*;
class UnionPerson2 {
static class Contact {
List<String> contactList;
String name;
boolean visited;
Contact(List<String> contactList, String name, boolean visited) {
this.contactList = contactList;
this.name = name;
this.visited = visited;
}
}
public static void main(String a[]) {
String [][]contacts = {{"John", "john@gmail.com", "john@fb.com"},
{"Dan", "dan@gmail.com", "+1234567"},
{"john123", "5412312", "john123@skype.com"},
{"john1985", "5412312", "john@fb.com"},
{"john19856", "john123@skype.com", "john@fb1.com"},
{"Dan2", "dan123@gmail.com", "+1234567"},{"Dan3", "dan@gmail.com", "+123456712312"},
{"Sandy", "sandy@gmail.com", "+123456712"},{"sandy4", "sandy@fb.com", "sandy@gmail.com"}};
List<Contact> conList = new ArrayList<Contact>();
for(int i = 0 ; i < contacts.length; i++) {
List<String> contactList = new ArrayList<String>();
for(int j = 1 ; j < contacts[i].length; j++) {
contactList.add(contacts[i][j]);
}
Contact con = new Contact(contactList, contacts[i][0], false);
conList.add(con);
}
Map<String, List<Contact>> map = new HashMap<String, List<Contact>>();
for(Contact c: conList) {
List<Contact> indexList = new ArrayList<Contact>();
for(String detail: c.contactList) {
if(map.containsKey(detail)) {
indexList = map.get(detail);
indexList.add(c);
map.put(detail, indexList);
} else {
indexList = new ArrayList<Contact>();
indexList.add(c);
map.put(detail, indexList);
}
}
}
List<Set<Contact>> resultList = new ArrayList<Set<Contact>>();
for(List<Contact> ls: map.values()) {
Set<Contact> resultSubList = new HashSet<Contact>();
for(Contact c: ls) {
if(!c.visited) {
resultSubList = findUnion(map, c, resultSubList);
}
}
resultList.add(resultSubList);
}
for(Set<Contact> subList: resultList) {
if(subList.size() > 0) {
for(Contact co: subList) {
System.out.print(co.name + ", ");
}
System.out.println();
}
}
}
static Set<Contact> findUnion(Map<String, List<Contact>> map, Contact cont, Set<Contact> resultSubList) {
if(!cont.visited){
cont.visited = true;
resultSubList.add(cont);
for(String contactListStr: cont.contactList) {
for(Contact c: map.get(contactListStr)){
if(!c.visited) {
resultSubList.add(c);
findUnion(map, c, resultSubList);
}
}
}
}
return resultSubList;
}
}
#define MAX_STRING_LNG 100
#include<stdio.h>
#include<conio.h>
#include<string.h>
struct contactInfo{
char name[MAX_STRING_LNG];
char contactAdd1[MAX_STRING_LNG];
char contactAdd2[MAX_STRING_LNG];
};
int main(){
int size=0;
scanf("%d",&size);
int samePerson[size];
int samePersonCounter = 0;
int anotherPerson[size];
int anotherPersonCounter = 0;
struct contactInfo contactDetails[size];
for(int i=0; i<size; i++){
scanf("%s",contactDetails[i].name);
scanf("%s",contactDetails[i].contactAdd1);
scanf("%s",contactDetails[i].contactAdd2);
}
for(int i=0; i<size; i++){
for(int j=1; j<size; j++){
int firstContactResult = strcmp(contactDetails[i].contactAdd1,contactDetails[j].contactAdd1);
int secondContactResult = strcmp(contactDetails[i].contactAdd2,contactDetails[j].contactAdd2);
if(firstContactResult == 0 || secondContactResult == 0){
anotherPerson[anotherPersonCounter] = i;
anotherPersonCounter++;
break;
}
else{
samePerson[samePersonCounter] = i;
samePersonCounter++;
break;
}
}
}
printf("\n\n");
for(int i=0; i<samePersonCounter; i++){
printf("%d",samePerson[i]);
}
printf("\n\n");
for(int i=0; i<anotherPersonCounter; i++){
printf("%d",anotherPerson[i]);
}
}
Above code has bug that it merge contact based on first common findings, while not considering other matchings as well.
Corrected code:
public static void main(String[] args) {
// TODO Auto-generated method stub
HashMap<String, Integer> h = new HashMap();
String[][] A = { { "John", "john@gmail.com", "john@fb.com"},
{ "Dan", "dan@gmail.com", "+1234567"},
{"john123", "+5412312", "john123@skype.com"},
{ "john1985", "+5412312", "john@fb.com"} };
int code = 0;
for(int i = 0; i < A.length; i++) {
int c = -1;
Set<Integer> groupings = new HashSet<Integer>();
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
if(h.containsKey(s)) {
int value=h.get(s);
grouping.add(value);
c = Math.min(c,value);
}
}
if(c == -1) {
c = code++;
}
else
{
groupings.remove(c);
for(Integer k : groupings )
{
for(int m=0;m<A[0].length;m++)
{
h.put(A[k][m],c);
}
}
}
for(int j = 0; j < A[i].length; j++) {
String s = A[i][j];
h.put(s, c);
}
}
for(int i = 0; i < A.length; i++) {
System.out.println(h.get(A[i][0]));
}
}
you can looks like this
for (int i = 0; i < size;++i)
{
for (int j = i + 1; j < size; ++j)
{
//index the similar contacts in sameperson array
}
}
First of all, thank you for the warm and fresh new question!!.
Here is C++ version of the solution.
This algorithm first construct the graph with each row as an edge and phone/email as a vertex.
I used DFS to find the connected graph.
Running time of this algorithm is O(N), where N is the number of distinct phone/email string.
Using the hashtable, constructing a graph takes O(N) although I did not implement hashtable in this solution. std::map will provide log(n) search time.
So, it will be O(N Log N ).
#include <map>
#include <list>
#include <string>
#include <iostream>
using namespace std;
struct edge;
struct vertext;
struct vertex {
string value;
bool visited;
vertex * parent;
list<edge *> edges;
map<string, edge*> merged;
vertex(string v): value(v), parent(0), visited(false){}
};
struct edge{
vertex *v;
vertex *w;
string* original;
string value;
edge(string v, string* o):value(v), original(o){}
};
void dfs(vertex *v, vertex *parent)
{
list<edge *>::iterator iter;
list<edge *>::iterator subiter;
for (iter = v->edges.begin(); iter != v->edges.end(); iter++)
{
edge *e = (*iter);
vertex* w = (e->v->value != v->value)? e->v : e->w;
if (w->visited != true)
{
w->visited = true;
w->parent = parent;
for (subiter = w->edges.begin(); subiter != w->edges.end(); subiter++)
{
if (parent->merged.find((*subiter)->value) == parent->merged.end())
{
parent->merged[(*subiter)->value] = (*subiter);
}
}
dfs(w, parent);
}
}
}
list<list<string>> merge_contact(string ** input, int len)
{
list<list<string>>result;
map<string, vertex *> vertices;
map<string, vertex *>::iterator iter;
//create the graph
for (int i = 0; i <len; i++)
{
edge * e = new edge(input[i][0], input[i]);
if ((iter = vertices.find(input[i][1])) != vertices.end())
{
e->v = iter->second;
e->v->edges.push_back(e);
} else {
vertex * v = new vertex(input[i][1]);
vertices[v->value] = v;
v->edges.push_back(e);
e->v = v;
}
if ((iter = vertices.find(input[i][2])) != vertices.end())
{
e->w = iter->second;
e->w->edges.push_back(e);
} else {
vertex * w = new vertex(input[i][2]);
vertices[w->value] = w;
w->edges.push_back(e);
e->w = w;
}
}
for(iter = vertices.begin(); iter != vertices.end(); iter++)
{
if (iter->second->visited != true)
{
iter->second->visited = true;
dfs(iter->second, iter->second);
}
}
for (iter = vertices.begin(); iter!= vertices.end(); iter++)
{
vertex *v = iter->second;
if (!v->parent)
{
list<string> m;
map<string, edge*>::iterator i;
for(i = v->merged.begin(); i != v->merged.end(); i++)
{
edge * e = i->second;
m.push_back(e->original[0]);
m.push_back(e->original[1]);
m.push_back(e->original[2]);
}
result.push_back(m);
}
}
return result;
}
int main()
{
string ** input = new string*[4];
string s1[3] = {"John", "john@gmail.com", "john@fb.com"};
string s2[3] = {"Dan", "dan@gmail.com", "+1234567"};
string s3[3] = {"john123", "+5412312", "john123@skype.com"};
string s4[3] = {"john1985", "+5412312", "john@fb.com"};
input[0] = s1;
input[1] = s2;
input[2] = s3;
input[3] = s4;
list<list<string>>result = merge_contact(input, 4);
list<list<string>>::iterator i;
list<string>::iterator i2;
for (i = result.begin(); i != result.end(); i++)
{
cout << "[ ";
for (i2 = (*i).begin(); i2 != (*i).end(); i2++)
{
cout<<(*i2) <<", ";
}
cout<< " ], ";
}
cout<< endl;
return 1;
}
Your code fails for this input:
string ** input = new string*[8];
string s0[3] = { "t", "x", "y" };
string s1[3] = { "g", "h", "i" };
string s2[3] = { "r", "s", "t" };
string s3[3] = { "o", "m", "n" };
string s4[3] = { "y", "h", "w" };
string s5[3] = { "e", "z", "v" };
string s6[3] = { "a", "b", "c" };
string s7[3] = { "f", "b", "e" };
input[0] = s0;
input[1] = s1;
input[2] = s2;
input[3] = s3;
input[4] = s4;
input[5] = s5;
input[6] = s6;
input[7] = s7;
list<list<string>>result = merge_contact(input, 8);
Looks like this was expected to be written in Python..
# Build reverse hash of lists using content as key and array of index as value
datamap = {}
for i,j in enumerate(data):
for k in j:
if k in datamap:
datamap[k] = datamap[k] + [i]
else:
datamap[k] = [i]sets = [set(i) for i in datamap.values() if i]
# Now merge/union the intersecting sets
sets = [set(i) for i in datamap.values() if i]
merged = True
while sets and merged:
first, rest = sets[0], sets[1:]
sets = []
merged = False
for x in rest:
if first.intersection(x):
first = first.union(x)
merged = True
else:
sets.append(x)
sets.append(first)
res = [list(a) for a in sets]
print res
# Build reverse hash of lists using content as key and array of index as value
# hashmap with int won't work cause dup key get overwritten immediately.
datamap = {}
for i,j in enumerate(data):
for k in j:
if k in datamap:
datamap[k] = datamap[k] + [i]
else:
datamap[k] = [i]
print datamap
# Now merge/union the intersecting sets
sets = [set(i) for i in datamap.values() if i]
merged = True
while sets and merged:
first, rest = sets[0], sets[1:]
sets = []
merged = False
for x in rest:
if first.intersection(x):
first = first.union(x)
merged = True
else:
sets.append(x)
sets.append(first)
print "res : %s"%sets
res = [list(a) for a in sets]
print res
contacts = [["John", "john@gmail.com", "john@fb.com"],
["john123", "+5412312", "john123@skype.com"],
["john1985", "+5412312", "john@fb.com"],
["Dan", "dan@gmail.com", "+1234567"],
["Dan123", "dan@gmail.com", "+1876234"],
["DannyBoy", "dan@fb.com", "+1234567"],]
def overlaps(c1, c2):
return (c1 & c2) != set([])
contacts = [set(c) for c in contacts]
merged_contacts = []
while contacts:
group = [contacts.pop(0)]
for existing in group:
for contact in contacts:
if overlaps(existing, contact):
group.append(contact)
contacts.remove(contact)
merged_contacts.append(group)
print merged_contacts
Here is the swift implementation
func filterContacts(contacts: [[String]]) -> [[Int]]? {
if contacts.count == 0 {
return nil
}
var result = [[Int]]()
var added = [Int]()
for i in 0..<contacts.count-1 {
var map = [String: Int]()
for str in contacts[i] {
map[str] = 1
}
for j in i+1..<contacts.count {
var unique = true
for element in contacts[j] {
if map[element] != nil {
unique = false
break
}
}
if unique {
if !added.contains(i) {
result.append([i])
added.append(i)
}
} else {
if added.contains(i) {
for k in 0..<result.count {
if result[k].contains(i) {
if !added.contains(j){
result[k].append(j)
added.append(j)
}
break
}
}
} else if added.contains(j) {
for k in 0..<result.count {
if result[k].contains(j) {
if !added.contains(i) {
result[k].append(i)
added.append(i)
}
break
}
}
} else {
result.append([i, j])
added.append(i)
added.append(j)
}
}
}
}
return result
}
Swift implementation
func filterContacts(contacts: [[String]]) -> [[Int]]? {
if contacts.count == 0 {
return nil
}
var result = [[Int]]()
var added = [Int]()
for i in 0..<contacts.count-1 {
var map = [String: Int]()
for str in contacts[i] {
map[str] = 1
}
for j in i+1..<contacts.count {
var unique = true
for element in contacts[j] {
if map[element] != nil {
unique = false
break
}
}
if unique {
if !added.contains(i) {
result.append([i])
added.append(i)
}
} else {
if added.contains(i) {
for k in 0..<result.count {
if result[k].contains(i) {
if !added.contains(j){
result[k].append(j)
added.append(j)
}
break
}
}
} else if added.contains(j) {
for k in 0..<result.count {
if result[k].contains(j) {
if !added.contains(i) {
result[k].append(i)
added.append(i)
}
break
}
}
} else {
result.append([i, j])
added.append(i)
added.append(j)
}
}
}
}
return result
}
Here's a plain and simple JS solution. Spiffed up the input a bit to create more complexity.
var input = [[ "John", "john@gmail.com", "john@fb.com"],
[ "Dan", "dan@gmail.com", "+1234567"],
[ "john123", "+5412312", "john123@skype.com"],
[ "john1985", "+5412312", "john@fb.com"],
[ "Billy Bob", "+3453555", "john123@skype.com"],
[ "Sandra", "+224551", "sandra@gmail.com"],
[ "Gina", "+3453555", "gina@skype.com"],
[ "Virgil", "+224551", "virgil@skype.com"]];
var contacts = (function(){
var nodes = {};
function Node (index) {
this.visited = false;
this.linked = [];
this.index = index;
}
Node.prototype.linkTo = function (node) {
this.linked.push(node);
};
Node.prototype.isVisited = function () {
return this.visited;
};
Node.prototype.getCluster = function (output) {
var i = 0;
var linkedLength = this.linked.length;
output || (output = []);
this.visited = true;
output.push(this.index);
for (i; i < linkedLength; i++) {
if (!nodes[this.linked[i]].isVisited()) {
nodes[this.linked[i]].getCluster(output);
}
}
// make it pretty
return output.sort();
}
return {
findIdenticalContacts: function (input) {
var i, j, row, rowLength, rowEntry, node;
var map = {};
for (i in input) {
row = input[i];
rowLength = row.length;
node = new Node(i);
nodes[i] = node;
for (j = 0; j < rowLength; j++) {
rowEntry = row[j];
if (map[rowEntry]) {
node.linkTo(map[rowEntry]);
nodes[map[rowEntry]].linkTo(i);
}
map[rowEntry] = i;
}
}
for (i in nodes) {
node = nodes[i];
if (!node.isVisited()) {
// doing console out, we can do one extra step to push all these into output
console.log(node.getCluster());
}
}
}
};
}());
contacts.findIdenticalContacts(input);
var input = [[ "John", "john@gmail.com", "john@fb.com"],
[ "Dan", "dan@gmail.com", "+1234567"],
[ "john123", "+5412312", "john123@skype.com"],
[ "john1985", "+5412312", "john@fb.com"],
[ "Billy Bob", "+3453555", "john123@skype.com"],
[ "Sandra", "+224551", "sandra@gmail.com"],
[ "Gina", "+3453555", "gina@skype.com"],
[ "Virgil", "+224551", "virgil@skype.com"]];
var contacts = (function(){
var nodes = {};
function Node (index) {
this.visited = false;
this.linked = [];
this.index = index;
}
Node.prototype.linkTo = function (node) {
this.linked.push(node);
};
Node.prototype.isVisited = function () {
return this.visited;
};
Node.prototype.getCluster = function (output) {
var i = 0;
var linkedLength = this.linked.length;
output || (output = []);
this.visited = true;
output.push(this.index);
for (i; i < linkedLength; i++) {
if (!nodes[this.linked[i]].isVisited()) {
nodes[this.linked[i]].getCluster(output);
}
}
// make it pretty
return output.sort();
}
return {
findIdenticalContacts: function (input) {
var i, j, row, rowLength, rowEntry, node;
var map = {};
for (i in input) {
row = input[i];
rowLength = row.length;
node = new Node(i);
nodes[i] = node;
for (j = 0; j < rowLength; j++) {
rowEntry = row[j];
if (map[rowEntry]) {
node.linkTo(map[rowEntry]);
nodes[map[rowEntry]].linkTo(i);
}
map[rowEntry] = i;
}
}
for (i in nodes) {
node = nodes[i];
if (!node.isVisited()) {
// doing console out, we can do one extra step to push all these into output
console.log(node.getCluster());
}
}
}
};
}());
contacts.findIdenticalContacts(input);
Create a Node for each Entry with the following structure:
- SK July 12, 2015Node {
List<Strings> info
int index
boolean visited
}
For every string entry in "info", hash the Node into the Hashtable. Then, for each node, run DFS. Add all the Nodes to a visited list, and make a set out of these nodes. Next, run DFS for another Node. We know that if it is visited, then it already exists in the set. Keep running DFS until we exhaust all Nodes.