Morgan Stanley Interview Question
Java DevelopersTeam: Risk
Country: United States
Interview Type: Phone Interview
How is Query 3's complexity O(1)? don't you have to find city first in the linked list, which is O(n) itself?
Why not a some kind of set? For instance HashSet. From my point countries can't duplicate. And in this case HashSet should be fine because the first query can be addressed in O(n) in worst case but in average it should be faster than in LinkedList. The same is for states and cities.
Brilliant implementation but as tarun mentioned 3rd query cannot be done in O(1) since it takes O(n) to find the city and O(1) to find the state and country.
So O(n)+O(1) rounds off to O(n)
Great approach,
but for searching the city we will have to traverse through nodes of each country and each state (unless we maintain a reference to all the linked list of cities).
In which case the pointer of city node (1. One that points back to it's state node ) becomes unimportant and can be avoided to reduce space complexity.
1 and 2 can be done in O(1) time using hash tables:
Map<String, ArrayList<String>> countries_ht = new HashMap();
Where key is the country name and the list is the list of the state names
Map<String, ArrayList<String>> states_ht = new HashMap();
Where Key is the state name and the list is the list of the city names, then:
1) find list of states for a country.
List<String> states = countries_ht.get("USA")
2) find list of cities for a state.
List<String> cities = states_ht.get("California")
For the 3. query you need to have reverse lookup tables also using hash tables
Map<String, String> city_to_state_rlt = new HashMap();
Where Key is the city name and the value is the state that the city is in
Map<String, String> state_to_county_rlt = new HashMap();
Where Key is the state name and the value is the country that the state is in
3) find the name of the country and state for a city.
String state = city_to_state_rlt.get("Los Angeles"); // returns "California
String country = state_to_county_rlt.get(state); // returns USA
Realistically though, this sounds like a database entity relation question, where you have entities for country, state and city and query these tables. Most databases are implemented using B-trees, which is a generalization of a binary search tree in that a node can have more than two children. You index the names of the countries, states, and cities for the given queries to search faster.
@oOZz
Solution using hashtables is an obvious one. Did you think about data redundancy in such a design? Its huge! Also synchronization is a big issue if you duplicate/triplicate data at several places.
@Apostle Yes, you have a good point. the question is asking the "best" data structure and that's vague. I gave a solution based on the time complexity.
I don't see any problems with synchronization, which can be handled easily. The data you're talking about is a very small amount. There are only 3200 cities, 812 states, and 196 countries in the world.
Oh! If your data is correct, it is indeed a small one. In that case, any data structure would do!
You should probably clarify with whoever is asking the question but in my experience "best" generally means time complexity and a hash table is frequently the best solution for fast lookup. This was basically the solution I was thinking of. In C++ it would be unordered_map country_states<string, vector<string> >, unordered_map state_cities<string, vector<string>.
It's worth pointing out that with the problem as described the second part of the hash map, i.e. the value as opposed to the key, doesn't actually have to be an array of strings, it could just be a comma or space separated string. That would simplify using the hash maps somewhat at the cost of losing some flexibility in handling the returned data. For example, if the client wished to list a country's states on separate lines they would have to tokenize the returned value themselves.
Symbol Graph is a form of undirected graph ,uses strings to define and refer to vertices.
it has following input format:
1.Vertex names are strings
2. A specified delimiter separates vertex names
3.Each line represents a set of edges, connecting the first vertex name on the line
to each of the other vertices named on the line
4.The number of vertices V and the number of edges E are both implicitly defined
please don't mislead by name symbol graph it is actually undirected graph with some modification to give answer of given type of question.
An M-Way search tree for storing the data and 3 different tries each one for Countries, State, and Cities will do the trick.
Let me brief it...
Structure for Data.
1). A tree node will have a string, a pointer to it's parent and a head pointer to it's childrens list.
2). Root node will have string and parent pointer as NULL. And a head pointer to a list of countries.
3). All country will store it's name in string ,parent pointer to Root node and a head ponter to a list of it's states.
4). same goes for states they will store head ponter of a list of its cities.
Structure For Query
1). We will create a TRIE for all the countries and at the end of each leaf we will store the corresponding address of that country node in our tree.
2). Same we will create 2 more TRIES for the States and Cities.
Query
1). To get all the states for a country we will first search for country in Countries TRIE and get the address of that node and will just print it's children. O(Length(Country Name))
2). same we will do to get all the cities for a state will search in States TRIE, obtain address of node print it's children. O(Length(State Name)).
3). To get the state and country for a city first we search for the city in Cities TRIE obtain address of node. It's Parent will be it's State and it's GrandParent will be it's country in which the city exist. O(Length(City Name))
First build a tree with the following hierarchy
1) Root node is just a dummy node(Probably dont even need this)
2) First level of tree is a list of countries(C1,C2 .. etc)
3) Second level is a list of states for each of the country. For eg S1,S2..Sn are n states(children) of C1 and so on. Each od the states has a parent pointer (to C1)
4) Third level is for cities. Each state in second level has all cities belonging to it as children. Each city also has a parent pointer back to its state.
Noe build an index structure for each of country, state and city (so 3 indexes). The key is the string (name of coutry , state or city) and value is appropriate pointer to the node in the tree.
Now lets answer each of the queries:
1) find list of states for a country.
First track the country node by looking up country index - O(1)
Every child of country node is states of that country. O(s) where s is number of states. Cant do better than this.
2) find list of cities for a state.
Similar to above.
3) find the name of the country and state for a city.
Given a city use, parent pointers to find state and city.
Disadvantage of this method:
1) Storage: Each node stores a parent pointer.
2) No name clash (there cannot be cities of same in 2 countries). This is assumption in problem too, otherwise query 3 is ambiguous.
Just realized my idea is similar to sanjay.rajputcse above. He is using trie and Im using 3 indexes(or hash tables).
One problem with combining all of cities,countries and states in one hash table or one trie is that you cant have same name for a city and state for example. New York is both a state and city (although technically NYC, but you get the idea). Better to keep them seperate. Maybe keep 3 different tries or hash tables
In questions like these, dont forget to ask what is the application for?
If performance is not important as in we are ok with less than a second response time and not saving milli seconds, then just put everything to a db. no coding required. can answer much more complicated queries.
even if the interviewer says performance is critical, you should mention this point to make sure you that you can think practically depending on project needs, deadline etc. atleast you should be aware of this, so you can make the right tradeoff depending on situation.
Map<String, Map<String, List<String>>> countriesStatesCities = new HashMap<String, Map<String,List<String>>>();
List <String> cities = new ArrayList<String>();
cities.add("Surat");
cities.add("Ahmedabad");
cities.add("Baroda");
List <String> cities1= new ArrayList<String>();
cities1.add("Bhopal");
cities1.add("Indore");
Map<String, List<String>> state = new HashMap<String, List<String>>();
state.put("Gujrat", cities);
state.put("MadhyaPradesh", cities1);
countriesStatesCities.put("India", state);
Iterator it = countriesStatesCities.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry)it.next();
System.out.println(pairs.getKey() + " -> " + pairs.getValue());
}
Map<String, Map<String, List<String>>> countriesStatesCities = new HashMap<String, Map<String,List<String>>>();
List <String> cities = new ArrayList<String>();
cities.add("Surat");
cities.add("Ahmedabad");
cities.add("Baroda");
List <String> cities1= new ArrayList<String>();
cities1.add("Bhopal");
cities1.add("Indore");
Map<String, List<String>> state = new HashMap<String, List<String>>();
state.put("Gujrat", cities);
state.put("MadhyaPradesh", cities1);
countriesStatesCities.put("India", state);
Iterator it = countriesStatesCities.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry)it.next();
System.out.println(pairs.getKey() + " -> " + pairs.getValue());
}
Map<String, Map<String, List<String>>> countriesStatesCities = new HashMap<String, Map<String,List<String>>>();
List <String> cities = new ArrayList<String>();
cities.add("Surat");
cities.add("Ahmedabad");
cities.add("Baroda");
List <String> cities1= new ArrayList<String>();
cities1.add("Bhopal");
cities1.add("Indore");
Map<String, List<String>> state = new HashMap<String, List<String>>();
state.put("Gujrat", cities);
state.put("MadhyaPradesh", cities1);
countriesStatesCities.put("India", state);
Iterator it = countriesStatesCities.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry)it.next();
System.out.println(pairs.getKey() + " -> " + pairs.getValue());
Map<String, Map<String, List<String>>> countriesStatesCities = new HashMap<String, Map<String,List<String>>>();
List <String> cities = new ArrayList<String>();
cities.add("Surat");
cities.add("Ahmedabad");
cities.add("Baroda");
List <String> cities1= new ArrayList<String>();
cities1.add("Bhopal");
cities1.add("Indore");
Map<String, List<String>> state = new HashMap<String, List<String>>();
state.put("Gujrat", cities);
state.put("MadhyaPradesh", cities1);
countriesStatesCities.put("India", state);
Iterator it = countriesStatesCities.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry)it.next();
System.out.println(pairs.getKey() + " -> " + pairs.getValue());
}
class Country
{
String countryName;
State slink_country;
Country clink_country;
Country(String countryName)
{
this.countryName=countryName;
}
}
class State
{
String stateName;
City citlink_state;
State slink_state;
State(String stateName)
{
this.stateName=stateName;
}
}
class City
{
String cityName;
City citlink_city;
City(String cityName)
{
this.cityName=cityName;
}
}
Directed Graph should do it .
Have directed edges from a country to states belong to that country and from a state to the cities present in that state.
This will be better for retrieving the data.
Linked Lists should be fine.
- Murali Mohan July 24, 2013A linked list of countries, where each node has two pointers
1. One that points to a linked list of states.
2. The other pointing to the next country node.
Each node in the state linked list will have 3 pointers
1. One pointing back to the it's country node.
2. Another pointing to a linked list of cities
3. The third pointing to the next state node.
Each city node will have 2 pointers
1. One that points back to it's state node
2. The other pointing to the next city node.
Queries 1 and 2 can be addressed in O(n) where n is the size of the country/state/city list
Query 3 can be addressed in O(1) time