Amazon Interview Question
Testing / Quality AssurancesRamu:
I think Linked List is not efficient because, how would you get the a particular line directly; you need to traverse the from the starting of the list to get it.
I was using map initially. But I changed my mind now. I think Linked List should be better. If you use map, it is easy to find paticular line. However, when you want to delete or add a line, you have to change all the line numbers afterward. Actually, when you edit in a Notepad, you usually don't use line number. You just move from the first line and sequently search until the line you want to edit. This is exactly what you do with a Linked List.
Yes, doubly linked lists should work!
Also, if we are to maintain line numbers, on inserting a new line somewhere in between, we need to update the line numbers on all the lines following it. This is linear ofcourse.
We could use an index on line numbers. So we could retrieve the line in constant time from this index. However, on inserting a new line we need to update the index. This is linear again.
I would probably design a class with following minimal properties
1. a string to represent the name of the notepad
2. a hash map with key as line number and value as the entire string.
Methods for the Notepad class:
1. edit(int line_number): this function should pull up the string from hash map based on the line number and re-write the entry.
Above proposed method is just a skeleton, there could be many more additions to it.
I would suggest that we use a linked list to store the element that is strings. Then we would also store the strings in a suffix tree. This tree not only has the words but also has the value that indicates the position of the string stored in the linked list. Thus it would become easy to search a string and also store the data.
This is a memory consuming solution but effective in actions like cut, copy, paste and also in searching in notepad.
I do not understand why one should load the whole content in a map (hashtable or tree etc). Even though it will give you time-efficiency but think of the memory requirement. Imagine you want to view a production-log file of huge size, say 4GB. Do you think you can load that in RAM. In fact Microsoft Notepad suffers this size problem.
My proposal is to trade-off this time-efficiency at the cost of handling unlimited size files. To do that, I'll keep the current cursor position in a variable. As you type, the cursor moves and the variable value increases.
Moreover to support other functionalities like, replace, insert, delete, append (similar to insert) and undo, I would like to have a data structure say, action. An 'action' might have three attributes - positionForAction, stringValue and actionType. So for invoking each method like replace, insert, delete etc you pass an action instance as method argument.
For undo, we need to store previous state for each action. One way to deal this is to store the undo action in a stack for each action. For example, if you delete char 'P' at location 1243, then create an action instance (positionForAction=1234, string='P', actionTpye=insert) and store it in a stack.
Search is another must-have feature and for that I would implement a good string pattern search algorithm such as Knuth-Morris-Pratt Algorithm.
ROPE data structure is used generally for the text editors
- Ashupriya July 11, 2012