There are lots of string in a

Amazon Interview Question for SDE-2s

-2

of 2 votes

22
Answers
There are lots of string in a file. Find the longest string that could be made from the other strings in the file.

Eg.
the
there
after
thereafter
reaf

ans.
thereafter
- JDev April 26, 2013 in India | Report Duplicate | Flag | PURGE
Amazon SDE-2 Algorithm

Email me when people comment.

An error occurred in subscribing you.

Country: India
Interview Type: In-Person

Email me when people comment.

An error occurred in subscribing you.

Comment hidden because of low score. Click to expand.

of 2 vote

Well if I understand it correctly then I think it is asked to find the longest string "in the file" that could be made from the other strings in that same file.

Going with that interpretation, one way to solve this problem would be to first create a sorted (based on string lengths) array of the strings in the file. Then start with the largest string in that array. In this case 'Thereafter'. Get the first character in that string i.e. 'T' and see if there are any other strings that start with it. We find: 'the' and 'there'.

Now we need to evaluate these two separately and see if we can form the entire word. If we go with 'the' the next word we should search for should start with 'R'. And we find 'reaf'. Now 'the' and 'reaf' give us just part of 'thereafter'. But now there is no string that matches the remainder of 'thereafter' (i.e. 'ter'). So we discard 'The'.

Now if we go with 'there', the next string that we should search for should start with 'A'. And we find that we have 'after' that matches with the remainder of the 'thereafter' (i.e. 'after').

A better solution can be worked out for this but this is what came to my mind immediately after reading the problem.

- MM April 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

small modification is needed ..
if we can't able to find a word at some point of time, we have to discard the latest word which we committed ....and go with other alternatives ... recursion is the best way to do this...
And for searching whether the word is present with some letter or not, trie is the best suitable data structure(searching can be done in amortizedO(1))....

If someone gives best psuedo code .. that will be appreciated :)

- bharat April 30, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Question is bit ambiguous coz we can have a questions like how many times we can reuse the given words? or we need to merge at-most 2 words to make the new word?

- hprem991 April 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

Good point. So if we have to use all the words - then result is either the longest word (if it includes all other words) or nothing. If we have to use just some words - it is still the longest word because it includes at least itself.

- chatbot April 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Good point. Forgot to add, assume one word occurs only once.
There could be n number of words that could form a longer word.

Eg.
the
there
after
thereafterheran
reaf
he
ran

ans.
thereafterheran

- JDev April 29, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

Trie will give you the answer. Just need to build a trie

- DashDash April 26, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

How do you use a Trie for this problem?

- DigitalFire April 26, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 votes

Oh I am sorry...Made a mistake in reading the question.

- DashDash April 27, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

public String FindLongestCompositeString(java.io.File StringFile) throws FileNotFoundException, IOException {
        java.io.BufferedReader fin = new java.io.BufferedReader(new java.io.FileReader(StringFile));
        java.util.List<String> inputStrings = new java.util.ArrayList<String>();
        while (fin.ready()) {
            inputStrings.add(fin.readLine().trim());
        }
        return FindLongestCompositeString(inputStrings);
    }

    public String FindLongestCompositeString(java.util.List<String> inWords) {
        java.util.List<String> tempWords = new java.util.ArrayList<String>(inWords);
        java.util.Collections.sort(tempWords, new java.util.Comparator<String>() {
            @Override
            public int compare(String o1, String o2) {
                if (o1.length() == o2.length()) {
                    return o1.compareTo(o2);
                } else if (o1.length() > o2.length()) {
                    return -1;
                } else {
                    return 1;
                }
            }
        });
        while (tempWords.size() > 1) {
            String currentpossible = tempWords.get(0);
            tempWords.remove(0);
            if (findParts(currentpossible, tempWords)) {
                return currentpossible;
            }
        }
        return null;
    }

    public boolean findParts(String inPossible, java.util.List<String> inputStrings) {
        if (inPossible.length() == 0) {
            return true;
        }
        for (String curPart : inputStrings) {
            if (inPossible.startsWith(curPart)) {
                if (findParts(inPossible.substring(curPart.length(), inPossible.length()), inputStrings)) {
                    return true;
                }
            }
        }
        return false;
    }

- miiohau April 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

sort the strings based on length - O(nlogn) descending order
take first string, build Suffix tree and check remaining all remaining strings exists or not -O(n)
consider building suffix tree as constant
now leave first string and build suffix tree for second if all the strings are not exists in first string and repeat the same till the end of the all the strings -O(npow2)

- Anonymous April 30, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

char[] bigWordArr = bigWord.toCharArray();
booean[] visited = new boolean[bigWordArr.length];
for( int i =0; i <bigWordArr.size; i++){
	boolean found = false; int temp;
	for(int j =0; j <bigWordArr.size; j++){
		found = lookUpInTries(bigWord.subString(i,j)); temp =j;
	}
	if(found) {
		for(int j =i; j <=temp; j++){				
                     if(visited[j]) continue; 
		     else visited[j] = true;
		}
	}
}
for( int i =0; i <visited.size; i++){
	if(!visited[i]){ syso("Not a real big word"); break;
}
// build a tries if not availble

- Ashis Kumar May 01, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

for(int j =0; j <bigWordArr.size; j++){
		found = lookUpInTries(bigWord.subString(i,j)); 
		if(found) {
			for(int k =i; k <=j; k++){				
                     		if(visited[k]) continue; 
		     		else visited[k] = true;
			}
		}
	}

- Ashis Kumar May 01, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 2 vote

1. Sort the strings in the file, based on their length.
2. Build a common suffix tree for all these words, in the above sorted order.
3. If a word can be inserted to the suffix tree without adding any direct child to the root node,
then it means that it is a combination of two words that have been already added to the suffix tree.

For eg: if we have inserted 'there' and 'after' to the suffix tree, then for adding 'thereafter' to the suffix tree we don't need to add a direct child to the root node of the suffix tree. So it is a combination of two other words.

Complexity : O(nlogn) considering that building suffix tree would take O(n) time.

- CodePredator May 01, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

How do you know that the second word is also valid? Eg Therefore will also be accepted by your logic.

- alex May 11, 2013 | Flag

Comment hidden because of low score. Click to expand.

of 0 vote

#include<stdio.h>
#include<string.h>
#include<stdlib.h>
main()
{
char a[50],b[20],c[20];
int i,j=0,l=0;
printf("Enter a string:\n");
gets(a);
for(i=0;i<=strlen(a);i++)
{
if(a[i]!=32 && a[i]!='\0')
b[j++]=a[i];
else
{
b[j]='\0';
if(strlen(b)>l)
{
strcpy(c,b);
l=strlen(b);
}
j=0;
}
}
printf("The longest word is:");
puts(c);
}

- Silambarasan MCA May 09, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

can we create a hash table and put all the strings in that(calculate the index by summing up the chars of the strings). Start with the longest string (sorting can be done) or simply with the first string. suppose i have "there", i will first check whether t is present or not, then i will check whether "th" is present or not and so on. complexity will be O(n*m) n=no.of strings m=total no. of chars in a string

- Is_it? June 16, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

- Sort the list, from largest to smallest, and build a hash map
- Scan through sorted list, and for each string split it into two parts (scan through linearly) and check if these two parts are present in the hash map.

- IntwPrep July 21, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

I think a sorting and trie should give the solution
1.Sort the dictionary in ascending order by string size
2.Starting from the smallest word, start inserting into the TRIE
3.If while inserting a word, we have not added any nodes and reach a dead end, take a pointer to the root, and as we insert the new nodes for the longer, move down the TRIE as long as we keep getting a match
4.When we insert the last node of the trie, we still have a matching character from the search, save this word and the length
5. do this till the end of the dictionary
6.Print the longest of the saved word

- Praveen August 12, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 vote

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.PriorityQueue;
import java.util.Scanner;

public class LongestCombinationString {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner scanner = new Scanner(new FileInputStream("src/main/resources/words.txt"));
        SuffixTree suffixTree = new SuffixTree();
        PriorityQueue<String> queue = new PriorityQueue<>((s, t) -> t.length() - s.length());
        while (scanner.hasNext()) {
            String input = scanner.nextLine();
            suffixTree.add(input);
            queue.add(input);
        }
        while (!queue.isEmpty()) {
            String word = queue.poll();
            if(suffixTree.compositionSearch(word)) {
                System.out.println(word.length());
                break;
            }
        }
    }
}


import java.util.HashMap;
import java.util.Map;

public class SuffixTree {
    SuffixTreeNode root;
    public SuffixTree() {
        root = new SuffixTreeNode(true, null, false);
    }

    private boolean compositionSearch(final String reversedInput, int startIndex, boolean compositionSearch) {
        SuffixTreeNode current = this.root;
        while(startIndex < reversedInput.length()) {
            Character ch = reversedInput.charAt(startIndex);
            if (current.getChildren() == null) {
                return false;
            }
            if (current.getChildren().containsKey(ch)) {
                SuffixTreeNode treeNode = current.getChildren().get(ch);
                if (treeNode.isLeaf()) {
                    if (startIndex == reversedInput.length() -1) {
                        return compositionSearch;
                    }
                    boolean compositionFound = compositionSearch(reversedInput, startIndex + 1, true);
                    if (compositionFound) {
                        return compositionFound;
                    }
                }
                startIndex = startIndex + 1;
                current = current.getChildren().get(ch);
            } else {
                return false;
            }
        }
        return false;
    }

    public boolean compositionSearch(String input) {
        return compositionSearch(new StringBuilder(input).reverse().toString(), 0, false);
    }

    public boolean search(final String input) {
        SuffixTreeNode current = this.root;
        for(Character ch: new StringBuilder(input).reverse().toString().toCharArray()) {
            if (current.getChildren() == null) {
                return false;
            }

            if (current.getChildren().containsKey(ch)) {
                current = current.getChildren().get(ch);
            } else {
                return false;
            }
        }
        return current.isLeaf();
    }

    public void add(final String input) {
        if (input.isEmpty()) {
            this.root.setLeaf(true);
        }

        SuffixTreeNode current = this.root;

        for(Character ch: new StringBuilder(input).reverse().toString().toCharArray()) {
            if (current.getChildren() == null) {
                current.setChildren(new HashMap<>());
            }

            if (current.getChildren().containsKey(ch)) {
                current = current.getChildren().get(ch);
            } else {
                SuffixTreeNode temp = new SuffixTreeNode(false, null, false);
                current.getChildren().put(ch, temp);
                current = temp;
            }
        }
        current.setLeaf(true);
    }
}

class SuffixTreeNode {
    private boolean isRoot;
    private Map<Character, SuffixTreeNode> children;
    private boolean isLeaf;

    public void setRoot(boolean root) {
        isRoot = root;
    }

    public Map<Character, SuffixTreeNode> getChildren() {
        return children;
    }

    public void setLeaf(boolean leaf) {
        isLeaf = leaf;
    }

    public boolean isRoot() {
        return isRoot;
    }

    public void setChildren(Map<Character, SuffixTreeNode> children) {
        this.children = children;
    }

    public boolean isLeaf() {
        return isLeaf;
    }

    public SuffixTreeNode(boolean isRoot, Map<Character, SuffixTreeNode> children, boolean isLeaf) {
        this.isRoot = isRoot;
        this.children = children;
        this.isLeaf = isLeaf;
    }
}

- imaniche March 01, 2020 | Flag Reply

Comment hidden because of low score. Click to expand.

-1

of 1 vote

I am thinking of a directed graph.
Given 2 strings, A and B, there exists an edge A -> B iff A \subset B. We do this for every pair of words in the file. Edge weights are always 1, and edges are directed.
This gives an adjacency matrix which is not symmetric.
Now,
Calculate sum of all columns of this matrix, and pick the word corresponding to the maximum sum. This is the word which has the most substrings from within the file.
Eg:
the -> there
the -> thereafter
there -> thereafter
after ->thereafter
thereafter
reaf ->thereafter
Now, if we form a 6x6 adjacency matrix corresponding to the above graph (directed), the sum of the column corresponding to thereafter will have the maximum value.

- tejaswini May 01, 2013 | Flag Reply

Comment hidden because of low score. Click to expand.

of 0 votes

The columns need to store the length of the substring that actually form the string.
If we go with the notation that if the -> thereafter =1 else 0, we wont get the exact solution.
Although the string may contain maximum substring, but it may not be the longest word.
ex.
the
there
after
thereafter
reaf
a
b
c
abc

Your solution will give abc as the answer because it has maximum number of substring(a,b,c), but the correct answer is "thereafter" because it has the maximum length.

- Anonymous May 01, 2013 | Flag

Comment hidden because of low score. Click to expand.

-2

of 2 vote

Does string means valid dictionary word?
If not answer is obvious!!!

- anil2878 April 26, 2013 | Flag Reply

CareerCup

Amazon Interview Question for SDE-2s

Books

Videos

Resume Review

Mock Interviews