Citigroup Interview Question for Financial Software Developers


Country: United States




Step 1: Declare a min-heap, say minHeap, of 10 string items, where each node carries a count field; the heap is ordered on this count. Declare a hashmap, say hmap, with the string as its key and the string's count as its value. Declare a 256-entry int array, say chCount, if the file contains only ASCII characters; otherwise use a 65,536-entry array for 16-bit characters.
Step 2: Scan the file character by character, updating chCount for each character. When a complete word has been read, insert it into hmap (or, if it is already there, increment its value by one). Then compare this word against the top element of minHeap: if the top element's count is less than the word's count in hmap and the word is not yet in minHeap, replace the min element with the word; if the word is already in minHeap, just update its count and re-establish the heap property.

Finally, minHeap holds the top 10 words and chCount holds the character counts.
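
A minimal Java sketch along these lines, with one deliberate change: it defers the top-10 selection until all counts are final instead of updating the heap during the scan, which sidesteps stale orderings (file name and whitespace tokenization are assumptions):

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class TopTenWords {
    public static void main(String[] args) throws IOException {
        int[] chCount = new int[256];                 // ASCII character counts
        Map<String, Integer> hmap = new HashMap<>();  // word -> count

        try (BufferedReader in = Files.newBufferedReader(Paths.get("words.txt"))) {
            String line;
            while ((line = in.readLine()) != null) {
                for (char c : line.toCharArray())
                    if (c < 256) chCount[c]++;
                for (String word : line.split("\\s+"))
                    if (!word.isEmpty())
                        hmap.merge(word, 1, Integer::sum);
            }
        }

        // Size-10 min-heap over the final counts: the root is the weakest of
        // the current top 10, so it is the one evicted when a better word arrives.
        PriorityQueue<Map.Entry<String, Integer>> minHeap =
                new PriorityQueue<>(Map.Entry.comparingByValue());
        for (Map.Entry<String, Integer> e : hmap.entrySet()) {
            minHeap.offer(e);
            if (minHeap.size() > 10) minHeap.poll();
        }
        System.out.println(minHeap);
    }
}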

- BVarghese April 21, 2013

a) Keep a hashmap of <string, count> and a min-heap. While reading each word, increment its count in the hashmap and check the head of the min-heap; if the head's count is less than the current word's count, remove the head and add the current word.

b) A hashmap of <letter, count>.

- Dhass April 20, 2013

How can you check that the word you add to the heap after removing the min does not already exist in the heap? For example, if you have only two words, {ab, 6} and {ac, 3}, the min-heap head will be {ac, 3}. If you encounter ab again, you increment it to {ab, 7}; comparing it with {ac, 3}, which is smaller, you remove the head {ac} and insert {ab}, so the heap now contains repeated strings.
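
One way out, sketched below with invented names: keep a HashSet of the heap's current members, and when a member's count rises, remove and re-insert it so the ordering stays valid.

import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Keeps the top k words without ever holding duplicates in the heap.
class TopK {
    private final int k;
    private final Map<String, Integer> counts;
    private final PriorityQueue<String> heap;
    private final Set<String> inHeap = new HashSet<>();

    TopK(int k, Map<String, Integer> counts) {
        this.k = k;
        this.counts = counts;
        this.heap = new PriorityQueue<>(k, Comparator.comparingInt(counts::get));
    }

    // Call after counts.get(word) has been incremented.
    void offer(String word) {
        if (inHeap.contains(word)) {
            heap.remove(word);   // O(k): fine for k = 10
            heap.add(word);      // re-insert so the new count is ordered
        } else if (heap.size() < k) {
            heap.add(word);
            inHeap.add(word);
        } else if (counts.get(word) > counts.get(heap.peek())) {
            inHeap.remove(heap.poll());
            heap.add(word);
            inHeap.add(word);
        }
    }
}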

- spik August 30, 2013

a) Build a heap of word counts.
b) Take an array of 256. Now read every character and increment the respective slot.

- DashDash April 20, 2013

A heap isn't an adequate data structure for counting occurrences. How will you efficiently update occurrence counts as you go through the file? Note that your data structure will need to be much bigger than 10 elements, since you will also need to keep track of words that are not in the top 10 in case later occurrences of those words bring them into the top 10.

- eugene.yarovoi April 20, 2013

Regarding a):
1) Create a trie and count each word in it.
2) Build a min-heap from the counters in the trie.

- Anonymous July 03, 2013

I will read character by character, storing the character counts in an array and the word counts in a map.
I have written the code for a file containing only the letters a, b, and c:

package fileio;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class WordCount {

	public static void processFile() {
		Map<String, Integer> m = new HashMap<String, Integer>();

		BufferedInputStream bis = null;
		try {
			bis = new BufferedInputStream(new FileInputStream("E:/app.txt"));
		} catch (FileNotFoundException e) {
			System.out.println(e);
			return;
		}

		int chars[] = new int[27];           // chars[1..3] hold counts for a, b, c
		StringBuilder sb = new StringBuilder();

		int ch;
		try {
			while ((ch = bis.read()) != -1) {
				switch (ch) {
				case 'a':
				case 'A':
					chars[1]++;
					sb.append((char) ch);
					break;
				case 'b':
				case 'B':
					chars[2]++;
					sb.append((char) ch);
					break;
				case 'c':
				case 'C':
					chars[3]++;
					sb.append((char) ch);
					break;
				default:
					// space, newline or any other character ends the current word
					if (sb.length() > 0) {
						countWord(m, sb.toString());
						sb.delete(0, sb.length());
					}
				}
			}
			// count the last word if the file does not end with a delimiter
			if (sb.length() > 0) {
				countWord(m, sb.toString());
			}
			bis.close();
		} catch (Exception e) {
			System.out.println(e);
		}

		System.out.println("A[" + chars[1] + "]");
		System.out.println("B[" + chars[2] + "]");
		System.out.println("C[" + chars[3] + "]");
		System.out.println(m);

		// select the top 3 words by repeatedly extracting the maximum
		Map<String, Integer> topwords = new LinkedHashMap<String, Integer>();
		for (int i = 0; i < 3 && !m.isEmpty(); i++) {
			String tempStr = null;
			int best = 0;
			Set<String> keys = m.keySet();
			for (String key : keys) {
				if (m.get(key) > best) {
					tempStr = key;
					best = m.get(key);
				}
			}
			topwords.put(tempStr, best);
			m.remove(tempStr);
		}
		System.out.println(topwords);
	}

	private static void countWord(Map<String, Integer> m, String word) {
		if (m.containsKey(word)) {
			m.put(word, m.get(word) + 1);
		} else {
			m.put(word, 1);
		}
	}

	public static void main(String[] args) {
		processFile();
	}
}

- praveenkcs28 April 20, 2013

As the file size is very big, say more than 2 GB, I feel the above code is not very efficient; it will take a long time. Instead of reading a single file, I would split the file into n smaller files based on line count, then run your processing logic in a multithreaded environment, using a fixed thread pool instead of creating many threads. I would use a ConcurrentHashMap instead of HashMap<String, Integer> to store the words and counts.
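
A rough sketch of that setup, assuming the split has already produced a list of chunk files (the names here are invented): a fixed pool counts each chunk into one shared ConcurrentHashMap, whose merge() makes each increment atomic.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentWordCount {
    public static ConcurrentHashMap<String, Integer> count(List<Path> chunks)
            throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        for (Path chunk : chunks) {
            pool.submit(() -> {
                try {
                    for (String line : Files.readAllLines(chunk)) {
                        for (String word : line.split("\\s+")) {
                            if (!word.isEmpty()) {
                                counts.merge(word, 1, Integer::sum); // atomic
                            }
                        }
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return counts;
    }
}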

- cCAACc April 20, 2013

As you say, the file is big, so splitting it into many chunks will not speed up the overall processing much, because the program is I/O-bound. However, splitting it into 2 or 3 can be a good idea, even though the OS is supposed to handle that part.

- nobody April 24, 2013

Use a trie data structure (also called a prefix tree or digital search tree). It is well suited when many words are inserted and looked up.

Insertion/retrieval/deletion time: O(k), where k is the length of the string.
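
A minimal counting-trie sketch for lowercase ASCII words, assuming words are already tokenized:

public class Trie {
    private static class Node {
        Node[] child = new Node[26];
        int count;               // occurrences of the word ending at this node
    }
    private final Node root = new Node();

    public void insert(String word) {       // O(k) in the word length k
        Node n = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (n.child[i] == null) n.child[i] = new Node();
            n = n.child[i];
        }
        n.count++;
    }

    public int count(String word) {         // 0 if the word was never seen
        Node n = root;
        for (char c : word.toCharArray()) {
            n = n.child[c - 'a'];
            if (n == null) return 0;
        }
        return n.count;
    }
}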

- Sudhindra.A April 22, 2013

Partition the file using a uniform hash into N/M files (where N is the file size and M is the size of main memory), so each partition fits in memory.
Read each partition and count its words using a hash table. Then iterate over the hash table and find the top k entries using a min-heap (replace the min item in the heap whenever the current word's count is higher).
Repeat for all the partitions, updating the same heap to keep track of the top-ranked words overall.
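
A sketch of the partition phase under those assumptions (P spill files, file names invented); hashing guarantees all occurrences of a word land in the same partition, so each partition's counts are complete on their own:

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Scanner;

public class Partitioner {
    // Split the input into P spill files by word hash; every occurrence of a
    // given word goes to the same file, so partitions can be counted alone.
    public static void partition(File input, int P) throws IOException {
        PrintWriter[] out = new PrintWriter[P];
        for (int i = 0; i < P; i++)
            out[i] = new PrintWriter(new BufferedWriter(new FileWriter("part-" + i + ".txt")));
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext()) {
                String word = sc.next();
                out[Math.floorMod(word.hashCode(), P)].println(word);
            }
        }
        for (PrintWriter w : out) w.close();
    }
}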

- gen-y-s April 22, 2013

This question can be solved easily in Python, because dictionaries allow strings as keys, so we can just increment a word's count by looking it up directly.

- Anonymous June 18, 2013

If the size is very big, we can use the NIO package with a channel and a buffer, reading the file in chunks whose size is the buffer size.

package com.first;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class FrequencyWord {
	public static void main(String[] args) {

		HashMap<String, Integer> hm = new HashMap<String, Integer>();
		BufferedReader br = null;
		try {
			br = new BufferedReader(new FileReader("E:\\java Preparation\\click me\\Rescued.txt"));
			String sCurrentLine;
			while ((sCurrentLine = br.readLine()) != null) {
				StringTokenizer st = new StringTokenizer(sCurrentLine, " ");
				while (st.hasMoreTokens()) {
					String word = st.nextToken();
					if (hm.containsKey(word)) {
						hm.put(word, hm.get(word) + 1);
					} else {
						hm.put(word, 1);
					}
				}
			}
			br.close();

			System.out.println(hm.size());
			// linear scan over the map for the single most frequent word
			String maxword = null;
			int maxfreq = 1;
			for (Map.Entry<String, Integer> entry : hm.entrySet()) {
				if (entry.getValue() > maxfreq) {
					maxfreq = entry.getValue();
					maxword = entry.getKey();
				}
			}
			System.out.println(maxword + " : " + maxfreq);
		} catch (FileNotFoundException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
	}
}

I have tested it with a file of 50,000 words, and the output was "the" as the most frequent word.
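
Note the code above reads with a plain BufferedReader; a minimal sketch of the chunked NIO reading mentioned at the top (buffer size is an assumption) could look like this:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioChunks {
    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get(args[0]),
                StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(1 << 20);   // 1 MB chunks
            while (ch.read(buf) != -1) {
                buf.flip();
                String chunk = StandardCharsets.US_ASCII.decode(buf).toString();
                // ... tokenize `chunk` here, carrying any word that is split
                // across the chunk boundary over to the next iteration
                buf.clear();
            }
        }
    }
}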

- rathor.rajeev June 30, 2013

Create two hash tables. The first has the 26 letters a-z as keys, with the value field counting the occurrences of each letter in the file; the second is keyed by words. Now we start reading the file: for every letter we read, we update the letter table, and for every word we update the word table. If the word is already present, we increment its value; if not, we create a new entry for the word with a value of one.

- Roshan December 19, 2013