## Facebook Interview Question

Software Engineer / Developers**Country:**United States

**Interview Type:**In-Person

If there are 'n' papers in total, this problem can be solved in O(n) with space complexity of O(n). Note that, h-index can be between 0 to n. Say if the h-index is 10, this means, there has to be 10 papers with citation count >= 10. So if we can find out the number of papers with citations >=X for every X (and store it in an array C) where X ranges between 0 to n, then by scanning the count array C from the right to left, we can find the h-index at index i where i == C[i].

Pseudocode:

input array A of length n.

- init array C[0] to C[n] with 0

- foreach p in A, if p >= n, c[n]++; else c[p] +=1

- for i=n-1 to 0, c[i]=c[i]+c[i+1]

- for i=n to 0, if c[i] == i return i

Binary search for the first part as mentioned by @gnahzy:

public static void main(String[ ] args) {

int [ ] sortedCitation = {21, 17, 12, 2, 2, 1};

System.out.println("The h-index value is: " + getHindexFromSorted(sortedCitation));

}

private static int getHindexFromSorted(int[ ] citation) {

int low = 0; int high = citation.length - 1;

int idx = (low+high)/2;

while(low <= high) {

if(citation[idx] >= idx + 1) {

low = idx + 1;

}

else {

high = idx - 1;

}

idx = (low+high)/2;

}

return idx + 1;

}

@Cole's idea in Java:

private static int getHindexFromUnsorted(int[] citation) {

int[] hindex = new int[citation.length + 1];

for (int i = 0; i < citation.length; i++) {

if (citation[i] >= citation.length) {

hindex[citation.length]++;

} else {

hindex[citation[i]]++;

}

}

int count = 0;

for (int i = hindex.length - 1; i > 0; i--) {

if (i >= count) {

count += hindex[i];

}

}

return count;

}

1. we can solve this use binary search, which is O(logN).

2. implement @Cole 's idea in python, which is O(n)

```
def get_hindex_unsorted2(array):
# the trick is the h index can not be greater than either the size of the array
# or the max citation in the paper list.
max_hindex = min(len(array), max(array))
# citations[i] = the number of paper with citation being i
# for citation > max_hindex, we will put it into citations[max_hindex]
citations = [0] * (max_hindex + 1)
for x in array:
citations[min(x, max_hindex)] += 1
# paper count = the number of papers with citations greater than i
paper_count = 0
for i in range(max_hindex, -1, -1):
paper_count += citations[i]
if paper_count >= i:
# we find more than i papers with citations greater than i
return i
```

```
def get_hindex_sorted(array):
return get_hindex_helper(array, 0, len(array) - 1)
def get_hindex_helper(array, left, right):
assert left <= right
if left == right:
return min(len(array) - left, array[left])
mid = (left + right)//2
n = len(array) - mid
m = array[mid]
#there are n papers whose citations are at least m.
if n == m:
return n
elif m > n:
return get_hindex_helper(array, left, mid)
else: # n > m
return get_hindex_helper(array, mid + 1, right)
```

1. binary-search (O(log(n)). If citations[i] >= i then h >= i (if array's in descending order).

2. Here's a O(n) time & space solution in ruby. The trick is you can ignore citation-counts larger than n.

- Cole September 07, 2012