## Google Interview Question

SDE1s**Country:**United States

Step 2 is linear time, btw: we just run a depth-first search, which is linear because we have O(N) edges.

Why we should remove the edge from 1.

And How your solution work for sample test case 2.

please elaborate

For sample2: there is only one connected component, but no pointer points to node 1, so we need to change one node in this case.

In summary:

1. Find connected components

2. Categorize connected components into three:

1). component that doesn't contain node1

2). component that contain node 1, but doesn't have a node point to it

3). component that contain node 1, but do have a node point to it

3. so the total number is:

1) + 2)

I think this problem can be reduced to finding a spanning dfs forest with minimum number of trees.

1. First, we'll create a new graph by reversing all the edges in original input graph, this graph will simply be denoted by G. Notice that all the vertices reachable from 1 in G are "good vertices".

2. We'll build a stack of the roots of the dfs forest which is the result of running dfs on G starting from the vertex 1 and when you need to select a new root, select the minimum possible (notice that if all the vertices are reachable from 1 then instead of a forest we have a single tree). Whenever we encounter a new root throughout the dfs run we'll just push it to the stack.

3. We'll run dfs again on G but this time, instead of starting from 1, we'll use the roots stack from (2) whenever we need a new root to traverse from (roots that were already encountered throughout the dfs run will just be popped and ignored). Throughout this dfs run we'll yet again build a root stack of the roots in the dfs forest (for this dfs run).

4. The resulting root stack from step (3) should contain the minimum number of trees in the dfs forest for the graph G (considering the special structure of G - exactly 1 incoming edge to each vertex). The result is the number of roots in the stack from (3) minus 1 if 1 is a root as well.

Complexity: Two dfs runs - O(n) (the number of vertices and edges is O(n)).

Code: snipt.org/BihG4

glebstepanov1992, in my code I count from 0 to n-1 instead of from 1 to n. So the input for the example above should be {1,2,3,1,5,4} which yields the output 2 as desired.

Anyway, Julian offered a much more elegant solution below whose correctness is based on the following claims:

1. Let C be a (weakly) connected component in a directed graph. Suppose the incoming degree of each vertex in C is exactly 1. Then there exists some vertex r in C from for which every other vertex v in C is reachable from (in the directed graph). In other words, r is a root of C. This can be proved using the correctness of the algorithm for finding a root in a directed graph.

2. Let C be a (weakly) connected component in a directed graph. Suppose the incoming degree of every vertex in C, except some vertex r, is exactly 1 while r's incoming degree is 0. Then r is a root of C in the directed graph. This can be proved by simply determining the orientation of each edge in the undirected path from r to every other vertex in C (using the fact that the incoming degree is exactly 1 for every other vertex).

Removing the outgoing edge from the vertex 1 means that the vertex 1 has an incoming degree of 0 in the edge reversed graph while all the other vertices have an incoming degree of exactly 1. The claims above imply that by finding the number of (weakly) connected components in the new directed graph we'll be able to determine the minimum amount of roots in a dfs spanning forest where 1 is a root as well. So the answer would be the number of connected components minus 1 (for the component of the vertex 1) as Julian noted.

It is so simple why people are even using any algorithms

```
#include <stdio.h>
int main() {
int n,i,ans=0;
scanf("%d",&n);
int a;
int c[1000]={0};
for(i=0;i<n;i++){
scanf("%d",&a);
c[a-1]=1;
}
for(i=1;i<1000;i++){
if(c[i]==1)
ans++;
}
printf("%d",ans);
return 0;
}
```

1. build graphs: label -> Node

2. remove all nodes point to(directly or in-directly) #1

3. count remaining independent graphs

```
import java.util.*;
class GoodNodes{
public static void main(String[] args){
//int[] array = {1,2,3,4,5};
int[] array = {5,5,5,5,5};
GoodNodes s = new GoodNodes();
int rst = s.needChange(array);
Helper.p(rst);
}
int needChange(int[] array){
//1. build graphs: label -> Node
Map<Integer, Node> map = buildMap(array);
//remove all nodes point to(directly or in-directly) #1
Set<Node> remaining = removeGoodNodes(map);
//count remaining independent graphs
int rst = countNumOfGraphs(remaining);
return rst;
}
Map<Integer, Node> buildMap(int[] array){
Map<Integer, Node> map = new HashMap<Integer, Node>();
for(int i = 0; i < array.length; i++){
int cur = i + 1;
int to = array[i];
if(cur == to){
Node node = map.containsKey(cur)? map.get(cur):new Node(cur);
map.put(cur, node);
}else{
Node curNode = map.containsKey(cur)? map.get(cur):new Node(cur);
Node toNode = map.containsKey(to)? map.get(to): new Node(to);
curNode.out = toNode;
toNode.ins.add(curNode);
map.put(cur, curNode);
map.put(to, toNode);
}
}
return map;
}
Set<Node> removeGoodNodes(Map<Integer, Node> map){
Node origin = map.get(1);
Set<Integer> visited = new HashSet<Integer>();
LinkedList<Integer> q = new LinkedList<Integer>();
visited.add(1);
q.add(1);
while(!q.isEmpty()){
int cur = q.remove();
Node node = map.get(cur);
for(Node n: node.ins){
int label = n.label;
if(!visited.contains(label)){
visited.add(label);
q.add(label);
}
}
}
Set<Node> rst = new HashSet<Node>();
for(int i = 1; i <= map.size(); i++){
if(!visited.contains(i)){
rst.add(map.get(i));
}
}
return rst;
}
//count piece of parts
int countNumOfGraphs(Set<Node> nodes){
int rst = 0;
//bfs
Set<Node> visited = new HashSet<Node>();
for(Node node: nodes){
if(!visited.contains(node)){
rst++;
LinkedList<Node> q = new LinkedList<Node>();
visited.add(node);
q.add(node);
while(!q.isEmpty()){
Node cur = q.remove();
Node out = cur.out;
if(out != null && nodes.contains(out) && !visited.contains(out)){
visited.add(out);
q.add(out);
}
for(Node in : cur.ins){
if(nodes.contains(in) && !visited.contains(in)){
visited.add(in);
q.add(in);
}
}
}
}
}
return rst;
}
class Node{
int label;
Node out;
List<Node> ins;
public Node(int label){
this.label = label;
this.ins = new ArrayList<Node>();
}
}
}
//TEST
//1,2,3,4,5 => 4
//5,5,5,5,5 => 1
```

```
public class GoodBadNodes {
public static void main(String... args) {
System.out.println(goodNodes(new int[]{5, 3, 1, 5, 2}));
}
static int goodNodes(int[] points_to) {
//there is one "good" community and some "bad" comunities
//the answer to the problen is the number of isolated bad communities.
//or at least i suppose so...
Map<Integer, Set<Integer>> communities = new HashMap<>();
communities.put(1, new HashSet<>());
for (int i = 1; i < points_to.length; i++) {
Set<Integer> community = communities.get(points_to[i]);
if (community == null) {
community = new HashSet<>();
communities.put(points_to[i], community);
}
community.add(i);
}
//make communities larger until they don't change
boolean changed;
do {
changed = false;
for (Iterator<Integer> i = communities.keySet().iterator(); i.hasNext(); ) {
int index = i.next();
if (index == 1) {
continue;
}
Set<Integer> community = communities.get(index);
//if points to itself - continue
if (points_to[index - 1] == index) {
continue;
}
Set<Integer> newCommunity = communities.get(points_to[index - 1]);
if (newCommunity != null) {
//make old community members to point to the new community
for (Integer member : community) {
points_to[member] = points_to[index - 1];
}
newCommunity.addAll(community);
i.remove();
changed = true;
}
}
} while (changed);
return communities.size() - 1;
}
}
```

}

1. Find all good nodes from the begging.( BFS or DFS)

2. Then Find all strong connected components. From those node, who arent good.

Answer uis a number of SCC.

keep a map M of lists such that map.get(i) = lists of nodes that points to (i) minus the i-th node;

also keep a set S of good nodes, initialize with the tail node;

then starting from the list of nodes that point to the tail node, proceed the lists and mark every node a good node (put it in the set S); repeat this step with nodes in the list;

finish when every node in the list is already a good node or the list is empty;

```
public static int nChangesToMakeAllNodesGood(List<Integer> nodes_list)
{
if(nodes_list.isEmpty())
return 0;
// list of marked items
List<Integer> visited_list = new ArrayList<Integer>(nodes_list.size());
for(int i = 0; i < visited_list.size(); ++i)
visited_list.set(i, -1);
int ncomponents = 1, npass = 0;
visited_list.set(0, 0);
for(int i = 0; i < nodes_list.size(); ++i)
{
if( isNewComponent(nodes_list, visited_list, i, npass++) )
++ncomponents;
}
return ncomponents-1;
}
public static Boolean isNewComponent(List<Integer> nodes_list, List<Integer> visited_list, int i, int cur_pass)
{
// delete link from first node: if first node, then it's an old component and return
if(i == 0)
return false;
// found node visited on previous passes: old component
if(visited_list.get(i) > 0 && visited_list.get(i) != cur_pass )
return false;
// found node visited in this pass: new component
if(visited_list.get(i) == cur_pass)
return true;
// current node hasn't been visited: mark as cur_pass
visited_list.set(i, cur_pass);
return isNewComponent(nodes_list, visited_list, nodes_list.get(i), cur_pass);
}
```

I think this one works. I haven't studied much Graph theory but this was what came to my mind.

```
def find_root(arr, i):
if(arr[i] == i):
return i
return find_root(arr, arr[i])
def count_max_nodes(arr):
for i in range(len(arr)):
arr[i] = arr[i] - 1
temp = {}
for i in range(1,len(arr)):
root = find_root(arr,i)
if root != 0:
temp[root] = 1
print(len(temp))
count_max_nodes([1,2,3,4,5])
count_max_nodes([5,5,5,5,5])
count_max_nodes([1,5,1,2,5])
```

1) assign each node with its idx, parent, rank(initially will be '0')

example. for input 1, 2, 3, 3, 5 , it will be {[1, 1, 0], [2, 2, 0], [3, 3, 0], [4, 3, 0], [5, 5, 0]}

2) go through the list and assign rank by finding parent of each node

for the above input, it will be {[1, 1, 1], [2, 2, 1], [3, 3, 2], [4, 3, 0], [5, 5, 0]}

3) Now go through the list and count non zero rank nodes except first node(its already tail node).

for the above input, it will be 2, 3, 5 with non zero ranks, so the count is 3.. you need to change nodes to make all nodes to be good node.

Based on @Julian's idea, the solution can be based on union-find method. union all connected nodes. count total groups number M. if there's a group which the good node is the root, than return M-1, else return M.

An optimised union-find algorithms takes almost o(n) to union all nodes. So the time and space complexity of this algorithms is o(n)

```
class Solution{
int[] list=null;
int count=0;
public int getMinNodesToChange(int[] nodes){
int N=nodes.length;
count=N;
list=new int[N+1];
for (int i=1; i<=N; i++) list[i]=i;
int flag=0;
for (int i=1; i<=N; i++){
if (nodes[i-1]==1) flag=1;
union (i, nodes[i-1]);
}
return count - flag;
}
public void union(int i, int j){
int rootI=findRoot(i);
int rootJ=findRoot(j);
if (rootI==rootJ) return;
list[rootJ]=rootI;
count--;
}
public int findRoot(int i){
while (list[i]!=i){
int tmp=list[i];
list[i]=list[tmp];
i=tmp;
}
return i;
}
}
```

that was just max 15-20 line code .. dont know y u guys writing large program

#include<iostream>

using namespace std;

int fun(int arr[],int k,int []);

int main()

{

int n;

cin>>n;

int arr[n+1]={0},ar[n+1]={0};

for(int i=1;i<=n;i++)

cin>>arr[i];

arr[1]=-1;

int sum=0;

for(int i=2;i<=n;i++)

if(fun(arr,i,ar)==-1)

sum+=1;

cout<<sum;

return 0;

}

int fun(int arr[],int k,int ar[])

{

if(arr[k]==-1 || arr[k]==-2) return -2;

if(ar[k]>0) return -1;

ar[k]+=1;

arr[k]=fun(arr,arr[k],ar);

return arr[k];

}

```
#include<iostream>
#include<vector>
#include<map>
using namespace std;
#define pb push_back
vector<int> v[1000];
int r[1000]={0};
int vis[1000]={0};
int dfs(int x)
{
// cout<<"x "<<x<<endl;
vis[x]=1;
if(!vis[v[x][0]])
{
//cout<<"here\n";
int z=dfs(v[x][0]);
if(z==-1)
return x;
else
{
r[x]=z;
return z;
}
}
else
{
//r[x]=r[i];
//cout<<"returning "<<x<<" = "<<r[v[x][0]]<<endl;
r[x]=r[v[x][0]];
return r[v[x][0]];
}
return -1;
}
int main()
{
int n;
cin>>n;
int i;
//v.resize(n+1);
for(i=1;i<=n;i++)
r[i]=i;
for(i=1;i<=n;i++)
{
int x;
cin>>x;
v[i].pb(x);
}
for(i=1;i<=n;i++)
{
if(!vis[i])
{
// cout<<"For i "<<i<<endl;
dfs(i);
}
}
/*
for(i=1;i<=n;i++)
cout<<r[i]<<" ";
cout<<endl;*/
int ch=1;
map<int,int> m;
for(i=2;i<=n;i++)
{
if(r[i]!=ch)
{
m[r[i]]++;
}
}
cout<<m.size()<<endl;
}
```

```
int good_graph(vector<int> nodes)
{
int len=nodes.size();
int *visited = new int[len];
int *good = new int[len];
memset(visited,0,sizeof visited);
good[0]=1;
visited[1];
int ans=0;
for(int i = 0; i < len; i++)
{
int p=nodes[i];
int t=i;
while(1)
{
if(p==0)
break;
if(good[p]==0 && visited[p]==1)
{
ans++;
good[p]=1;
break;
}
//good[p]=1;
visited[p]=1;
p=nodes[p];
nodes[t]=0;
t=p;
}
}
return ans;
}
int main()
{
int n;
cout<<"enter no. of nodes: ";
cin>>n;
//adj = new list<int>[n];
vector<int> nodes(n,0);
cout<<"enter for each node the direction of edge\n";
for(int i=0;i<n;i++)
{
cin>>nodes[i];
nodes[i]-=1;
}
int ans=good_graph(nodes);
cout<<"\nminimum change of pointers\n";
cout<<ans<<endl;
return 0;
```

}

```
int good_graph(vector<int> nodes)
{
int len=nodes.size();
int *visited = new int[len];
int *good = new int[len];
memset(visited,0,sizeof visited);
good[0]=1;
visited[1];
int ans=0;
for(int i = 0; i < len; i++)
{
int p=nodes[i];
int t=i;
while(1)
{
if(p==0)
break;
if(good[p]==0 && visited[p]==1)
{
ans++;
good[p]=1;
break;
}
//good[p]=1;
visited[p]=1;
p=nodes[p];
nodes[t]=0;
t=p;
}
}
return ans;
}
int main()
{
int n;
cout<<"enter no. of nodes: ";
cin>>n;
//adj = new list<int>[n];
vector<int> nodes(n,0);
cout<<"enter for each node the direction of edge\n";
for(int i=0;i<n;i++)
{
cin>>nodes[i];
nodes[i]-=1;
}
int ans=good_graph(nodes);
cout<<"\nminimum change of pointers\n";
cout<<ans<<endl;
return 0;
}
```

```
int good_graph(vector<int> nodes)
{int len=nodes.size();
int *visited = new int[len];
int *good = new int[len];
memset(visited,0,sizeof visited);
good[0]=1;
visited[1];
int ans=0;
for(int i = 0; i < len; i++)
{
int p=nodes[i];
int t=i;
while(1)
{
if(p==0)
break;
if(good[p]==0 && visited[p]==1)
{
ans++;
good[p]=1;
break;
}
//good[p]=1;
visited[p]=1;
p=nodes[p];
nodes[t]=0;
t=p;
}
}
return ans;
}
int main()
{
int n;
cout<<"enter no. of nodes: ";
cin>>n;
//adj = new list<int>[n];
vector<int> nodes(n,0);
cout<<"enter for each node the direction of edge\n";
for(int i=0;i<n;i++)
{
cin>>nodes[i];
nodes[i]-=1;
}
int ans=good_graph(nodes);
cout<<"\nminimum change of pointers\n";
cout<<ans<<endl;
return 0;
```

}

(1) Scan the list, for each node U if U points to V and V is not U, mark node U (there's an out going edge from U)

(2) Scan the list and count the number of node V that is not marked in (1) and that is not the tail.

This doesn't work because we don't need to modify every edge that initially has no path to node 1. It's possible to imagine a situation, for instance, where all nodes except node 1 form one giant cycle. If you now modify just any one node in the cycle to point to node 1, you will have a path for every node. The answer in that case would be 1 rather than N - 1.

Create a graph. First run strong connected component algorithm, in the SCCG count nodes without outgoing edges.

i think that we need strong connected components, because if we want to make set of nodes good, they must be interconnected. We have to choose such node , that will be connected to all other in set.

Consider the case where the graph looks like one big tree rooted at node 1, with an edge from each child to its parent. The answer is 0 because node 1 can be reached from all other nodes and there is therefore no need to change any of the edges. However, the entire tree is clearly not just one strongly connected component. You can see from this example that detecting SCCs doesn't give you the answer.

Most answers here over-complicate the problem to graph algorithms. An O(n) time solution with O(n) space complexity is given below. The solution associates timestamps with (threads of) nodes and uses these timestamps to determine whether the encountered node is undiscovered, good or bad. No actual nodes are created all the data is maintained in two arrays of size n.

```
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashSet;
public class GoodNodes {
public static void main(String [] argv) throws Exception {
int [] nodes = readNodes(new BufferedReader(new InputStreamReader(System.in)));
int changeCnt = getNumRedirects(nodes);
System.out.println("Result is " + changeCnt);
}
public static int getNumRedirects(int [] nodes){
int n = nodes.length;
if(n < 1) return 0;
int [] timestamps = new int[n];
int ts = 1;
HashSet<Integer> goodTimes = new HashSet<>();
timestamps[0] = ts;
int ret = 0 ;
for(int i = 1; i < n ; i++){
if(timestamps[i] <= 0) {
ts++;
int c = i;
while(timestamps[c] <=0){
timestamps[c] = ts;
c = nodes[c];
}
if(timestamps[c] < ts){
if(goodTimes.contains(timestamps[c]) ){
goodTimes.add(ts);
}
} else { // equals
ret++;
}
}
}
return ret;
}
private static int [] readNodes(BufferedReader ps) throws Exception {
int n = readInt(ps);
int [] ret = new int[n];
for(int i = 0; i < n ; i++){
ret[i] = readInt(ps)-1;
}
return ret;
}
private static int readInt(BufferedReader ps) throws Exception {
return Integer.parseInt(ps.readLine().trim());
}
}
```

1. We just need to detect cycles and trees, later we need just count them as in cycle we can change any node to point to first node and in tree we need to change only root node to pointing to first node, the alg quite similar to union find

```
public static void main(String[] args) {
int[] nodes = {0,2,3,4,0};
//int[] nodes = {5, 2, 3, 4, 1, 0, 0};
int[] clusters = new int[nodes.length];
for (int i = 0; i < clusters.length; i++) {
clusters[i] = -1;
}
for (int i = 0; i < clusters.length; i++) {
if (clusters[i] == -1) {
int s = mark(i, i, clusters, nodes);
if (s != i) {
for (int j = 0; j < clusters.length; j++) {
if (clusters[j] == i) {
clusters[j] = s;
}
}
}
}
}
//System.out.println(Arrays.toString(clusters));
int[] count = new int[nodes.length];
int sum = 0;
for (int cluster : clusters) {
if (count[cluster] == 0) {
count[cluster] = 1;
sum++;
}
}
//System.out.println(Arrays.toString(count));
if( clusters[0] == 0 ){
sum--;
}
System.out.println(sum);
}
public static int mark(int cluster, int i, int[] clusters, int[] nodes) {
if (clusters[i] == -1) {
clusters[i] = cluster;
if (nodes[i] != i) {
return mark(cluster, nodes[i], clusters, nodes);
} else {
return i;
}
} else {
return clusters[i];
}
```

}

1. Remove the edge (pointer) from 1.

- Julian January 29, 20142. Find connected components.

3. The answer is the number of connected components - 1 (i.e. all the connected components except for the one that contains 1).