I want to parallelise bilinear interpolation with OpenMP so that the input array is read from memory as few times as possible. In the code below, each iteration over i and j of the output array reads and processes the input data selected by the corresponding longitude and latitude values.

input[20][20] - input array containing the data values, e.g. {1, 2, 3, ..., 400}, in 2D
lon[100][100] - longitudinal position (horizontal axis) of each interpolation point of the output array, not necessarily equidistant, e.g. {2.34, 2.65, 2.74, ...}
lat[100][100] - latitudinal position (vertical axis) of each interpolation point of the output array, not necessarily equidistant, e.g. {5.76, 5.92, 6.26, ...}
output[100][100] - array containing the interpolated values

void interpolate(float (*lon)[100], float (*lat)[100], float (*input)[20], float (*output)[100]) {
    int i, j, floori, floorj;
    float fractioni, fractionj;
    for (j = 0; j < 100; j++) {
        for (i = 0; i < 100; i++) {
            /* integer part selects the input cell, fractional part is the weight */
            floori = (int)lon[i][j];
            fractioni = lon[i][j] - floori;
            floorj = (int)lat[i][j];
            fractionj = lat[i][j] - floorj;
            output[i][j] = (1.0f - fractioni) * (1.0f - fractionj) * input[floori][floorj]
                         + fractioni * (1.0f - fractionj) * input[floori + 1][floorj]
                         + (1.0f - fractioni) * fractionj * input[floori][floorj + 1]
                         + fractioni * fractionj * input[floori + 1][floorj + 1];
        }
    }
}

I need to divide the work so that all interpolation points whose lon and lat values fall within the block bounded by input[floori][floorj], input[floori+1][floorj], input[floori][floorj+1] and input[floori+1][floorj+1] go to the same thread, so that each thread reads these input values from memory into registers only once.
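
To make the intended work division concrete, here is a minimal sketch of one possible approach. It is an illustration under stated assumptions, not a tuned solution. It first groups the output points by the input cell they fall in (a counting sort), then lets OpenMP hand out whole cells to threads, so the four corner values are loaded once per cell. Whether they actually stay in registers is up to the compiler. The names interpolate_binned, cell_of, NIN, NOUT, NCELL, start, fill and pts are hypothetical, and the sketch assumes every lon and lat value lies in [0, 19) so that floori+1 and floorj+1 stay inside input.

```c
#include <assert.h>
#include <stdlib.h>

#define NIN   20          /* input is NIN x NIN, as described in the post */
#define NOUT  100         /* lon, lat and output are NOUT x NOUT */
#define NCELL (NIN - 1)   /* cells per axis; a cell is a 2x2 block of input values */

/* Hypothetical helper: index of the input cell that contains output point p.
   Assumes 0 <= lon, lat < NIN - 1 (checked with assert). */
static int cell_of(float (*lon)[NOUT], float (*lat)[NOUT], int p)
{
    int i = p / NOUT, j = p % NOUT;
    int ci = (int)lon[i][j], cj = (int)lat[i][j];
    assert(ci >= 0 && ci < NCELL && cj >= 0 && cj < NCELL);
    return ci * NCELL + cj;
}

/* Hypothetical variant of interpolate(): bins points by cell, then
   processes each cell entirely within one thread. */
void interpolate_binned(float (*lon)[NOUT], float (*lat)[NOUT],
                        float (*input)[NIN], float (*output)[NOUT])
{
    const int ncells = NCELL * NCELL, npts = NOUT * NOUT;
    int *start = calloc((size_t)ncells + 1, sizeof *start); /* cell -> first slot in pts */
    int *fill  = malloc((size_t)ncells * sizeof *fill);     /* next free slot per cell */
    int *pts   = malloc((size_t)npts * sizeof *pts);        /* point ids grouped by cell */
    assert(start && fill && pts);

    /* Pass 1: count the points in each cell, then prefix-sum into offsets. */
    for (int p = 0; p < npts; p++)
        start[cell_of(lon, lat, p) + 1]++;
    for (int c = 0; c < ncells; c++)
        start[c + 1] += start[c];

    /* Pass 2: scatter the point ids into their cell's range. */
    for (int c = 0; c < ncells; c++)
        fill[c] = start[c];
    for (int p = 0; p < npts; p++)
        pts[fill[cell_of(lon, lat, p)]++] = p;

    /* Pass 3: one cell per loop iteration, so one thread owns all its points.
       dynamic scheduling because the number of points per cell can vary. */
    #pragma omp parallel for schedule(dynamic)
    for (int c = 0; c < ncells; c++) {
        if (start[c] == start[c + 1])
            continue;                       /* no points in this cell */
        int ci = c / NCELL, cj = c % NCELL;
        /* Load the four corners once for the whole cell. */
        float v00 = input[ci][cj],     v10 = input[ci + 1][cj];
        float v01 = input[ci][cj + 1], v11 = input[ci + 1][cj + 1];
        for (int k = start[c]; k < start[c + 1]; k++) {
            int i = pts[k] / NOUT, j = pts[k] % NOUT;
            float fi = lon[i][j] - (float)ci;   /* fractional part along lon */
            float fj = lat[i][j] - (float)cj;   /* fractional part along lat */
            output[i][j] = (1.0f - fi) * (1.0f - fj) * v00 + fi * (1.0f - fj) * v10
                         + (1.0f - fi) * fj * v01        + fi * fj * v11;
        }
    }

    free(start);
    free(fill);
    free(pts);
}
```

It would be called exactly like interpolate(), e.g. interpolate_binned(lon, lat, input, output), and compiled with the compiler's OpenMP flag (e.g. -fopenmp for GCC). Each output element is written by exactly one thread, so no synchronisation is needed in the last pass. The two binning passes are serial here and add their own memory traffic, so whether this beats a plain parallel loop over i and j would have to be measured.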

pritish.naik.004 April 10, 2015

