nathaliekaligirwa
BAN USEREducation
M.Sc. Computer Science. University of Oklahoma [3.83/4] Dec. 2014
B.S. Computer Science. Oklahoma Christian University [3.84/4] Apr. 2011
Technical Skills
Languages: C++, C#, NVIDIA Cuda, Java, Python, PHP/MySQL, HTML/Javascript/CSS
IDE: Eclipse, Visual Studio 20032010, and Xcode
Publications
“Parallel QuadTree Encoding of LargeScale Raster Geospatial Data on Multicore CPUs and GPGPUs” by Nathalie Kaligirwa,Eleazar Leal, Le Gruenwald, Jianting Zhang and Simin You.
Relevant Courses
Artificial Intelligence, Algorithm Analysis, Data Networks, Datastructures, and Advanced Database Management
Professional Experience
Research AssistantOU Computer Science [Aug. 2013 – Current]
• Developing parallel algorithms with CUDA to process largescale geospatial data.
• Study and compare stateofthe art image processing and compression techniques.
• Currently writing a research paper presenting the results of using different techniques for compressing largescale geospatial images.
ProgrammerNextThought, LLC [Mar. 2012Dec. 2012]
• Created test suites for the web interface of NextThought’s social learning platform using Python, Java, and Selenium tools.
• Identified and decided testing strategies for newly implemented or unstable features while taking into account time and resources constraints.
• Communicated closely with software developers to solve application failures.
Red Earth Systems Programmer [June 2011Dec.2011]
• Incorporated new reports in four web billing applications using SQL servers, SQL Reporting Services, and linked servers capabilities.
• Migrated sensitive billing data for Chesapeake Energy to ensure smooth transitioning to a new billing application.
Rwandan Girls Empowerment Volunteer [Dec. 2010Current]
• Cofounded this organization to help young girls in Rwanda, specifically by connecting them to American sponsors who provide school fees.
• Expanded the number of recipients by 50% within a year.
• Organized staff meetings, maintained the organization’s website, and assisted with fundraising.
Affiliations/Memberships
• Member of the ACM for Women
• Member of Alpha Chi Honors’ Society
Oh I see, I assumed a pure diagonal solution, sorry about that. I will look into this solution. However, one way I can think of is to put a condition such as: don't let the yaxis increment go over 4, and once it is over 4, the xaxis increment should not change unless we are at the end in which case it decrements and don't let the number of items per row go over 12.
 nathaliekaligirwa October 26, 2014Language: C++
Strategy: use string append of cstrings to copy chunks of original string. Use modulo to figure out the number of digits before and after '.' notation. Use division to figure out which unit to use (B,K,M, or G).
Assumptions[room for improvement]:
 any character can be part of the string, not just numbers.
 no space in input string.
 each unit occupies only one byte.
#include <cstdlib>
#include <iostream>
#include <cmath>
#define SIZE 4
#define UPPERLIMIT 10 //we can only process up to 10 digits
#define NDIGITS 3 //number of digits that are readable, in our case only 3
char * readable (char * string);
int main (int argc, char ** argv){
char input []= "1000000000B"; //"341B" //"12345B" /
printf ("%s\n", readable(input));
}
char * readable (char * string) {
char order [] = "BKMG"; //the order
char * newString = new char (SIZE);
int size = strlen(string)1;
if (string == NULL) {
printf ("string is empty");
exit (0);
}
if (size > UPPERLIMIT) {
printf ("number is too high for processing\n");
exit(0);
}
if (size <= NDIGITS) {
//return the original string
strcpy (newString, string);
}
else {
//the division will give us meaasurement: B, K, M, or G
//the modulo will give us how many numbers before the .
//then add numbers after . to fill limit of length = 3
strncpy (newString, string, size%NDIGITS);
strcat (newString, ".");
//start reading from where we left off, and add as many characters as needed
//to have a max of NDIGITS (3 in our case)
strncat (newString, string+(size%NDIGITS), NDIGITS(size%NDIGITS));
//append, B, K, M, or G
newString[strlen(newString)] = order[size/NDIGITS];
}
return newString;
}

nathaliekaligirwa
October 22, 2014  Language: C++
 Strategy:
assumptions: our array items follow an arithmetic progression with offset = 1.
Sequence #1:
We notice that each displayed line is a sequence of this type ("labeled xaxis sequence"):
s is the first term on the line
k is a value that will increment from line to line
s, s+k+1,s+k+1+2, s+k+1+2+3, s+k+1+2+3+4, .....up to s+k+1+2+...+m
such that s+ k+1+2+3+4...+m is the last term <= n (n is the last term in the original array)
Now that we can display one line, how do we increment value s and value k from line to line?
We notice that the first term for each new line is part of a sequence of this type (labeled "y axis sequence"):
s is the starting element of the array
s, s+2, s+2+3, s+2+3+4, s+2+3+4...+m
where s+2+3+4...+m is last term <= n (n is the last item of the array).
Also k, is incremented from line to line.
So after noticing these two sequences.
now it is a matter of incrementing the starting number of each line using "xaxis sequence" until we hit a value that is > n; in which case you start a new line with a new starting element calculated using the "columwise sequence" above and incrementing k.
Example with dataset [0...8]
starting s = 1;
starting k = 0;
first line
s, s+k+1, s+k+1+2, s+k+1+2+3 => 1, 1+0+1, 1+0+1+2, 1+0+1+2+3 => 1, 2, 4, 7
next element in sequence above would be > 8, go to next line and set new s = s+2 = 3, k=1
s, s+k+1, s+k+1+2 => 3, 3+1+1, 3+1+1+2 => 3, 5, 8
the next element in sequence above = 8, display it and set new s = (s+2)+3 = 6, k = 2
s => 6
next element in the sequence above is 6+2+1 which is >8, so calculate new s= (s+2+3)+4
and the new s turns out to be > 8, which indicates that any element on this line after this value is > 8.
The code below assumes an array with consecutive integers, but it can easily be used on any type of array if instead of manipulating values, we manipulate their indexes.
Code:
#include <cstdlib>
#include <iostream>
#define N 45 //8 //45
using namespace std;
void diagonal(int start, int end);
int main (int argc, char ** argv) {
//generate data
int size = N;
//call function to display in diagonal
diagonal(1, size);
}
void diagonal (int beginning, int end){
int number = beginning;
int start = number;
int xIncrement = 1;
int previousXIncrement = 1;
int yIncrement = 2;
while (number <= end) {
printf ("%u ", number);
if ((number + xIncrement) > end) {
//go to next line
//reset some parameters
printf ("\n");
//the starting number of the current row is the new start
//the increment along the yaxis is updated
start = start + yIncrement;
number = start;
yIncrement++;
//now we increment starting from the previous original increment but + 1
//the increment along xaxis is updated too
xIncrement = previousXIncrement + 1;
previousXIncrement++ ;
}
else {
number = number + xIncrement;
xIncrement++;
}
}
}

nathaliekaligirwa
October 22, 2014 Language: C++
Overview: Extract a token, append to new string, append space, and move on to next token until the end of the original string
Shortcomings: More memory is used because, firstly, we are copying to a new string instead of the original string. Secondly, we are keeping a temp local copy of the string in the 'reverse' function to avoid a modification of the original string by strtok.
#include <iostream>
#include <cstring>
#include <cstdlib>
void reverse (char * string, char * newstring) {
char * oldstring = new char (strlen(string));
strcpy (oldstring, string);
char * token = strtok (oldstring, "%20");
while (token !=NULL) {
strcat(newstring, token);
strcat(newstring," ");
token = strtok(NULL, "%20");
}
}
int main (int argc, char * * argv) {
char string [] = "This%20is%20a20%string";
int size = strlen (string);
//printf ("size:%u\n", size);
char * newstring = new char (size);
reverse (string, newstring);
printf ("Original string: %s\n", string);
printf ("New string: %s\n", newstring);
}
}
 nathaliekaligirwa October 17, 2014Language: CUDA/C++
Device specs:
 Tesla C2050
 448 CUDA cores
 49152 bytes of shared memory per block
 3072 MB of global memory
 1024 max threads per block
This implementation is a twopass reduction kernel as explained in the CUDA Handbook by Jonathan Wilt, there is room for improvement.
Basically, each block performs a summation on the subarray it is allocated. Then, the sums from all the blocks are added by calling the same kernel again.
For an allocation with blocks of 16 threads, these are the reported times:
64 x1 array:0.055840 ms.
256x1 array:0.05616 ms
512x1 array:0.056480 ms
1024x1 array:0.056704 ms
4096x1 array:0.076640 ms
8192x1 array:0.098848 ms.
For an allocation with blocks of 256 threads, these are the reported times:
64x1 array:0.135904 ms
256x1 array:0.064192 ms
512x1 array:0.064064 ms
1024x1 array:0.064512 ms
4096x1 array:0.064928 ms
8192x1 array:0.067456 ms
CUDA Code:
#include <cuda.h>
#include <iostream>
#include <stdlib.h>
#include <cstdio>
#include <sys/time.h>
#include <omp.h>
#define BLOCK_SIZE 256
#define ARRAY_SIZE 4096
//to handle API errors
static void HandleError( cudaError_t err,
const char *file,
int line )
{
if (err != cudaSuccess) {
printf( "%s in %s at line %d\n", cudaGetErrorString( err ),
file, line );
exit( EXIT_FAILURE );
}
}
#define HANDLE_ERROR( err ) (HandleError( err, __FILE__, __LINE__ ))
//for timing the time to perform the summation
cudaEvent_t event_gpu_start,event_gpu_end;
float gpu (int * numbers_d, int * blocksSums_d, int * answers_d, int size, int numBlocks, int blockSize);
__global__ void reduction_1 (int * numbers, int * blockSums, int blocks);
int main (int argc, char ** argv){
//array to reduce
//CPU data
int * numbers = new int [ARRAY_SIZE];
int * answer = new int [1];
//arrays to hold GPU data
int * answer_d;
int * numbers_d; //copy of data on device
int * blockSums_d; //holds blocks sums using during the second pass
//allocate the array to be summed. assign ints from 0 to size of array
for (int i=0; i < ARRAY_SIZE; i++) {
numbers[i] = i;
}
//allocate gpu memory space and copy the array to be summed
HANDLE_ERROR(cudaMalloc((void**)&answer_d,sizeof(int)));
HANDLE_ERROR(cudaMalloc((void **)&numbers_d,ARRAY_SIZE*sizeof(int)));
HANDLE_ERROR(cudaMemcpy(numbers_d, numbers, ARRAY_SIZE*sizeof(int),cudaMemcpyHostToDevice));
HANDLE_ERROR(cudaMalloc((void **)&blockSums_d,(ARRAY_SIZE/BLOCK_SIZE)*sizeof(int)));
//display message
printf ("[INFO]:data allocation done\n");
printf ("[INFO]:reduction started\n");
//call function to initialize and launch threads
float gpu_time = gpu (numbers_d, blockSums_d, answer_d, ARRAY_SIZE, ARRAY_SIZE/BLOCK_SIZE,BLOCK_SIZE);
printf ("[INFO]:reduction done\n");
printf ("[INFO]:time to reduce array %ux%u of integers:%3.6f ms.\n[INFO]:with %u blocks of %u threads\n"
,ARRAY_SIZE,1,gpu_time,ARRAY_SIZE/BLOCK_SIZE,BLOCK_SIZE);
HANDLE_ERROR(cudaMemcpy(answer,answer_d, sizeof(int), cudaMemcpyDeviceToHost));
printf ("[INFO]:the sum is %u\n", answer[0]);
HANDLE_ERROR(cudaFree(numbers_d));
HANDLE_ERROR(cudaFree(blockSums_d));
HANDLE_ERROR(cudaFree(answer_d));
}
float gpu (int * numbers_d, int * blockSums_d, int * answer_d, int size, int numBlocks, int blockSize) {
float gpu_time = 0.00;
//start timing the running time of the kernels
HANDLE_ERROR(cudaEventCreate(&event_gpu_start));
HANDLE_ERROR(cudaEventRecord(event_gpu_start));
unsigned int sharedMemorySize = blockSize*sizeof(int);
//first pass: calculate sum of subarrays
reduction_1 <<<numBlocks,blockSize,sharedMemorySize>>>(numbers_d,blockSums_d, size);
cudaDeviceSynchronize();
//second pass: calculate sum of all the sums of subarrays
reduction_1<<<1,blockSize,sharedMemorySize>>>(blockSums_d,answer_d,numBlocks);
HANDLE_ERROR(cudaEventCreate(&event_gpu_end));
HANDLE_ERROR(cudaEventRecord(event_gpu_end));
//calculate the time elapsed
cudaEventSynchronize(event_gpu_end);
cudaEventElapsedTime(&gpu_time, event_gpu_start, event_gpu_end);
return gpu_time;
}
__global__ void reduction_1 (int * numbers, int * blockSums, int size) {
int sum = 0;
extern __shared__ int interim[]; //dynamic shared memory
//for each thread add together whatever it reads
for (int i = blockIdx.x*blockDim.x + threadIdx.x; i < size; i+=blockDim.x*gridDim.x) {
sum+=numbers[i];
}
//store results in a shared array within a block
interim[threadIdx.x]= sum;
//make sure all threads reach this point
__syncthreads();
//now we have reduce the number of active threads per block
//for an array size = 64 and blocksize = 16;
//the first phase we reassign the same value in sum
//for the second pass
//we will use half of the threads to add together partial sums
//so we start with threads = 16/2=8 that add together x and x+8
//then 8/2 = 4 which adds x and x+4
//then 4/2 = 2 which adds x and x+2
//then 2/2 = 1 which adds x and x+1
for (int threads = blockDim.x>>1; threads; threads>>=1) {
if (threadIdx.x < threads) {
interim[threadIdx.x] +=interim[threads+threadIdx.x];
}
__syncthreads();
}
//the first thread of each block writes the sum of the subarray in an output array
if (threadIdx.x == 0) {
blockSums[blockIdx.x] = interim[threadIdx.x];
}
}

nathaliekaligirwa
October 14, 2014 Language: C++
Method:
 Get the length of the cstring
 Based on the length, we can figure out the base to start with.
For example '123', gives a size of 3, therefore the starting base will be pow(10,3);
 In the forloop, we decrease base as we go down the char array
 Multiply each (ascii code  ascii code ('0')) with the base;
 Add the result to the variable temp (that keeps incrementing as we add values);
#include <iostream>
#include <stdlib.h>
#include <string.h>
#include <cmath>
#define SYMBOL 48
int atoi (char * numbers);
int main (int argc, char ** argv) {
//initialize the string
char numbers[] = "0123456";
//atoi(numbers);
printf ("%d \n",atoi(numbers));
}
int atoi (char * numbers) {
printf ("size of string:%lu\n", strlen(numbers));
int temp = 0;
int base = pow(10,strlen(numbers)1);
//printable characters from 0 to 9 have an ascii code from 48 to 57
//just substracting 48 from the ascii code should give the correct value
//to keep things simple, we check if the ascii value is within the range we are handling
for (int i=0; i < strlen(numbers);i++) {
if (numbers[i] < 48  numbers[i] >57) {
printf ("the string contains illegal symbols, limit characters to range 0 to 9\n");
return 0;
}
temp += ((int)(numbers[i])SYMBOL)*base;
printf ("temp: %u\n", temp);
base = base/10;
}
return temp;
}

nathaliekaligirwa
October 14, 2014 First attempt with original BST initialization and a display function to check accuracy after mirorring.
The code does not create a new BST yet.
#include <stdlib.h>
#include <iostream>
#include <stdio.h>
struct Node;
Node * mirrorTree(Node * root);
//*****Utils Functions declarations****
Node * newNode (int value); //create a new node
int max (int left_hand, int right_hand); //return the max value between two ints
void testDisplayTree (Node * root); //display the whole tree
void testDisplayNode (Node * node); //display a node and its two children
//**************************************
struct Node {
int value;
Node * left;
Node * right;
};
int main (int argc, char ** argv) {
//create the binary tree
/*
Tree
1
/ \
2 3
/\ /\
4 5 6 7
\
9
*/
Node * root = new Node;
root>value = 1;
root>left = newNode(2);
root>right = newNode(3);
root>left>left = newNode(4);
root>left>right = newNode(5);
root>left>left>right = newNode(9);
root>right>left = newNode(6);
root>right>right = newNode(7);
//test the layout of tree
testDisplayTree (root);
mirrorTree(root);
printf ("After Mirror\n");
//check if values where swapped
testDisplayTree (root);
}
Node * mirrorTree(Node * root) {
if (root == NULL) {
return NULL;
}
if (root>left == NULL && root>right == NULL) {
return root;
}
Node * left = mirrorTree (root>left);
Node * right = mirrorTree (root>right);
root>left = right;
root>right = left;
return root;
}
/********Utils functions *******/
//add a new node
Node * newNode (int value) {
Node * node = new Node;
node>value = value;
node>left = NULL;
node>right = NULL;
return node;
}
//return the greatest int
int max (int left_hand, int right_hand) {
if (left_hand >= right_hand)
return left_hand;
else
return right_hand;
}
//display a node and its children
void testDisplayNode(Node * node) {
printf ("node with value: %u, left child value: %d, right child value:%d\n",
node>value,
node>left !=NULL?node>left>value:1,
node>right!=NULL?node>right>value:1);
}
//display the whole tree
void testDisplayTree (Node * root) {
if (root == NULL) {
return;
}
testDisplayNode(root);
testDisplayTree (root>left);
testDisplayTree (root>right);
}

nathaliekaligirwa
October 14, 2014 This code calculates the diameter of a binary tree using recursion.
Language: C++
Node structure :<int, Node* left, Node * right>
I wrote this code after reading this very good explanation of the diameter of a tree here:geeksforgeeks.org/diameterofabinarytree/
#include <stdlib.h>
#include <iostream>
#include <stdio.h>
struct Node {
int value;
Node * left;
Node * right;
};
int maxPath (Node * root, int &longestPath);
//*****Utils Functions declarations****
Node * newNode (int value); //create a new node
int max (int left_hand, int right_hand); //return the max value between two ints
void testDisplayTree (Node * root); //display the whole tree
void testDisplayNode (Node * node); //display a node and its two children
//**************************************
int main (int argc, char ** argv) {
//create the binary tree
/*
Tree
1
/ \
2 3
/\ /\
4 5 6 7
\
9
*/
Node * root = new Node;
root>value = 1;
root>left = newNode(2);
root>right = newNode(3);
root>left>left = newNode(4);
root>left>right = newNode(5);
//comment out the next three lines to get the exact same tree as what you proposed above
root>left>left>right = newNode(9);
root>right>left = newNode(6);
root>right>right = newNode(7);
//test the layout of tree
//testDisplayTree (root);
//in the beginning the longest path is = 0
int longestPath = 0;
int result=maxPath (root,longestPath);
printf ("diameter:%u\n", longestPath);
}
int maxPath (Node * root, int &longestPath) {
//if we reach a leaf value, return
if (root == NULL) {
//printf ("tree is empty\n");
return 0;
}
if (root>left == NULL && root>right == NULL) {
return 0;
}
//else start calculating the different paths
int leftFurthest = 0; //value with the longest path size in left subtree
int rightFurthest = 0; //value with the longest path size in right subtree
leftFurthest = 1+maxPath (root>left, longestPath);
rightFurthest = 1+maxPath (root>right, longestPath);
//after both left and right subtrees return for the current root
//find out if we have a new longest path, if not just keep the value we have so far
//which is passed by reference (there might be a better way to do this but for now this will do)
//how do we choose a new longest path to compare with the one we have so far?
//>path to the furthest leaf node in the left subtree + path to the furthest leaf node in the right subtree
longestPath = max (longestPath, leftFurthest + rightFurthest +1);
//return the longer path from the two furthest points in subtree
return max(leftFurthest, rightFurthest) ;
}
/********Utils functions *******/
//add a new node
Node * newNode (int value) {
Node * node = new Node;
node>value = value;
node>left = NULL;
node>right = NULL;
return node;
}
//return the greatest int
int max (int left_hand, int right_hand) {
if (left_hand >= right_hand)
return left_hand;
else
return right_hand;
}
//display a node and its children
void testDisplayNode(Node * node) {
printf ("node with value: %u, left child value: %d, right child value:%d\n",
node>value,
node>left !=NULL?node>left>value:1,
node>right!=NULL?node>right>value:1);
}
//display the whole tree
void testDisplayTree (Node * root) {
if (root == NULL) {
return;
}
testDisplayNode(root);
testDisplayTree (root>left);
testDisplayTree (root>right);
}

nathaliekaligirwa
October 14, 2014 Language: C++
Implementation:Iterative
Linked List Node: <int value, Node * next>
#include <string.h>
#include <sstream>
#include <stdlib.h>
#include <cstdlib>
struct Node {
Node * next;
int value;
};
void test (Node * root);
void reverse(Node * root);
int main (int argc, char ** argv) {
//new linked list
Node * root = new Node;
root>next = NULL;
Node * current = new Node;
Node * node;
//step 1:
//set the root of the lis
current>value = 0;
current>next = NULL;
//set the beginning of the list
root>next = current;
//step 2: fill the array
for (int i=1; i<5; i++) {
node = new Node;
node>value = i;
node>next = NULL;
current>next = node;
current = current>next;
}
//display as test
test(root);
//reverse linked list
//Step 3: Reverse the list
reverse (root);
test(root);
}
//Function template @
//geeksforgeeks.org/writeafunctiontoreversethenodesofalinkedlist/
void reverse (Node * root ) {
Node * current; //node that we need to update
Node * next; //node that 'current' points to before reverting
Node * previous; //node that 'current' needs to point to after reverting
//get first element
current = root>next;
//make sure this first element is set up to point to NULL (by making 'previous' = NULL) since it will be the last item of the array
previous = NULL;
while (current !=NULL) {
next = current>next;
current>next = previous;
previous = current;
current = next;
}
//make sure the root is pointing to the last element
//and not 'current' because at this point current = NULL
root>next = previous;
}
void test (Node * root) {
Node * node;
node = root>next;
while (node !=NULL) {
printf ("%u \n", node>value);
node = node >next;
}
}

nathaliekaligirwa
October 14, 2014 Quick implementation with C++ using cstrings on a machine that does not support itoa.
Use strtok twice: first to divide the input with delimiter "," and then with delimiter "..".
{
#include <string.h>
#include <sstream>
#include <stdlib.h>
#include <cstdlib>
using namespace std;
void unroll (char * string, char * newstring);
int main(int argc, char ** argv) {
char string [] = "1..5,8,11..14,18,20,26..29";
char * newstring = new char [strlen(string)*100];
printf ("%s \n", string);
unroll (string, newstring);
printf ("%s \n", newstring);
}
void unroll (char *string, char * newstring) {
//edge cases
if (string == NULL) {
printf ("string is empty");
return;
}
if (strlen(string) == 1) {
newstring[0] = string[0];
return;
}
//divide into tokens and store them in array chunks
char * token;
char * temptoken;
char * nbr1 = new char [5];
char * nbr2 = new char [5];
int count = 0;
char * pos = new char [5];
token = strtok (string, ",");
char * * chunks = new char * [20];
while (token != NULL) {
chunks [count] = new char [strlen(token)];
strcpy (chunks[count], token);
token = strtok (NULL, ",");
count++;
}
//read each item in chunk and extract the numbers
int items = 0;
char tempnbr [3];
for (int i=0; i < count; i++) {
nbr1 = strtok(chunks[i],"..");
nbr2 = strtok(NULL, "..");
strcat (newstring,nbr1);
if (nbr2 != NULL) {
for (int j = atoi(nbr1)+1; j <=atoi(nbr2);j++) {
strcat (newstring, ",");
sprintf (tempnbr, "%d", j);
strcat (newstring, tempnbr);
}
}
//to make sure the semicolon doesn't appear at the end
if (i != count1){
strcat (newstring, ",");
}
}
}}
 nathaliekaligirwa October 14, 2014Open Chat in New Window
OK, after much thought the easiest answer would be to have an unrestricted number of columns (same solution as PavPS proposed).
 nathaliekaligirwa October 27, 2014If the number of columns is restricted:
Approach #1:
 if the number of columns > max number of columns and if the value that was supposed to go on that column is < last element of array, pass down the value to the next column.
That means if we are on the last column or the next value > last element in array, we have to check if there is a value that was passed down. If there is, display it and check again if you need to pass down another value.
#Approach two:
Store values in arrays, each one representing a row and then display them one by one (space complexity is terrible though).
The implementation might be trickier for the first approach but this is the best I could come up with.