NVIDIA Interview Question for Software Engineer / Developers






Comment hidden because of low score. Click to expand.
1
of 1 vote

Compute Unified Device Architecture is an easy way to use GPU for General Purpose Programming. No graphics knowledge is required to use CUDA for doing a program using GPU. A CUDA program is almost same as a C program but have some additional features. In CUDA a function can be run in many threads by giving a execution configuration while calling a function. There are 3 kinds of functions in CUDA. A device function which can be only executed in the device and called from device, a global function which can be called from host(CPU) using some configuration and gets executed in device, and a pure host function which must be executed in the CPU only.

There are some additional specifiers to distinguish the function type.

__device__ - if a function is suffixed with __device__ it becomes a device function which can only be executed at device and which can only be called from a function that executes on device.
__host__ - These functions are the normal C functions that can be executed on the host(CPU)
__global__ - A function suffixed with __global__ can be called from CPU. But for calling this function the execution configuration must be mentioned. The execution configuration decides how many threads and blocks have to be made for executing this function.

Also there are different kinds of memory,

__shared__ - if this is prefixed that memory becomes a shared memory and it can be shared across threads. This is the fastest memory.
__constant__ - a memory to which we can write from host only.
__device__ - a memory to which we can write from both device and host.


Each thread will have a thread ID and block ID to know which area of the data need to be processed by this thread. This is the tricky area where all the performance improvement lies.

- sumit saxena March 08, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

I think by "device function", it refers to CUDA kernel. :|

- Altruist November 25, 2009 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Aren't read & write device functions ?

- Anonymous January 20, 2010 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

read(), write(), lseek() are device functions only. As the Char Devices are registers to VFS.

- prasad September 05, 2010 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More