3.1. Overview¶

Numba supports CUDA GPU programming by directly compiling a restricted subset of Python code into CUDA kernels and device functions following the CUDA execution model. Kernels written in Numba appear to have direct access to NumPy arrays. NumPy arrays are transferred between the CPU and the GPU automatically.

3.1.1. Terminology¶

Several important terms in the topic of CUDA programming are listed here:

host: the CPU
device: the GPU
host memory: the system main memory
device memory: onboard memory on a GPU card
kernels: a GPU function launched by the host and executed on the device
device function: a GPU function executed on the device which can only be called from the device (i.e. from a kernel or another device function)

3.1.2. Programming model¶

Most CUDA programming facilities exposed by Numba map directly to the CUDA C language offered by NVidia. Therefore, it is recommended you read the official CUDA C programming guide.

3.1.3. Requirements¶

3.1.3.1. Supported GPUs¶

Numba supports CUDA-enabled GPU with compute capability 2.0 or above with an up-to-data Nvidia driver.

3.1.3.2. Software¶

You will need the CUDA toolkit installed. If you are using Conda, just type:

$ conda install cudatoolkit

3.1.4. Missing CUDA Features¶

Numba does not implement all features of CUDA, yet. Some missing features are listed below:

dynamic parallelism
texture memory