numba.cuda.to_device(obj, stream=0, copy=True, to=None)¶
Allocate and transfer a numpy ndarray or structured scalar to the device.
To copy a numpy array host->device:
import numpy as np
from numba import cuda
ary = np.arange(10)
d_ary = cuda.to_device(ary)
To enqueue the transfer to a stream:
stream = cuda.stream()
d_ary = cuda.to_device(ary, stream=stream)
The resulting d_ary is a DeviceNDArray.
To copy device->host:
hary = d_ary.copy_to_host()
To copy device->host to an existing array:
ary = np.empty(shape=d_ary.shape, dtype=d_ary.dtype)
d_ary.copy_to_host(ary)
To enqueue the transfer to a stream:
hary = d_ary.copy_to_host(stream=stream)
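Enqueued transfers return before the copy completes, so synchronize the stream before reading the host array. A minimal sketch, reusing the stream created above:
stream.synchronize()   # block until the enqueued copy has finished
print(hary[:5])        # now safe to read on the host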
numba.cuda.device_array(shape, dtype=np.float64, strides=None, order='C', stream=0)¶
Allocate an empty device ndarray. Similar to numpy.empty().
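For example, an output buffer for a kernel can be allocated directly on the device, skipping an initial host-to-device copy. A minimal sketch (the contents are uninitialized, as with numpy.empty()):
import numpy as np
from numba import cuda
d_out = cuda.device_array((16, 16), dtype=np.float64)  # uninitialized device buffer
print(d_out.shape, d_out.dtype)                        # (16, 16) float64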
numba.cuda.device_array_like(ary, stream=0)¶
Call device_array() with information from the array.
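A minimal sketch: allocate an uninitialized device array matching an existing array's shape, dtype and layout:
import numpy as np
from numba import cuda
ary = np.zeros((4, 8), dtype=np.float32)
d_ary = cuda.device_array_like(ary)   # same shape, dtype and order; uninitialized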
numba.cuda.pinned_array(shape, dtype=np.float64, strides=None, order='C')¶
Allocate an np.ndarray with a buffer that is pinned (page-locked). Similar to np.empty().
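Pinned (page-locked) host memory is required for transfers to be genuinely asynchronous. A minimal sketch:
import numpy as np
from numba import cuda
h_buf = cuda.pinned_array(1000, dtype=np.float64)  # page-locked host buffer
h_buf[:] = np.arange(1000)
stream = cuda.stream()
d_buf = cuda.to_device(h_buf, stream=stream)       # asynchronous copy from pinned memory
stream.synchronize()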
numba.cuda.mapped_array(shape, dtype=np.float64, strides=None, order='C', stream=0, portable=False, wc=False)¶
Allocate a mapped ndarray with a buffer that is pinned and mapped on to the device. Similar to np.empty().
Parameters:
portable – a boolean flag to allow the allocated device memory to be usable in multiple devices.
wc – a boolean flag to enable writecombined allocation which is faster to write by the host and to read by the device, but slower to write by the host and slower to write by the device.
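A minimal sketch: the returned array behaves as a normal host ndarray while also being addressable from the device:
import numpy as np
from numba import cuda
m_ary = cuda.mapped_array(1000, dtype=np.float64)  # pinned and mapped onto the device
m_ary[:] = 0.0                                     # writable from the host as usual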
numba.cuda.pinned(*arylist)¶
A context manager for temporarily pinning a sequence of host ndarrays.
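A minimal sketch: pin an existing host array for the duration of an asynchronous transfer:
import numpy as np
from numba import cuda
ary = np.arange(10)
stream = cuda.stream()
with cuda.pinned(ary):                 # ary is page-locked inside this block
    d_ary = cuda.to_device(ary, stream=stream)
    stream.synchronize()               # finish the copy before unpinning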
numba.cuda.mapped(*arylist, **kws)¶
A context manager for temporarily mapping a sequence of host ndarrays.
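A minimal sketch of mapping a host array so a kernel can update it in place (increment_kernel is a hypothetical example kernel, not part of the API):
import numpy as np
from numba import cuda

@cuda.jit
def increment_kernel(a):               # hypothetical kernel for illustration
    i = cuda.grid(1)
    if i < a.size:
        a[i] += 1

ary = np.zeros(16)
with cuda.mapped(ary) as m_ary:        # m_ary is a device view of the same memory
    increment_kernel[1, 16](m_ary)
    cuda.synchronize()                 # ensure the kernel finished before reading ary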
numba.cuda.cudadrv.devicearray.DeviceNDArray(shape, strides, dtype, stream=0, writeback=None, gpu_data=None)¶
An on-GPU array type.
copy_to_device(ary, stream=0)¶
Copy ary to self.
If ary is CUDA memory, perform a device-to-device transfer. Otherwise, perform a host-to-device transfer.
copy_to_host(ary=None, stream=0)¶
Copy self to ary or create a new Numpy ndarray if ary is None.
If a CUDA stream is given, then the transfer will be made asynchronously as part of the given stream. Otherwise, the transfer is synchronous: the function returns after the copy is finished.
Always returns the host array.
Example:
import numpy as np
from numba import cuda
arr = np.arange(1000)
d_arr = cuda.to_device(arr)
my_kernel[100, 100](d_arr)  # my_kernel: a previously compiled @cuda.jit kernel
result_array = d_arr.copy_to_host()
is_c_contiguous()¶
Return true if the array is C-contiguous.
is_f_contiguous()¶
Return true if the array is Fortran-contiguous.
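A minimal sketch of checking the layout of a transferred array:
import numpy as np
from numba import cuda
d_arr = cuda.to_device(np.zeros((3, 4)))   # C-ordered host array
print(d_arr.is_c_contiguous())             # True
print(d_arr.is_f_contiguous())             # False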
ravel(order='C', stream=0)¶
Flatten the array without changing its contents, similar to numpy.ndarray.ravel().
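A minimal sketch: flatten a device array into a 1D device array:
import numpy as np
from numba import cuda
d_arr = cuda.to_device(np.zeros((4, 5)))
d_flat = d_arr.ravel()                     # 1D device array of 20 elements
print(d_flat.shape)                        # (20,)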
reshape(*newshape, **kws)¶
Reshape the array without changing its contents, similarly to numpy.ndarray.reshape(). Example:
d_arr = d_arr.reshape(20, 50, order='F')
split(section, stream=0)¶
Split the array into equal partitions of the section size. If the array cannot be equally divided, the last section will be smaller.
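A minimal sketch: iterate over fixed-size sections of a device array:
import numpy as np
from numba import cuda
d_arr = cuda.to_device(np.arange(10))
for section in d_arr.split(4):             # sections of size 4
    print(section.shape)                   # (4,), (4,), then (2,)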
numba.cuda.cudadrv.devicearray.DeviceRecord(dtype, stream=0, gpu_data=None)¶
An on-GPU record type.
copy_to_device(ary, stream=0)¶
Copy ary to self.
If ary is CUDA memory, perform a device-to-device transfer. Otherwise, perform a host-to-device transfer.
copy_to_host(ary=None, stream=0)¶
Copy self to ary or create a new Numpy ndarray if ary is None.
If a CUDA stream is given, then the transfer will be made asynchronously as part of the given stream. Otherwise, the transfer is synchronous: the function returns after the copy is finished.
Always returns the host array.
Example:
import numpy as np
from numba import cuda
arr = np.arange(1000)
d_arr = cuda.to_device(arr)
my_kernel[100, 100](d_arr)  # my_kernel: a previously compiled @cuda.jit kernel
result_array = d_arr.copy_to_host()
numba.cuda.cudadrv.devicearray.MappedNDArray(shape, strides, dtype, stream=0, writeback=None, gpu_data=None)¶
A host array that uses CUDA mapped memory.
copy_to_device(ary, stream=0)¶
Copy ary to self.
If ary is CUDA memory, perform a device-to-device transfer. Otherwise, perform a host-to-device transfer.
copy_to_host(ary=None, stream=0)¶
Copy self to ary or create a new Numpy ndarray if ary is None.
If a CUDA stream is given, then the transfer will be made asynchronously as part of the given stream. Otherwise, the transfer is synchronous: the function returns after the copy is finished.
Always returns the host array.
Example:
import numpy as np
from numba import cuda
arr = np.arange(1000)
d_arr = cuda.to_device(arr)
my_kernel[100, 100](d_arr)  # my_kernel: a previously compiled @cuda.jit kernel
result_array = d_arr.copy_to_host()
split(section, stream=0)¶
Split the array into equal partitions of the section size. If the array cannot be equally divided, the last section will be smaller.