3.6. Supported Atomic Operations
Numba provides access to some of the atomic operations supported in CUDA, in the numba.cuda.atomic class.
Those that are presently implemented are as follows:

class numba.cuda.atomic
    Namespace for atomic operations.

class add(ary, idx, val)
    Perform atomic ary[idx] += val. Supported on int32, float32, and float64 operands only.
    Returns the old value at the index location as if it is loaded atomically.
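
Because the old value is returned, an atomic add can also hand out unique slots in an output buffer, not just accumulate a sum. The following is a minimal host-side sketch in plain Python of that return-value behaviour, not Numba code: `fetch_add` is a hypothetical stand-in for the atomic operation, and each loop iteration stands in for one GPU thread.

```python
def fetch_add(ary, idx, val):
    """Hypothetical sequential model: ary[idx] += val, returning the old value."""
    old = ary[idx]
    ary[idx] = old + val
    return old


values = [3, -1, 7, -4, 9]
counter = [0]                  # next free slot in `out`
out = [None] * len(values)

for v in values:               # each iteration stands in for one GPU thread
    if v > 0:
        slot = fetch_add(counter, 0, 1)  # old counter value is this thread's slot
        out[slot] = v

kept = out[:counter[0]]        # the positive values, one per reserved slot
```

On a real device the threads run concurrently, so the order in which slots are handed out is not deterministic; only the uniqueness of each slot is guaranteed by the atomic update.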

class compare_and_swap(ary, old, val)
    Conditionally assign val to the first element of a 1D array ary if the current value matches old.
    Returns the current value as if it is loaded atomically.
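
The description above can be read as the following sequential model. This is a sketch in plain Python of the stated semantics only, not the Numba implementation: `cas_model` is a hypothetical name, and the model ignores concurrency entirely.

```python
def cas_model(ary, old, val):
    """Hypothetical model: set ary[0] = val only if ary[0] == old.

    Always returns the value that was in ary[0] before the call.
    """
    current = ary[0]
    if current == old:
        ary[0] = val
    return current


flag = [0]
first = cas_model(flag, 0, 1)   # matches: flag[0] is replaced by 1
second = cas_model(flag, 0, 2)  # no match: flag[0] is left unchanged
```

A caller can tell whether its swap took effect by comparing the returned value with the old value it passed in: here only the first call succeeds.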

class max(ary, idx, val)
    Perform atomic ary[idx] = max(ary[idx], val). NaN is treated as a missing value, so max(NaN, n) == max(n, NaN) == n. Note that this differs from Python and NumPy behaviour, where max(a, b) is always a when either a or b is a NaN.
    Supported on int32, int64, uint32, uint64, float32, float64 operands only.
    Returns the old value at the index location as if it is loaded atomically.
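
To make the NaN rule concrete, here is a small plain-Python sketch that contrasts the documented "NaN is a missing value" behaviour with Python's built-in max. The helper `atomic_max_model` is a hypothetical sequential model written from the description above, not Numba's implementation.

```python
import math


def atomic_max_model(ary, idx, val):
    """Hypothetical model of the documented rule: NaN counts as missing."""
    old = ary[idx]
    if math.isnan(old):
        ary[idx] = val                       # max(NaN, n) == n
    elif not math.isnan(val) and val > old:
        ary[idx] = val                       # ordinary maximum
    # otherwise val is NaN or not larger: keep old, so max(n, NaN) == n
    return old


nan = float("nan")

cell = [nan]
atomic_max_model(cell, 0, 2.0)    # NaN in the array is replaced by 2.0
atomic_max_model(cell, 0, nan)    # NaN operand is ignored, cell stays 2.0

builtin_nan_first = max(nan, 2.0)  # built-in max keeps its first argument
builtin_nan_last = max(2.0, nan)   # here the first argument is 2.0
```

With the built-in, the result depends on argument order; under the rule modelled here, the non-NaN value wins either way.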

class min(ary, idx, val)
    Perform atomic ary[idx] = min(ary[idx], val). NaN is treated as a missing value, so min(NaN, n) == min(n, NaN) == n. Note that this differs from Python and NumPy behaviour, where min(a, b) is always a when either a or b is a NaN.
    Supported on int32, int64, uint32, uint64, float32, float64 operands only.

3.6.1. Example
The following code demonstrates the use of numba.cuda.atomic.max to find the maximum value in an array. This is not the most efficient way of finding a maximum, since every thread updates the same location, but it serves as an example:

from numba import cuda
import numpy as np


@cuda.jit
def max_example(result, values):
    """Find the maximum value in values and store in result[0]"""
    tid = cuda.threadIdx.x
    bid = cuda.blockIdx.x
    bdim = cuda.blockDim.x
    i = (bid * bdim) + tid
    cuda.atomic.max(result, 0, values[i])


arr = np.random.rand(16384)
result = np.zeros(1, dtype=np.float64)

# 256 blocks of 64 threads: one thread per element of arr
max_example[256, 64](result, arr)
print(result[0])  # Found using cuda.atomic.max
print(max(arr))   # Print max(arr) for comparison (should be equal!)

Multidimensional arrays are supported by using a tuple of ints for the index:

@cuda.jit
def max_example_3d(result, values):
    """
    Find the maximum value in values and store in result[0, 1, 2].
    Both result and values are 3d arrays.
    """
    i, j, k = cuda.grid(3)
    # Atomically store to result[0, 1, 2] from values[i, j, k]
    cuda.atomic.max(result, (0, 1, 2), values[i, j, k])


arr = np.random.rand(1000).reshape(10, 10, 10)
result = np.zeros((3, 3, 3), dtype=np.float64)

# A (2, 2, 2) grid of (5, 5, 5) blocks: one thread per element of arr
max_example_3d[(2, 2, 2), (5, 5, 5)](result, arr)
print(result[0, 1, 2], '==', np.max(arr))