3.6. Supported Atomic Operations
Numba provides access to some of the atomic operations supported in CUDA, in the
numba.cuda.atomic class.
Those that are presently implemented are as follows:
class numba.cuda.atomic
    Namespace for atomic operations.

    class add(ary, idx, val)
        Perform atomic ary[idx] += val. Supported on int32, float32, and float64 operands only.

        Returns the old value at the index location as if it is loaded atomically.

    class atomic.compare_and_swap(ary, old, val)
        Conditionally assign val to the first element of a 1D array ary if the current value matches old.

        Returns the current value as if it is loaded atomically.

    class atomic.max(ary, idx, val)
        Perform atomic ary[idx] = max(ary[idx], val). NaN is treated as a missing value, so max(NaN, n) == max(n, NaN) == n. Note that this differs from Python and NumPy behaviour, where max(a, b) is always a when either a or b is a NaN.

        Supported on int32, int64, uint32, uint64, float32, float64 operands only.

        Returns the old value at the index location as if it is loaded atomically.

    class atomic.min(ary, idx, val)
        Perform atomic ary[idx] = min(ary[idx], val). NaN is treated as a missing value, so min(NaN, n) == min(n, NaN) == n. Note that this differs from Python and NumPy behaviour, where min(a, b) is always a when either a or b is a NaN.

        Supported on int32, int64, uint32, uint64, float32, float64 operands only.
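The NaN rule for atomic.max can be illustrated on the host without a GPU. The sketch below is only a pure-Python reference for the documented semantics, not Numba's implementation; the helper name max_nan_as_missing is hypothetical and introduced here for illustration. It is compared against Python's built-in max, which returns its first argument whenever either argument is NaN.

```python
import math


def max_nan_as_missing(a, b):
    """Hypothetical host-side reference: treat NaN as a missing value."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


nan = float("nan")

# NaN-as-missing rule: the non-NaN operand wins regardless of position
print(max_nan_as_missing(nan, 2.0))
print(max_nan_as_missing(2.0, nan))

# Built-in max: every comparison with NaN is False, so the first argument is kept
print(max(nan, 2.0))
print(max(2.0, nan))
```

The same reasoning applies to atomic.min with the comparison reversed.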
3.6.1. Example
The following code demonstrates the use of numba.cuda.atomic.max to
find the maximum value in an array. This is not the most efficient way
to find a maximum, but it serves as an example:
from numba import cuda
import numpy as np
@cuda.jit
def max_example(result, values):
    """Find the maximum value in values and store in result[0]"""
    tid = cuda.threadIdx.x
    bid = cuda.blockIdx.x
    bdim = cuda.blockDim.x
    i = (bid * bdim) + tid
    cuda.atomic.max(result, 0, values[i])
arr = np.random.rand(16384)
result = np.zeros(1, dtype=np.float64)
max_example[256,64](result, arr)
print(result[0]) # Found using cuda.atomic.max
print(max(arr))  # Print max(arr) for comparison (should be equal!)
Multidimensional arrays are supported by using a tuple of ints for the index:
@cuda.jit
def max_example_3d(result, values):
    """
    Find the maximum value in values and store it in result[0, 1, 2].
    Both result and values are 3d arrays.
    """
    i, j, k = cuda.grid(3)
    # Atomically store to result[0,1,2] from values[i, j, k]
    cuda.atomic.max(result, (0, 1, 2), values[i, j, k])
arr = np.random.rand(1000).reshape(10,10,10)
result = np.zeros((3, 3, 3), dtype=np.float64)
max_example_3d[(2, 2, 2), (5, 5, 5)](result, arr)
print(result[0, 1, 2], '==', np.max(arr))