7.2.1. NBEP 2: Extension points

Author: Antoine Pitrou
Date: July 2015
Status: Draft

Implementing new types or functions in Numba requires hooking into various mechanisms along the compilation chain (and potentially outside of it). This document first examines the current ways of doing so, then proposes changes to make extending Numba easier.

If some of the proposals are implemented, we should first strive to use and exercise them internally, before exposing the APIs to the public.

Note

This document doesn’t cover CUDA or any other non-CPU backend.

7.2.1.1. High-level API

There is currently no high-level API for quick implementation of an existing function or type.

7.2.1.1.1. Proposed changes

It would be nice for people to be able to implement a function in a single go, as if they were writing a @jit function. As an example, let’s assume we want to make numpy.where() usable from nopython mode. We would like to be able to define several implementations and select between them at compile-time depending on the input types.

The following example showcases a hypothetical API where we can register a function taking the argument types and returning a callable implementing the actual function for those types. The API should also be able to handle optional arguments, and the resulting implementation should support calling with named parameters.

import numpy as np

from numba import types
from numba.extending import overlay

@overlay(np.where)
def where(cond, x, y):
    """
    Implement np.where().
    """
    # Choose implementation based on argument types.
    if isinstance(cond, types.Array):
        # Array where() => return an array of the same shape
        if all(ty.layout == 'C' for ty in (cond, x, y)):
            def where_impl(cond, x, y):
                """
                Fast implementation for C-contiguous arrays
                """
                shape = cond.shape
                if x.shape != shape or y.shape != shape:
                    raise ValueError("all inputs should have the same shape")
                res = np.empty_like(x)
                cf = cond.flat
                xf = x.flat
                yf = y.flat
                rf = res.flat
                for i in range(cond.size):
                    rf[i] = xf[i] if cf[i] else yf[i]
                return res
        else:
            def where_impl(cond, x, y):
                """
                Generic implementation for other arrays
                """
                shape = cond.shape
                if x.shape != shape or y.shape != shape:
                    raise ValueError("all inputs should have the same shape")
                res = np.empty_like(x)
                for idx, c in np.ndenumerate(cond):
                    res[idx] = x[idx] if c else y[idx]
                return res

    else:
        def where_impl(cond, x, y):
            """
            Scalar where() => return a 0-dim array
            """
            scal = x if cond else y
            return np.full_like(scal, scal)

    return where_impl
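
The selection mechanism used in this example can be mimicked in plain Python. What follows is only a toy sketch, not Numba's API: overlay, resolve, ArrayType, ScalarType and select are hypothetical names introduced for illustration, and no compilation takes place. It shows a generator function being called once with the argument types and returning the implementation to use for those types.

```python
# Toy model of the proposed mechanism; every name here is hypothetical.
from collections import namedtuple

ArrayType = namedtuple("ArrayType", ["dtype", "layout"])
ScalarType = namedtuple("ScalarType", ["dtype"])

_overlays = {}  # maps the overlaid function to its implementation generator


def overlay(func):
    """Register a generator that picks an implementation from argument types."""
    def decorator(generator):
        _overlays[func] = generator
        return generator
    return decorator


def resolve(func, *argtypes):
    """Return the implementation chosen for the given argument types."""
    return _overlays[func](*argtypes)


def select(cond, x, y):
    """Stand-in for the function being overlaid."""
    return x if cond else y


@overlay(select)
def select_overlay(cond, x, y):
    # This body runs at "compile time": cond, x and y are types, not values.
    if isinstance(cond, ArrayType):
        def impl(cond, x, y):
            return [a if c else b for c, a, b in zip(cond, x, y)]
    else:
        def impl(cond, x, y):
            return x if cond else y
    return impl


array_impl = resolve(select, ArrayType("bool", "C"),
                     ArrayType("int64", "C"), ArrayType("int64", "C"))
scalar_impl = resolve(select, ScalarType("bool"),
                      ScalarType("int64"), ScalarType("int64"))
```

In this sketch the choice between implementations is made once per combination of argument types, so the returned function contains no type checks of its own.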

7.2.1.2. Typing

7.2.1.2.1. Numba types

Numba’s standard types are declared in numba.types. To declare a new type, one subclasses the base Type class or one of its existing abstract subclasses, and implements the required functionality.

7.2.1.2.1.1. Proposed changes

No change required.

7.2.1.2.2. Type inference on values

Values of a new type need to be type-inferred if they can appear as function arguments or constants. The core machinery is in numba.typing.typeof.

In the common case where some Python class or classes map exclusively to the new type, one can extend a generic function to dispatch on said classes, e.g.:

from numba.typing.typeof import typeof_impl

@typeof_impl.register(MyClass)
def _typeof_myclass(val, c):
   if "some condition":
      return MyType(...)

The typeof_impl specialization must return a Numba type instance, or None if the value failed typing.

In the rarer case where the new type can denote various Python classes that are impossible to enumerate, one must insert a manual check in the fallback implementation of the typeof_impl generic function.
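
To make the dispatch behaviour concrete, here is a small self-contained sketch that uses functools.singledispatch from the standard library as a stand-in for the generic function. The names typeof_sketch, Interval and IntervalType are hypothetical, and the context argument c is passed through without being modelled.

```python
from functools import singledispatch


class Interval:
    """Hypothetical user class that should map to a dedicated type."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi


class IntervalType:
    """Hypothetical stand-in for a Numba type instance."""


@singledispatch
def typeof_sketch(val, c):
    # Fallback: reached for classes without a registered specialization.
    # Returning None signals that the value could not be typed.
    return None


@typeof_sketch.register(Interval)
def _typeof_interval(val, c):
    # A specialization may still reject individual values.
    if isinstance(val.lo, float) and isinstance(val.hi, float):
        return IntervalType()
    return None
```

A type that cannot be tied to a fixed set of classes has no class to register against in this scheme, which is why the check would have to live in the fallback body.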

7.2.1.2.2.1. Proposed changes

Allow people to define a generic hook without monkeypatching the fallback implementation.

7.2.1.2.3. Fast path for type inference on function arguments

Optionally, one may want to allow a new type to participate in the fast type resolution (written in C code) to minimize function call overhead when a JIT-compiled function is called with the new type. One must then insert the required checks and implementation in the _typeof.c file, presumably inside the compute_fingerprint() function.
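
The idea behind such a fast path can be sketched in Python, under the assumption that it works by reducing each argument to a cheap, hashable fingerprint and caching the type resolved for that fingerprint. All names below (fingerprint, slow_typeof, fast_typeof) are hypothetical; the actual logic lives in C and is not reproduced here.

```python
# Illustrative only: a Python model of a fingerprint-based type cache.
_type_cache = {}  # fingerprint -> previously resolved type
slow_calls = []   # records each fall-through to the slow path (for the demo)


def fingerprint(val):
    """Cheap, hashable summary; equal fingerprints must imply equal types."""
    if isinstance(val, tuple):
        return ("tuple",) + tuple(fingerprint(v) for v in val)
    return (type(val).__name__,)


def slow_typeof(val):
    """Stand-in for the full, slower Python-level type inference."""
    slow_calls.append(val)
    return ("resolved",) + fingerprint(val)


def fast_typeof(val):
    """Resolve the type through the cache, inferring only on a miss."""
    key = fingerprint(val)
    if key not in _type_cache:
        _type_cache[key] = slow_typeof(val)
    return _type_cache[key]
```

Supporting a new type in this model means teaching fingerprint() about it, which mirrors the need to edit the C function by hand.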

7.2.1.2.3.1. Proposed changes

None. Adding generic hooks to C code embedded in a C Python extension is too delicate a change.

7.2.1.2.4. Type inference on operations

Values resulting from various operations (function calls, operators, etc.) are typed using a set of helpers called “templates”. One can define a new template by subclassing one of the existing base classes and implementing the desired inference mechanism. The template is explicitly registered with the type inference machinery using a decorator.

The ConcreteTemplate base class allows one to define inference as a set of supported signatures for a given operation. The following example types the modulo operator:

@builtin
class BinOpMod(ConcreteTemplate):
    key = "%"
    cases = [signature(op, op, op)
             for op in sorted(types.signed_domain)]
    cases += [signature(op, op, op)
              for op in sorted(types.unsigned_domain)]
    cases += [signature(op, op, op) for op in sorted(types.real_domain)]

(Note that the signatures use type instances, which severely limits how much genericity can be expressed.)
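
The following toy model illustrates what "a set of supported signatures" amounts to. It is a simplification under stated assumptions: plain strings stand in for type instances, matching is exact (implicit conversions are ignored), and ConcreteTemplateSketch, BinOpModSketch and this signature() helper are hypothetical stand-ins rather than Numba's classes.

```python
from collections import namedtuple

Signature = namedtuple("Signature", ["return_type", "args"])


def signature(return_type, *args):
    """Build a signature record from a return type and argument types."""
    return Signature(return_type, args)


class ConcreteTemplateSketch:
    """Toy template: inference is a lookup in a fixed list of signatures."""
    key = None
    cases = []

    def apply(self, args):
        for case in self.cases:
            if case.args == tuple(args):
                return case
        return None  # no matching signature: the operation fails typing


class BinOpModSketch(ConcreteTemplateSketch):
    key = "%"
    # Plain strings stand in for Numba type instances.
    cases = [signature(ty, ty, ty) for ty in ("int32", "int64", "float64")]


template = BinOpModSketch()
```

Because each case names concrete types, a parametric family such as "tuples of any element type" cannot be listed this way.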

The AbstractTemplate base class allows one to define inference programmatically, giving it full flexibility. Here is a simplistic example of how tuple indexing (i.e. the __getitem__ operator) can be expressed:

@builtin
class GetItemUniTuple(AbstractTemplate):
    key = "getitem"

    def generic(self, args, kws):
        tup, idx = args
        if isinstance(tup, types.UniTuple) and isinstance(idx, types.Integer):
            return signature(tup.dtype, tup, idx)
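
The same rule can be exercised outside Numba with miniature type objects. This is a sketch only: UniTuple, Integer, Float and Signature below are hypothetical namedtuples standing in for the real classes, and getitem_generic plays the role of the generic() method.

```python
from collections import namedtuple

# Hypothetical miniature type objects standing in for numba.types.
UniTuple = namedtuple("UniTuple", ["dtype", "count"])
Integer = namedtuple("Integer", ["bitwidth"])
Float = namedtuple("Float", ["bitwidth"])
Signature = namedtuple("Signature", ["return_type", "args"])


def getitem_generic(args, kws):
    """Programmatic inference: accept any UniTuple indexed by any Integer."""
    if kws or len(args) != 2:
        return None
    tup, idx = args
    if isinstance(tup, UniTuple) and isinstance(idx, Integer):
        # The result type is computed from the argument type itself.
        return Signature(tup.dtype, (tup, idx))
    return None


sig = getitem_generic((UniTuple(Float(64), 3), Integer(64)), {})
```

A single function thus covers every element type and tuple length, since the return type is derived from the argument rather than enumerated.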

The AttributeTemplate base class allows one to type the attributes and methods of a given type. Here is an example, typing the .real and .imag attributes of complex numbers:

@builtin_attr
class ComplexAttribute(AttributeTemplate):
    key = types.Complex

    def resolve_real(self, ty):
        return ty.underlying_float

    def resolve_imag(self, ty):
        return ty.underlying_float

Note

AttributeTemplate only works for getting attributes. Setting an attribute’s value is hardcoded in numba.typeinfer.

The CallableTemplate base class offers an easier way to parse flexible function signatures, by letting one define a callable that has the same definition as the function being typed. For example, here is how one could hypothetically type Python’s sorted function if Numba supported lists:

@builtin
class Sorted(CallableTemplate):
    key = sorted

    def generic(self):
        def typer(iterable, key=None, reverse=None):
            if reverse is not None and not isinstance(reverse, types.Boolean):
                return
            if key is not None and not isinstance(key, types.Callable):
                return
            if not isinstance(iterable, types.Iterable):
                return
            return types.List(iterable.iterator_type.yield_type)

        return typer

(note you can return just the function’s return type instead of the full signature)
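
One way to picture how such a typer handles positional and named arguments is to bind the argument types to its parameters with inspect.signature from the standard library. This is a sketch under that assumption, not a description of Numba's internals; apply_typer and sorted_typer are hypothetical, and strings stand in for types.

```python
import inspect


def apply_typer(typer, args, kws):
    """Bind positional and keyword argument types to the typer's parameters."""
    try:
        bound = inspect.signature(typer).bind(*args, **kws)
    except TypeError:
        return None  # wrong arity or unknown keyword: typing fails
    return typer(*bound.args, **bound.kwargs)


def sorted_typer(iterable, key=None, reverse=None):
    # Strings stand in for Numba types in this toy example.
    if reverse is not None and reverse != "boolean":
        return None
    if not iterable.startswith("list("):
        return None
    return iterable  # sorting a list yields a list of the same type
```

Since the typer has the same parameters as the function being typed, defaults and keyword names need no separate declaration.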

7.2.1.2.4.1. Proposed changes

If we expose some of this, should we streamline the API first? The class-based API can feel clumsy; one could instead imagine a functional API for some of the template kinds:

@type_callable(sorted)
def type_sorted(context):
    def typer(iterable, key=None, reverse=None):
        # [same function as above]
        ...

    return typer

7.2.1.3. Code generation

7.2.1.3.1. Concrete representation of values of a Numba type

Any concrete Numba type must be able to be represented in LLVM form (for variable storage, argument passing, etc.). One defines that representation by implementing a datamodel class and registering it with a decorator. Datamodel classes for standard types are defined in numba.datamodel.models.

7.2.1.3.1.1. Proposed changes

No change required.

7.2.1.3.2. Conversion between types

Implicit conversion between Numba types is currently implemented as a monolithic sequence of choices and type checks in the BaseContext.cast() method. To add a new implicit conversion, one appends a type-specific check in that method.

Boolean evaluation is a special case of implicit conversion (the destination type being types.Boolean).

Note

Explicit conversion is seen as a regular operation, e.g. a constructor call.

7.2.1.3.2.1. Proposed changes

Implicit conversion could use some kind of generic function, with multiple dispatch based on the source and destination types.
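
A minimal sketch of what such multiple dispatch could look like follows. Everything in it is hypothetical (register_cast, find_cast and the small type hierarchy), and the conversions operate on Python values, whereas real implementations would emit LLVM code.

```python
_casts = {}  # (source type class, destination type class) -> implementation


def register_cast(fromcls, tocls):
    """Register a conversion for a pair of type classes."""
    def decorator(func):
        _casts[fromcls, tocls] = func
        return func
    return decorator


def find_cast(fromty, toty):
    """Walk both class hierarchies, most specific source class first."""
    for fromcls in type(fromty).__mro__:
        for tocls in type(toty).__mro__:
            impl = _casts.get((fromcls, tocls))
            if impl is not None:
                return impl
    raise NotImplementedError("no cast from %r to %r" % (fromty, toty))


# Hypothetical miniature type hierarchy.
class Type: pass
class Number(Type): pass
class Integer(Number): pass
class Float(Number): pass
class Boolean(Type): pass


@register_cast(Number, Boolean)
def number_to_bool(fromty, toty, val):
    # Boolean evaluation expressed as just another registered cast.
    return val != 0


@register_cast(Integer, Float)
def int_to_float(fromty, toty, val):
    return float(val)
```

With a registry of this kind, a new type adds its conversions by registering functions instead of editing one central method.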

7.2.1.3.3. Implementation of an operation

Other operations are implemented and registered using a set of generic functions and decorators. For example, here is how lookup for the .ndim attribute on NumPy arrays is implemented:

@builtin_attr
@impl_attribute(types.Kind(types.Array), "ndim", types.intp)
def array_ndim(context, builder, typ, value):
    return context.get_constant(types.intp, typ.ndim)

And here is how calling len() on a tuple value is implemented:

@builtin
@implement(types.len_type, types.Kind(types.BaseTuple))
def tuple_len(context, builder, sig, args):
    tupty, = sig.args
    retty = sig.return_type
    return context.get_constant(retty, len(tupty.types))

7.2.1.3.3.1. Proposed changes

No changes required. Perhaps review and streamline the API (drop the requirement to write types.Kind(...) explicitly?).

7.2.1.3.4. Conversion from / to Python objects

Some types need to be converted from or to Python objects, if they can be passed as function arguments or returned from a function. The corresponding boxing and unboxing operations are implemented using a generic function. The implementations for standard Numba types are in numba.targets.boxing. For example, here is the boxing implementation for a boolean value:

@box(types.Boolean)
def box_bool(c, typ, val):
    longval = c.builder.zext(val, c.pyapi.long)
    return c.pyapi.bool_from_long(longval)

7.2.1.3.4.1. Proposed changes

Perhaps change the implementation signature slightly, from (c, typ, val) to (typ, val, c), to match the one chosen for the typeof_impl generic function.