Numba Architecture


This document serves two purposes: to introduce other developers to the high-level design of Numba’s internals, and as a point for discussion and synchronization for current Numba developers.

Core Entry Points

Numba has several modes of use:

  1. As a run-time translator of Python functions into low-level functions.
  2. As a call-time specializer of Python functions into low-level functions.
  3. As a run-time builder of extension types.
  4. As a compile-time translator of Python modules into shared object libraries.

The following subsections describe the primary entry points for these modes of use. Each usage mode corresponds to a specific set of definitions provided in the top-level numba module.

Run-time Translation

Users denote run-time translation of a function using the numba.jit() decorator.

Call-time Specialization

Users denote call-time specialization of a function using the numba.autojit() decorator.

Extension Types

Numba supports building extension types using the numba.jit() decorator on a class.

Compile-time Translation

Users denote compile-time translation of a function using the numba.export() and numba.exportmany() decorators.

Translation Internals

Towards More Modular Pipelines

The end goal of building a more modular pipeline is to decouple stages of compilation and make a more modular way of composing transformations.

  • State threaded through the pipeline

    1) AST - Abstract syntax tree, possibly mutated as a side-effect of a pass.

    2) Structured Environment - A dict like object which holds the intermediate forms and data produced as a result of data.

  • Composition of Stages
    • Sequencing
    • Composition Operator
    • Error handling and reporting in pass failure.
  • Pre/Post Condition Checking

    • Stages should have attached pre / post conditions to check the success criterion of the pass for the inputted or resulting ast and environment. Failure to meet this conditions should cause the pipeline to halt.


Note: recursive definitions

jit     := parse o link o jit
pycc    := parse o emit o link
autojit := cache o autojit
cache   := pipeline o jit

blaze   := mapast o jit


Block diagram:
|          pass 1      |
       context     ast
         |          |
  postcondition     |
         |          |
  precondition      |
         |          |
|          pass 2      |
       context     ast
         |          |
  postcondition     |
         |          |
  precondition      |
         |          |
|          pass 3      |
       context     ast
         |          |
  precondition      |
         |          |
         +----------+-----> Output

Discussion: Pipeline Composition

We can do composition in a functional way:

def compose_stages(stage1, stage2):
  def composition(ast, env):
    return stage2(stage1(ast, env), env)
  return composition

pipeline = compose_stages(...compose_stages(parse, ...), ...)

Or, we can do composition using iteration:

for stage in stages:
  ast = stage(ast, env)

Whether the end result is a function or a class is also still up for discussion.

Proposal 1: We replace the Pipeline class to use a list of stages, but these can either be functions or subclasses of the PipelineStage class.

Discussion: Pipeline Environments

Proposal 1: We present an ad hoc environment. This provides the most flexibility for developers to patch the environment as they see fit.

Proposal 2: We present a well defined environment class. The class will have well defined properties that are documented and type-checked when the internal stage checking flag is set.