Type System

Implements a simple, dynamic type system for API generation.

author: Anthony Scopatz <scopatz@gmail.com>

Introduction

This module provides a suite of tools for denoting, describing, and converting between data types, including the types that come from different systems. This is achieved by providing canonical abstractions of the following kinds of types:

  • Base types (int, str, float, non-templated classes)
  • Refined types (even or odd ints, strings containing the letter ‘a’)
  • Dependent types (templates such as arrays, maps, sets, vectors)

All types are known by their name (a string identifier) and may be aliased with other names. However, the string id of a type is not sufficient to fully describe most types. The system here implements a canonical form for all kinds of types. This canonical form is itself hashable, being comprised only of strings, ints, and tuples.

Canonical Forms

First, let us examine the base types and the forms that they may take. Base types are fiducial. The type system itself may not make any changes (refinements, template filling) to types of this kind. They are basically a collection of bits. (The job of ascribing meaning to these bits falls on someone else.) Thus base types may be referred to simply by their string identifier. For example:

'str'
'int32'
'float64'
'MyClass'

Aliases to these – or any – type names are given in the type_aliases dictionary:

type_aliases = {
    'i': 'int32',
    'i4': 'int32',
    'int': 'int32',
    'complex': 'complex128',
    'b': 'bool',
    }
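
As a minimal sketch of how such a lookup could behave, the snippet below resolves a name through the example table above. The helper name resolve_alias is hypothetical and is not part of this module; names that are not aliases are assumed to pass through unchanged.

```python
# Illustrative alias table, copied from the example above.
type_aliases = {
    'i': 'int32',
    'i4': 'int32',
    'int': 'int32',
    'complex': 'complex128',
    'b': 'bool',
}


def resolve_alias(name, aliases=type_aliases):
    """Hypothetical helper: return the alias target, or the name itself."""
    return aliases.get(name, name)


print(resolve_alias('i4'))       # int32
print(resolve_alias('MyClass'))  # MyClass (not an alias, returned unchanged)
```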

Furthermore, length-2 tuples are used to denote a type and the name or flag of its predicate. A predicate is a function or transformation that may be applied to verify, validate, cast, coerce, or extend a variable of the given type. A common usage is to declare a pointer or reference of the underlying type. This is done with the string flags ‘*’ and ‘&’:

('char', '*')
('float64', '&')

If the predicate is a positive integer, then this is interpreted as a homogeneous array of the underlying type with the given length. If this length is zero, then the tuple is often interpreted as a scalar of this type, equivalent to the type itself. The length-0 scalar interpretation depends on context. Here are some examples of array types:

('char', 42)  # length-42 character array
('bool', 1)   # length-1 boolean array
('f8', 0)     # scalar 64-bit float

Note

Length-1 tuples are converted to length-2 tuples with a 0 predicate, i.e. ('char',) will become ('char', 0).
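
The predicate rules so far can be summarized in a short sketch. The describe function below is a hypothetical illustration, not part of this module; it only restates the interpretations given above (pointer, reference, array, scalar) and treats a 0 predicate as a scalar, which the text notes is context dependent.

```python
def describe(t):
    """Hypothetical helper: describe a (type, predicate) tuple in words."""
    if len(t) == 1:
        t = (t[0], 0)  # length-1 tuples get the default 0 predicate
    name, pred = t
    if pred == '*':
        return 'pointer to ' + name
    if pred == '&':
        return 'reference to ' + name
    if isinstance(pred, int) and pred > 0:
        return 'length-{0} array of {1}'.format(pred, name)
    if pred == 0:
        return 'scalar ' + name
    # Any other predicate (e.g. a refinement name) is reported as-is.
    return '{0} with predicate {1!r}'.format(name, pred)


print(describe(('char', 42)))      # length-42 array of char
print(describe(('float64', '&')))  # reference to float64
print(describe(('char',)))         # scalar char
```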

The next kind of type is the refinement type, or refined type. A refined type is a sub-type of another type that restricts in some way what constitutes a valid element. For example, if we first take all integers, the set of all positive integers is a refinement of the original. Similarly, starting with all possible strings, the set of all strings starting with ‘A’ is a refinement.

In the system here, refined types are given their own unique names (e.g. ‘posint’ and ‘astr’). The type system has a mapping (refined_types) from all refinement type names to the names of their super-types. The user may refer to refinement types simply by their string name. However, the canonical form of a refinement type uses the refinement as the predicate of the super-type in a length-2 tuple, as above:

('int32', 'posint')  # refinement of integers to positive ints
('str', 'astr')      # refinement of strings to str starting with 'A'

It is these refinement types that give the second index in the tuple its ‘predicate’ name. Additionally, the predicate is used to look up the converter and validation functions when doing code generation or type verification.
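
To make the role of the predicate concrete, here is a small sketch under stated assumptions. The validators table and the helpers canon_refinement and is_valid are hypothetical and exist only for this illustration; the module's actual converter and validation machinery is not shown in this document.

```python
# Illustrative mapping from refinement names to their super-types.
refined_types = {'posint': 'int32', 'astr': 'str'}

# Hypothetical validation functions, keyed by predicate name.
validators = {
    'posint': lambda x: x > 0,
    'astr': lambda s: s.startswith('A'),
}


def canon_refinement(name):
    """Hypothetical helper: return (super-type, refinement) for a refined name."""
    return (refined_types[name], name)


def is_valid(value, t):
    """Look up the validator by the predicate (second element) and apply it."""
    _, predicate = t
    return validators[predicate](value)


t = canon_refinement('posint')
print(t)                # ('int32', 'posint')
print(is_valid(3, t))   # True
print(is_valid(-1, t))  # False
```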

The last kind of type is the dependent type or template type, similar in concept to C++ template classes. These are meta-types whose instantiation requires one or more parameters to be filled in by further values or types. Dependent types may nest with themselves or other dependent types. Fully qualifying a template type requires the resolution of all dependencies.

Classic examples of dependent types include the C++ template classes. These take other types as their dependencies. Other cases may require only values as their dependencies. For example, suppose we want to restrict integers to various ranges. Rather than creating a refinement type for every combination of integer bounds, we can use a single ‘intrange’ type that defines high and low dependencies.

The template_types mapping takes the dependent type names (e.g. ‘map’) to a tuple of their dependency names (‘key’, ‘value’). The refined_types mapping also accepts keys that are tuples of the following form:

('<type name>', '<dep0-name>', ('dep1-name', 'dep1-type'), ...)

Note that template names may be reused as types of other template parameters:

('name', 'dep0-name', ('dep1-name', 'dep0-name'))

As we have seen, dependent types may either be base types (when based off of template classes), refined types, or both. Their canonical form thus follows the rules above with some additional syntax. The first element of the tuple is still the type name and the last element is still the predicate (default 0). However the type tuples now have a length equal to 2 plus the number of dependencies. These dependencies are placed between the name and the predicate: ('<name>', <dep0>, ..., <predicate>). These dependencies, of course, may be other type names or tuples! Let’s see some examples.

In the simplest case, take analogies to C++ template classes:

('set', 'complex128', 0)
('map', 'int32', 'float64', 0)
('map', ('int32', 'posint'), 'float64', 0)
('map', ('int32', 'posint'), ('set', 'complex128', 0), 0)
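
The layout rule (name first, dependencies in the middle, predicate last, for a total length of 2 plus the number of dependencies) can be checked mechanically. The split_template helper below is a hypothetical sketch, and the template_types table is a reduced copy of the one listed in the API section.

```python
# Reduced, illustrative copy of the template_types mapping.
template_types = {'set': ('value_type',), 'map': ('key_type', 'value_type')}


def split_template(t):
    """Hypothetical helper: split a canonical template tuple into its parts."""
    name = t[0]
    ndeps = len(template_types[name])
    if len(t) != 2 + ndeps:
        raise ValueError('{0!r} needs {1} dependencies and a predicate'
                         .format(name, ndeps))
    # Dependencies sit between the name and the predicate.
    return name, t[1:-1], t[-1]


name, deps, pred = split_template(
    ('map', ('int32', 'posint'), ('set', 'complex128', 0), 0))
print(name)  # map
print(deps)  # (('int32', 'posint'), ('set', 'complex128', 0))
print(pred)  # 0
```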

Now consider the intrange type from above. This has the following definition and canonical form:

refined_types = {('intrange', ('low', 'int32'), ('high', 'int32')): 'int32'}

# range from 1 -> 2
('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))

# range from -42 -> 42
('int32', ('intrange', ('low', 'int32', -42), ('high', 'int32', 42)))

Note that the low and high dependencies here are length three tuples of the form ('<dep-name>', <dep-type>, <dep-value>). How the dependency values end up being used is solely at the discretion of the implementation. These values may be anything, though they are most useful when they are easily convertible into strings in the target language.

Warning

Do not confuse length-3 dependency tuples with length-3 type tuples! The last element here is a value, not a predicate.
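
As a sketch of how dependency values could be paired with the declared dependency names and types, consider the hypothetical helper below. It is not the module's implementation and only handles definitions such as ‘intrange’, whose dependencies are all (name, type) pairs.

```python
# Definition from the text: a name plus (dep-name, dep-type) pairs,
# mapped to the super-type.
refined_types = {('intrange', ('low', 'int32'), ('high', 'int32')): 'int32'}


def expand_dependent_refinement(t):
    """Hypothetical helper: expand e.g. ('intrange', 1, 2) to canonical form."""
    name, values = t[0], t[1:]
    for key, supertype in refined_types.items():
        if isinstance(key, tuple) and key[0] == name:
            deps = key[1:]
            if len(deps) != len(values):
                raise TypeError('expected {0} dependency values'.format(len(deps)))
            # Build the length-3 dependency tuples: (dep-name, dep-type, dep-value).
            filled = tuple((dname, dtype, value)
                           for (dname, dtype), value in zip(deps, values))
            return (supertype, (name,) + filled)
    raise KeyError(name)


print(expand_dependent_refinement(('intrange', 1, 2)))
# ('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
```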

Next, consider a ‘range’ type which behaves similarly to ‘intrange’ except that it also accepts the type as dependency. This has the following definition and canonical form:

refined_types = {('range', 'vtype', ('low', 'vtype'), ('high', 'vtype')): 'vtype'}

# integer range from 1 -> 2
('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))    

# positive integer range from 42 -> 65
(('int32', 'posint'), ('range', ('int32', 'posint'),
                                ('low', ('int32', 'posint'), 42),
                                ('high', ('int32', 'posint'), 65)))

Shorthand Forms

The canonical forms for types contain all the information needed to fully describe different kinds of types. However, as human-facing code, they can be exceedingly verbose. Therefore there are a number of shorthand techniques that may also be used to denote the various types. Converting from these shorthands to the fully expanded version may be done via the canon(t) function. This function takes a single type and returns the canonical form of that type. The following are the operations that canon() performs:

  • Base types are returned as their name:

    canon('str') == 'str'
    
  • Aliases are resolved:

    canon('f4') == 'float32'
    
  • Expands length-1 tuples to scalar predicates:

    t = ('int32',)
    canon(t) == ('int32', 0)
    
  • Determines the super-type of refinements:

    canon('posint') == ('int32', 'posint')
    
  • Applies templates:

    t = ('set', 'float')
    canon(t) == ('set', 'float64', 0)
    
  • Applies dependencies:

    t = ('intrange', 1, 2)
    canon(t) == ('int32', ('intrange', ('low', 'int32', 1), ('high', 'int32', 2)))
    
    t = ('range', 'int32', 1, 2)
    canon(t) == ('int32', ('range', 'int32', ('low', 'int32', 1), ('high', 'int32', 2)))
    
  • Performs all of the above recursively:

    t = (('map', 'posint', ('set', ('intrange', 1, 2))),)
    canon(t) == (('map', 
                 ('int32', 'posint'),  
                 ('set', ('int32', 
                    ('intrange', ('low', 'int32', 1), ('high', 'int32', 2))), 0)), 0)
    

These shorthands are thus far more useful and intuitive than the canonical forms described above. It is therefore recommended that users and developers write code that uses the shorter versions. Note that canon() is guaranteed to return strings, tuples, and integers only, making the output of this function hashable.
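
The following is a deliberately simplified sketch of such a canonicalizing function, written only to illustrate the recursion and the hashability of the result. The name mini_canon and the reduced lookup tables are assumptions for this example; the sketch skips dependent refinements such as ‘intrange’ and should not be read as the module's canon() implementation.

```python
# Reduced, illustrative lookup tables.
BASE_TYPES = {'int32', 'float32', 'float64', 'str', 'complex128'}
type_aliases = {'f4': 'float32', 'float': 'float64', 'int': 'int32'}
refined_types = {'posint': 'int32'}
template_types = {'set': ('value_type',), 'map': ('key_type', 'value_type')}


def mini_canon(t):
    """Hypothetical, simplified stand-in for canon()."""
    if isinstance(t, str):
        if t in BASE_TYPES:
            return t                                # base types are returned as-is
        if t in type_aliases:
            return mini_canon(type_aliases[t])      # aliases are resolved
        if t in refined_types:
            return (mini_canon(refined_types[t]), t)  # (super-type, refinement)
        raise TypeError('unknown type: {0!r}'.format(t))
    head = t[0]
    if isinstance(head, str) and head in template_types:
        ndeps = len(template_types[head])
        if len(t) not in (1 + ndeps, 2 + ndeps):
            raise TypeError('wrong number of dependencies for ' + head)
        deps = tuple(mini_canon(d) for d in t[1:1 + ndeps])
        pred = t[-1] if len(t) == 2 + ndeps else 0  # default predicate is 0
        return (head,) + deps + (pred,)
    pred = t[1] if len(t) == 2 else 0               # length-1 tuples get a 0 predicate
    return (mini_canon(head), pred)


t = mini_canon(('map', 'posint', ('set', 'float')))
print(t)  # ('map', ('int32', 'posint'), ('set', 'float64', 0), 0)

# The result holds only strings, ints, and tuples, so it can key a dict.
cache = {t: 'seen'}
print(cache[mini_canon(('map', 'posint', ('set', 'float')))])  # seen
```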

Type System API

bright.apigen.typesystem.canon(*args, **kwargs)

Turns the type into its canonical form. See module docs for more information.

bright.apigen.typesystem.cython_c2py(*args, **kwargs)

Given a variable name and type, returns Cython code (declaration, body, and return statements) to convert the variable from C/C++ to Python.

bright.apigen.typesystem.cython_cimport_tuples(*args, **kwargs)

Given a type t, and possibly previously seen cimport tuples (set), returns the set of all seen cimport tuples. These tuples have four possible interpretations based on their length and values:

  • (module-name,) becomes cimport {module-name}
  • (module-name, var-or-mod) becomes from {module-name} cimport {var-or-mod}
  • (module-name, var-or-mod, alias) becomes from {module-name} cimport {var-or-mod} as {alias}
  • (module-name, 'as', alias) becomes cimport {module-name} as {alias}
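
The four interpretations can be written out as a short formatting sketch. The import_line helper is hypothetical and is not how the module itself renders these tuples; the keyword argument is an assumption that lets the same rules cover plain import tuples, and the module names in the usage lines are only illustrative.

```python
def import_line(tup, keyword='cimport'):
    """Hypothetical helper: format one tuple following the four rules above."""
    if len(tup) == 1:
        return '{0} {1}'.format(keyword, tup[0])
    if len(tup) == 2:
        return 'from {0} {1} {2}'.format(tup[0], keyword, tup[1])
    if len(tup) == 3 and tup[1] == 'as':
        # The literal string 'as' in the middle marks a module alias.
        return '{0} {1} as {2}'.format(keyword, tup[0], tup[2])
    if len(tup) == 3:
        return 'from {0} {1} {2} as {3}'.format(tup[0], keyword, tup[1], tup[2])
    raise ValueError('unsupported tuple: {0!r}'.format(tup))


seen = {('numpy', 'as', 'np'), ('libcpp.map', 'map', 'cpp_map'), ('libc.stdlib',)}
for line in sorted(import_line(t) for t in seen):
    print(line)
```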

bright.apigen.typesystem.cython_cimports(*args, **kwargs)

Returns the cimport lines associated with a type or a set of seen tuples.

bright.apigen.typesystem.cython_ctype(*args, **kwargs)

Given a type t, returns the corresponding Cython C type declaration.

bright.apigen.typesystem.cython_cytype(*args, **kwargs)

Given a type t, returns the corresponding Cython type.

bright.apigen.typesystem.cython_import_tuples(*args, **kwargs)

Given a type t, and possibly previously seen import tuples (set), returns the set of all seen import tuples. These tuples have four possible interpretations based on their length and values:

  • (module-name,) becomes import {module-name}
  • (module-name, var-or-mod) becomes from {module-name} import {var-or-mod}
  • (module-name, var-or-mod, alias) becomes from {module-name} import {var-or-mod} as {alias}
  • (module-name, 'as', alias) becomes import {module-name} as {alias}

bright.apigen.typesystem.cython_imports(*args, **kwargs)

Returns the import lines associated with a type or a set of seen tuples.

bright.apigen.typesystem.cython_nptype(*args, **kwargs)

Given a type t, returns the corresponding NumPy type.

bright.apigen.typesystem.cython_py2c(*args, **kwargs)

Given a variable name and type, returns Cython code (declaration, body, and return statement) to convert the variable from Python to C/C++.

bright.apigen.typesystem.cython_pytype(*args, **kwargs)

Given a type t, returns the corresponding Python type.

bright.apigen.typesystem.deregister_class(name)

This function will remove a previously registered class from the type system.

bright.apigen.typesystem.deregister_refinement(name)

This function will remove a previously registered refinement from the type system.

bright.apigen.typesystem.deregister_specialization(t)

This function will remove a previously registered template specialization.

bright.apigen.typesystem.isdependent(*args, **kwargs)

Returns whether t is a dependent type or not.

bright.apigen.typesystem.isrefinement(*args, **kwargs)

Returns whether t is a refined type.

bright.apigen.typesystem.istemplate(*args, **kwargs)

Returns whether t is a template type or not.

bright.apigen.typesystem.register_class(name, template_args=None, cython_c_type=None, cython_cimport=None, cython_cy_type=None, cython_py_type=None, cython_template_class_name=None, cython_cyimport=None, cython_pyimport=None, cython_c2py=None, cython_py2c=None)

Classes are user specified types. This function will add a class to the type system so that it may be used normally with the rest of the type system.

bright.apigen.typesystem.register_refinement(name, refinementof, cython_cimport=None, cython_cyimport=None, cython_pyimport=None, cython_c2py=None, cython_py2c=None)

This function will add a refinement to the type system so that it may be used normally with the rest of the type system.

bright.apigen.typesystem.register_specialization(t, cython_c_type=None, cython_cy_type=None, cython_py_type=None, cython_cimport=None, cython_cyimport=None, cython_pyimport=None)

This function will add a template specialization so that it may be used normally with the rest of the type system.

bright.apigen.typesystem.BASE_TYPES = set(['uint64', 'int32', 'float64', 'void', 'complex128', 'char', 'int64', 'str', 'bool', 'uint32', 'float32'])

Base types in the type system.

bright.apigen.typesystem.refined_types = {'nucid': 'int32', 'nucname': 'str'}

This is a mapping from refinement type names to the parent types. The parent types may either be base types, compound types, template types, or other refined types!

bright.apigen.typesystem.template_types = {'pair': ('key_type', 'value_type'), 'map': ('key_type', 'value_type'), 'set': ('value_type',), 'dict': ('key_type', 'value_type'), 'vector': ('value_type',)}

Template types are types whose instantiations are based on meta-types. This dict maps their names to meta-type names in order.

bright.apigen.typesystem.type_aliases = {'i8': 'int64', 'NPY_COMPLEX128': 'complex128', 'f4': 'float32', 'string': 'str', 'np.NPY_BYTE': 'char', 'int': 'int32', 'float': 'float64', 'i4': 'int32', 'NPY_BOOL': 'bool', 'np.NPY_BOOL': 'bool', 'np.NPY_STRING': 'str', 'np.NPY_VOID': 'void', 'uint': 'uint32', 'ui4': 'uint32', 'np.NPY_COMPLEX128': 'complex128', 'ui8': 'uint64', 'b': 'bool', 'f': 'float64', 'i': 'int32', 'np.NPY_FLOAT32': 'float32', 'np.NPY_INT32': 'int32', 'np.NPY_FLOAT64': 'float64', 'NPY_FLOAT32': 'float32', 'np.NPY_OBJECT': 'void', 'NPY_INT32': 'int32', 'f8': 'float64', 's': 'str', 'complex': 'complex128', 'ui': 'uint32', 'NPY_OBJECT': 'void', 'v': 'void', 'NPY_VOID': 'void', 'NPY_FLOAT64': 'float64', 'NPY_BYTE': 'char', 'NPY_UINT32': 'uint32', 'np.NPY_UINT32': 'uint32', 'NPY_STRING': 'str'}

Aliases that may be used to substitute one type name for another.
