Defining Records - normalize.record

Record Flavors

Record subclasses don’t have any special mix-in magic like Property subclasses do. To use a Record subclass you must explicitly derive it. The most useful classes for general use are normalize.record.json.JsonRecord, normalize.record.RecordList and the combination, normalize.record.json.JsonRecordList.

Here’s a guide to choosing a class:

JsonRecord

This is a good choice for codebases where you typically want to pass the JSON (string or data) as the first positional argument to a constructor. There is one JSON mapping which is the default, and it is accessed by .json_data() (or the constructor). It also has an unknown_json_keys property, which is is marked as ‘extraneous’ and contains all of the input keys which were not known.

The specifics of this conversion can be overridden in many ways; you can set json_name on the property name to override the default of using the python property name. You can also provide json_in and json_out lambda functions to transform the values on the way in and out.

More ‘radical’ transforms can be achieved by overriding json_data in your class for marshal out, and json_to_initkwargs for marshall in.

Record

This version is a good choice where the sources of data are typically other python functions. The default is a conversion from a dict, but the keys are the actual property names, not the json_name versions like JsonRecord.

It’s also good for object types which have multiple JSON mappings, where you don’t want to make any of them more special than any other. Instead of using the .json_data() function, you might for instance be using normalize.visitor.Visitor.

You can still access the same conversion as with JsonRecord by using the normalize.record.json.to_json() and normalize.record.json.from_json() functions directly. These functions will respect any JsonProperty hints (json_name etc), even if those hints are on properties which are not a part of a JsonRecord class.

RecordList

The RecordList type is a specialization of Record, which expects its first constructor argument to be a list. Internally this is just a Record which has a values property, and this property is the actual collection. This values property isn’t currently present in the type(Foo).properties dictionary, but may some day show up there.

In theory, RecordList is just one of possibly many types of normalize.coll.Collection sets which happens to use continuous integers as the key. Each collection type provides methods which know how to interpret iterable things and convert them into an iterable of (K, V) pairs. In practice, the list is the only type which is currently reasonably implemented or tested.

This base class implements the various methods expected of collections (iterators, getitem, etc). It mostly just passes these down to the underlying values, but you can expect this behavior to get more typesafe as time goes on (so that you can’t, for instance, add records of the wrong type to a list).

JsonRecordList

This is really just a RecordList sub-type which uses json_to_initkwargs to handle the conversion from and from a list. So by default, the constructor expects a list, and the json_data method returns one, too. You can also use this type for when you’re dealing with Json data which is logically a list, but actually is a dict, with some top-level keys, and one of the keys with the list in it as well, by customizing the json_to_initkwargs method. eg, a “next page” URL key, resume token, or a count of the size of the set.

That’s a summary overview; apologies if I end up repeating myself over the rest of this page!

Record

class normalize.record.Record(init_dict=None, **kwargs)[source]

Base class for normalize instances and collections.

__init__(init_dict=None, **kwargs)[source]

Instantiates a new Record type.

You may specify a dict to initialize from, or use keyword argument form. The default interpretation of the first positional argument is to treat it as if its contents were passed in as keyword arguments.

Subclass API: subclasses are permitted to interpret positional arguments in non-standard ways, but in general it is expected that if keyword arguments are passed, then they are already of the right type (or, of a type that the coerce functions associated with the properties can accept). The only exception to this is if init_dict is an OhPickle instance, you should probably just return (see OhPickle)

__getnewargs__()[source]

Stub method which arranges for an OhPickle instance to be passed to the constructor above when pickling out.

__getstate__()[source]

Implement saving, for the pickle out API. Returns the instance dict

__setstate__(instance_dict)[source]

Implement loading, for the pickle in API. Sets the instance dict directly.

__str__()[source]

Marshalling to string form. This is what you see if you cast the object to a string or use the %s format code, and is supposed to be an “informal” representation when you don’t want a full object dump like repr() would provide.

If you defined a primary_key for the object, then the values of the attributes you specified will be included in the string representation, eg <Task 17>. Otherwise, the implicit primary key (eg, a tuple of all of the defined attributes with their values) is included, up to the first 30 characters or so.

__repr__()[source]

Marshalling to Python source. This is what you will see when printing objects using the %r format code. This function is recursive and should generally satisfy the requirement that it is valid Python source, assuming all class names are in scope and all values implement __repr__ as suggested in the python documentation.

__eq__(other)[source]

Compare two Record classes; recursively compares all attributes for equality (except those marked ‘extraneous’). See also diff() for a version where the comparison can be fine-tuned.

__ne__(other)[source]

implemented for compatibility

__pk__[source]

This property returns the “primary key” for this object. This is similar to what is used when comparing Collections via normalize.diff, and is used for stringification and for the id() built-in.

__hash__()[source]

Implements id() for Record types.

Collection Types

Collections are types of records, with a single property values which contains the underlying collection object.

class normalize.coll.Collection(values=None, **kwargs)[source]

This is the base class for Record property values which contain iterable sets of values with a particular common type.

All collections are modeled as a mapping from some index to a value. Bags are not currently supported, ie the keys must be unique for the diff machinery as currently written to function.

The type of members in the collection is defined using a sub-class API:

classproperty itemtype=Record sub-class
This property must be defined for instances to be instantiable. It declares the type of values.
classproperty coerceitem=FUNC
Defaults to itemtype and is the function or constructor which will accept items during construction of a collection which are not already of the correct type.
classproperty colltype=collection type
This is the type of the underlying container. list, dict, etc.
__init__(values=None, **kwargs)[source]

Default collection constructor.

args:

values=iterable
Specify the initial contents of the collection. It will be converted to the correct type using coll_to_tuples() and tuples_to_coll()
attribute=VALUE
It is possible to add extra properties to Collection objects; this is how you specify them on construction.
__eq__(other)[source]

To be ==, collections must have exactly the same itemtype and colltype, and equal values

__ne__(other)[source]

Implemented, for compatibility

classmethod coerce_tuples(generator)[source]

This class method converts a generator of (K, V) tuples (the tuple protocol), where V is not yet of the correct type, to a generator where it is of the correct type (using the coerceitem class property)

classmethod tuples_to_coll(generator, coerce=False)[source]

required virtual method This class method, part of the sub-class API, converts a generator of (K, V) tuples (the tuple protocol) to one of the underlying collection type.

itertuples()[source]

Iterate over the items in the collection; return (k, v) where k is the key, index etc into the collection (or potentially the value itself, for sets). This form is the tuple protocol

classmethod coll_to_tuples(coll)[source]

Generate ‘conformant’ tuples from an input collection, similar to itertuples

class normalize.coll.DictCollection(values=None, **kwargs)[source]

An implementation of keyed collections which obey the Record property protocol and the tuple collection protocol. Warning: largely untested, patches welcome.

colltype

alias of dict

class normalize.coll.ListCollection(values=None, **kwargs)[source]

An implementation of sequences which obey the Record property protocol and the tuple collection protocol.

colltype

alias of list

classmethod coll_to_tuples(coll)[source]

coll_to_tuples is capable of unpacking its own collection types (list), collections.Mapping objects, as well generators, sequences and iterators. Returns (*int*, Value). Does not coerce items.

append(item)[source]

Sequence API, currently passed through to underlying collection. Type-checking is currently TODO.

__str__()[source]

Informal stringification returns the type of collection, and the length. For example, <MyRecordList: 8 item(s)>

__repr__()[source]

Implemented: prints a valid constructor.

JsonRecord

normalize.record.json.json_to_initkwargs(record_type, json_struct, kwargs=None)[source]

This function converts a JSON dict (json_struct) to a set of init keyword arguments for the passed Record (or JsonRecord).

It is called by the JsonRecord constructor. This function takes a JSON data structure and returns a keyword argument list to be passed to the class constructor. Any keys in the input dictionary which are not known are passed as a single unknown_json_keys value as a dict.

This function should generally not be called directly, except as a part of a __init__ or specialized visitor application.

normalize.record.json.from_json(record_type, json_struct)[source]

JSON marshall in function: a ‘visitor’ function which looks for JSON types/hints on types being converted to, but does not require them.

Args:
record_type=TYPE
Record type to convert data to
json_struct=DICT|LIST
a loaded (via json.loads) data structure, normally a dict or a list.
normalize.record.json.to_json(record, extraneous=True, prop=None)[source]

JSON conversion function: a ‘visitor’ function which implements marshall out (to JSON data form), honoring JSON property types/hints but does not require them. To convert to an actual JSON document, pass the return value to json.dumps or a similar function.

args:
record=anything
This object can be of any type; a best-effort attempt is made to convert to a form which json.dumps can accept; this function will call itself recursively, respecting any types which define .json_data() as a method and calling that.
extraneous=BOOL
This parameter is passed through to any json_data() methods which support it.
prop=PROPNAME|PROPERTY
Specifies to return the given property from an object, calling any to_json mapping defined on the property. Does not catch the AttributeError that is raised by the property not being set.
class normalize.record.json.JsonRecord(json_data=None, **kwargs)[source]

Version of a Record which deals primarily in JSON form.

This means:

  1. The first argument to the constructor is assumed to be a JSON data dictionary, and passed through the class’ json_to_initkwargs method before being used to set actual properties
  2. Unknown keys are permitted, and saved in the “unknown_json_keys” property, which is merged back on output (ie, calling .json_data() or to_json())
__init__(json_data=None, **kwargs)[source]

Build a new JsonRecord sub-class.

args:
json_data=DICT|other
JSON data (string or already json.loads‘d). If not a JSON dictionary with keys corresponding to the json_name or the properties within, then json_to_initkwargs should be overridden to handle the unpacking differently
**kwargs
JsonRecord instances may also be constructed by passing in attribute initializers in keyword form. The keys here should be the names of the attributes and the python values, not the JSON names or form.
classmethod json_to_initkwargs(json_data, kwargs)[source]

Subclassing hook to specialize how JSON data is converted to keyword arguments

classmethod from_json(json_data)[source]

This method can be overridden to specialize how the class is loaded when marshalling in; however beware that it is not invoked when the caller uses the from_json() function directly.

json_data(extraneous=False)[source]

Returns the JSON data form of this JsonRecord. The ‘unknown’ JSON keys will be merged back in, if:

  1. the extraneous=True argument is passed.
  2. the unknown_json_keys property on this class is replaced by one not marked as extraneous
diff_iter(other, **kwargs)[source]

Generator method which returns the differences from the invocant to the argument. This specializes Record.diff_iter() by returning JsonDiffInfo objects.

diff(other, **kwargs)[source]

Compare an object with another. This specializes Record.diff() by returning a JsonDiff object.

class normalize.record.json.JsonRecordList(json_data=None, **kwargs)[source]

Version of a RecordList which deals primarily in JSON

__init__(json_data=None, **kwargs)[source]

Build a new JsonRecord sub-class.

Args:
json_data=LIST|other
JSON data (string or already json.loads‘d)
**kwargs
Other initializer attributes, for lists with extra attributes (eg, paging information)
class normalize.record.json.JsonDiffInfo(json_data=None, **kwargs)[source]

Version of ‘DiffInfo’ that supports .json_data()

class normalize.record.json.JsonDiff(json_data=None, **kwargs)[source]

Version of ‘Diff’ that supports .json_data()

itemtype

alias of JsonDiffInfo

Customizing JsonRecord Marshalling

JSON record marshalling can be overridden in two important ways:

  1. By specifying json_in (or just coerce if you want to be able to pass in values like this from Python as well) and json_out on your Properties, and the json_name key.

    Remember, you don’t need to explicitly instantiate JsonProperty objects; you can throw on these JSON specialization flags on any property and normalize.property.meta will mix them in for you.

    See Declaring JSON hints on Properties for more details.

  2. Via the sub-class API. The most convenient hooks are JsonRecord.json_to_initkwargs() (you must derive JsonRecord for this) and JsonRecord.json_data() (any class can define this). This is a more general API which can perform more ‘drastic’ conversions, for instance if you want to marshall your class to a JSON Array.

Record MetaClass

class normalize.record.meta.RecordMeta[source]

Metaclass for Record types.

static __new__(mcs, name, bases, attrs)[source]

Invoked when a new Record type is declared, and is responsible for copying the properties from superclass Record classes, processing the primary_key declaration, and calling normalize.property.Property.bind() to link normalize.property.Property instances to their containing normalize.record.Record classes.