This Visitor class is an attempt at unifying the various visitor functions used throughout the rest of the module, as well as providing a convenient API for working with normalize data structures.
Warning
Cycle detection is still TODO (but close). Be careful if using data structures with cycles.
The Visitor object represents a single recursive visit in progress. For most use cases you should not need to sub-class this class; sub-classing VisitorPattern is enough.
Create a new Visitor object. Generally called by a front-end class method of VisitorPattern.
There are four positional arguments, which specify the particular functions to be used during the visit. The options that matter most to a user of a visitor are the keyword arguments:
- apply_empty_slots=bool
- If set, then your apply method (or reverse, etc) will be called even if there is no corresponding value in the input. Your method will receive the exception in place of the value.
- extraneous=bool
- Also call the apply method on properties marked extraneous. False by default.
- ignore_empty_string=bool
- If the ‘apply’ function returns the empty string, treat it as if the slot or object did not exist. False by default.
- ignore_none=bool
- If the ‘apply’ function returns None, treat it as if the slot or object did not exist. True by default.
- visit_filter=MultiFieldSelector
- This supplies an instance of normalize.selector.MultiFieldSelector, and restricts the operation to the matched object fields. Can also be specified as just filter=.
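The effect of ignore_none and ignore_empty_string can be pictured as a filter over mapped results. The following is a minimal sketch in plain Python that does not use normalize itself; `keep_mapped` is a hypothetical helper, and only its defaults are taken from the option list above.

```python
def keep_mapped(mapped, ignore_none=True, ignore_empty_string=False):
    """Return False if a mapped value should be treated as a missing slot.

    Hypothetical helper: it mirrors the documented defaults, not the
    library's actual implementation.
    """
    if ignore_none and mapped is None:
        return False
    if ignore_empty_string and mapped == "":
        return False
    return True


results = {"name": "Ada", "nickname": "", "email": None}

# Default options: None is dropped, the empty string is kept.
default_view = {k: v for k, v in results.items() if keep_mapped(v)}

# With ignore_empty_string=True, both None and "" are dropped.
strict_view = {
    k: v for k, v in results.items()
    if keep_mapped(v, ignore_empty_string=True)
}
```

With the defaults, `default_view` keeps `nickname` but loses `email`; `strict_view` keeps only `name`.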
Base Class for writing Record visitor pattern classes. These classes are not instantiated, and consist purely of class methods.
There are three visitors supplied by default, which correspond to typical use for IO (normalize.visitor.VisitorPattern.visit() for output, and normalize.visitor.VisitorPattern.cast() for input), and for providing a centralized type catalogue (normalize.visitor.VisitorPattern.reflect()).
visit | cast | reflect | Description |
---|---|---|---|
unpack | grok | scantypes | Defines how to get a property value from the thing being walked, and a generator for the collection. |
apply | reverse | propinfo | Conversion for individual values |
aggregate | collect | itemtypes | Combine collection results |
reduce | produce | typeinfo | Combine apply results |
To customize what is emitted, sub-class VisitorPattern and override the class methods of the conversion you are interested in. For many simple IO use cases, you might only need to override apply and reverse, if that.
The versions for visit are documented the most thoroughly, as these are the easiest to understand and the ones most users will be customizing. The documentation for the other methods describes the differences between them and their visit counterpart.
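To make the division of labour between the four visit hooks concrete, here is a rough sketch in plain Python. `MiniPattern` and `Upper` are hypothetical stand-ins that treat dicts as records and lists as collections; the hook signatures are simplified and are not those of VisitorPattern.

```python
class MiniPattern:
    """Hypothetical stand-in for the four visit hooks; not normalize's code."""

    @classmethod
    def unpack(cls, value):
        # Return a slot getter and an item generator for the value.
        def get_prop(name):
            return value[name]       # raises KeyError if the slot is empty

        def get_item():
            return enumerate(value)  # (index, item) pairs for sequences

        return get_prop, get_item

    @classmethod
    def apply(cls, value):
        return value                 # conversion for individual values

    @classmethod
    def aggregate(cls, pairs):
        return [v for _, v in pairs]  # combine collection results

    @classmethod
    def reduce(cls, mapped_props):
        return dict(mapped_props)     # combine apply results

    @classmethod
    def visit(cls, value):
        get_prop, get_item = cls.unpack(value)
        if isinstance(value, dict):   # "record": map every slot
            return cls.reduce(
                (name, cls.visit(get_prop(name))) for name in value
            )
        if isinstance(value, list):   # "collection": map every item
            return cls.aggregate(
                (k, cls.visit(v)) for k, v in get_item()
            )
        return cls.apply(value)


class Upper(MiniPattern):
    """Customize output by overriding only apply."""

    @classmethod
    def apply(cls, value):
        return value.upper() if isinstance(value, str) else value


data = {"name": "ada", "tags": ["x", "y"], "age": 36}
plain = MiniPattern.visit(data)
shouted = Upper.visit(data)
```

The sub-class changes only the per-value conversion; the traversal and the shape of the result are inherited, which is the pattern the text above recommends for simple IO cases.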
A value visitor, which visits instances (typically), applies normalize.visitor.VisitorPattern.apply() to every attribute slot, and returns the reduced result.
Like normalize.diff.diff(), this function accepts a series of keyword arguments, which are passed through to normalize.visitor.Visitor.
This function also takes positional arguments:
- value=object
- The value to visit. Normally (but not always) a normalize.record.Record instance.
- value_type=RecordType
- This is the Record subclass to interpret value as. The default is type(value). If you specify this, then the type information on value is essentially ignored (with the caveat mentioned below on Visitor.map_prop()), and value may be a dict, list, etc.
- **kwargs
- Visitor options accepted by normalize.visitor.Visitor.__init__().
Unpack a value during a ‘visit’
args:
- value=object
- The instance being visited
- value_type=RecordType
- The expected type of the instance
- visitor=Visitor
- The context/options
returns a tuple with two items:
- get_prop=function
- This function should take a normalize.property.Property instance, and return the slot from the value, or raise AttributeError or KeyError if the slot is empty.
- get_item=generator
- This generator should return the tuple protocol used by normalize.coll.Collection: (K, V) where K can be an ascending integer (for sequences), V (for sets), or something hashable like a string (for dictionaries/maps)
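The (K, V) convention for get_item can be illustrated with native Python collections. This is only a sketch of the protocol as described above; `iter_tuples` is a hypothetical name and native types stand in for normalize.coll.Collection classes.

```python
def iter_tuples(coll):
    """Yield (K, V) pairs following the convention described above."""
    if isinstance(coll, dict):
        # Maps: K is the (hashable) key.
        yield from coll.items()
    elif isinstance(coll, (set, frozenset)):
        # Sets: K is the value itself.
        for v in coll:
            yield v, v
    else:
        # Sequences: K is an ascending integer.
        yield from enumerate(coll)


seq_pairs = list(iter_tuples(["a", "b"]))
map_pairs = list(iter_tuples({"k": 1}))
set_pairs = sorted(iter_tuples({3, 1}))
```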
‘apply’ is a general place to put a function which is called on every extant record slot. This is usually the most important function to implement when sub-classing.
The default implementation passes through the slot value as-is, but expected exceptions are converted to None.
args:
- value=value|AttributeError|KeyError
- This is the value currently in the slot, or the Record itself with the apply_records visitor option. AttributeError will only be received if you passed apply_empty_slots, and KeyError will be passed if parent_obj is a dict (see Visitor.map_prop() for details about when this might happen)
- prop=Property|None
- This is the normalize.Property instance which represents the field being traversed.
- visitor=Visitor
- This object can be used to inspect parameters of the current run, such as options which control which kinds of values are visited, which fields are being visited and where the function is in relation to the starting point.
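As an illustration of the behaviour described for apply, the sketch below shows a pass-through that turns the expected exceptions into None, and a variant that also converts dates for output. Both functions are hypothetical free functions written for this example; the `prop` and `visitor` parameters are unused placeholders that only mirror the argument list above.

```python
from datetime import date


def default_apply(value, prop=None, visitor=None):
    """Pass the slot value through; expected exceptions become None."""
    if isinstance(value, (AttributeError, KeyError)):
        return None
    return value


def json_apply(value, prop=None, visitor=None):
    """Like default_apply, but render dates as ISO 8601 strings."""
    if isinstance(value, date):
        return value.isoformat()
    return default_apply(value, prop, visitor)


passed = default_apply(5)
squashed = default_apply(KeyError("missing"))
rendered = json_apply(date(2020, 1, 2))
```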
Hook called for each normalize.coll.Collection, after mapping over each of the items in the collection.
The default implementation calls normalize.coll.Collection.tuples_to_coll() with coerce=False, which just re-assembles the collection into a native python collection type of the same type as the input collection.
args:
- result_coll_generator= generator func
- Generator which returns (key, value) pairs (like normalize.coll.Collection.itertuples())
- coll_type=CollectionType
- This is the normalize.coll.Collection-derived class which is currently being reduced.
- visitor=Visitor
- Context/options object
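The re-assembly step can be sketched with native types. This is an assumption-laden illustration rather than the behaviour of tuples_to_coll(): `aggregate_pairs` is a hypothetical function, and `coll_type` here is a built-in type instead of a Collection-derived class.

```python
def aggregate_pairs(pairs, coll_type):
    """Rebuild a native collection from (key, value) pairs."""
    if issubclass(coll_type, dict):
        return coll_type(pairs)
    # For sequences and sets the key carries no extra information,
    # so only the values are kept, in the order they were yielded.
    return coll_type(v for _, v in pairs)


as_list = aggregate_pairs(iter([(0, "a"), (1, "b")]), list)
as_dict = aggregate_pairs([("k", 1)], dict)
as_set = aggregate_pairs([(1, 1), (2, 2)], set)
```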
This reduction is called to combine the mapped slot and collection item values into a single value for return.
The default implementation tries to behave naturally; you’ll almost always get a dict back when mapping over a record, and a list or some other collection when mapping over collections.
If the collection has additional properties which are not ignored (eg, not extraneous, not filtered), then the result will be a dictionary with the results of mapping the properties, and a ‘values’ key will be added with the result of mapping the items in the collection.
args:
- mapped_props=generator
- Iterating over this generator will yield K, V pairs, where K is the Property object and V is the mapped value.
- aggregated=object
- This contains whatever aggregate returned, normally a list.
- value_type=RecordType
- This is the type which is currently being reduced. A normalize.record.Record subclass
- visitor=Visitor
- Context/options object.
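The three result shapes described above (a dict for records, a bare collection, and a dict with a ‘values’ key for collections that also have properties) can be sketched as follows. `reduce_sketch` is hypothetical, and property names are used as keys where the real hook receives Property objects.

```python
def reduce_sketch(mapped_props, aggregated=None):
    """Combine mapped slots and collection items into one value."""
    props = {name: value for name, value in mapped_props}
    if aggregated is None:
        return props                 # plain record: a dict
    if not props:
        return aggregated            # plain collection: items only
    props["values"] = aggregated     # collection with extra properties
    return props


record = reduce_sketch([("name", "Ada")])
collection = reduce_sketch([], [1, 2])
mixed = reduce_sketch([("total", 2)], [1, 2])
```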
Cast is for visitors where you are visiting some random data structure (perhaps returned by a previous VisitorPattern.visit() operation), and you want to convert back to the value type.
This function also takes positional arguments:
- value_type=RecordType
- The type to cast to.
- value=object
- visitor=Visitor.Options
- Specifies the visitor options, which customize the descent and reduction.
Like normalize.visitor.VisitorPattern.unpack() but called for cast operations. Expects to work with dictionaries and lists instead of Record objects.
Reverses the transform performed in normalize.visitor.VisitorPattern.reduce() for collections with properties.
If you pass tuples to isa of your Properties, then you might need to override this function and throw TypeError if the passed value_type is not appropriate for value.
Like normalize.visitor.VisitorPattern.apply() but called for cast operations. The default implementation passes through but squashes exceptions, just like apply.
Like normalize.visitor.VisitorPattern.aggregate(), but coerces the mapped values to the collection item type on the way through.
Like normalize.visitor.VisitorPattern.reduce(), but constructs instances rather than returning plain dicts.
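The idea of casting plain data back to typed instances can be shown with standard-library dataclasses standing in for Record types. This is a sketch of the concept only: `cast_sketch`, `Point` and `Line` are hypothetical, and none of this reflects how normalize constructs its instances.

```python
from dataclasses import asdict, dataclass, fields, is_dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Line:
    start: Point
    end: Point


def cast_sketch(value_type, value):
    """Build a value_type instance from plain dicts, recursing on fields."""
    if is_dataclass(value_type):
        kwargs = {
            f.name: cast_sketch(f.type, value[f.name])
            for f in fields(value_type)
            if f.name in value
        }
        return value_type(**kwargs)
    return value  # leaf values pass through unchanged


plain = {"start": {"x": 0, "y": 0}, "end": {"x": 1, "y": 2}}
line = cast_sketch(Line, plain)
round_trip = asdict(line)
```

Here `asdict` plays the role of a visit producing plain data and `cast_sketch` reverses it, so the round trip returns the original dictionary.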
Reflect is for visitors where you are exposing some information about the types reachable from a starting type to an external system. For example, a front-end, a REST URL router and documentation framework, an Avro schema definition, etc.
X can be a type or an instance.
This API should be considered experimental.
Like normalize.visitor.VisitorPattern.unpack(), but returns a getter which just returns the property, and a collection getter which returns a set with a single item in it.
Like normalize.visitor.VisitorPattern.apply(), but takes a property and returns a dict with some basic info. The default implementation returns just the name and type of the property.
Like normalize.visitor.VisitorPattern.aggregate(), but returns . This will normally only get called with a single type.
Like normalize.visitor.VisitorPattern.reduce(), but returns the final dictionary to correspond to a type definition. The default implementation returns just the type name, the list of properties, and the item type for collections.
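As a rough picture of the kind of dictionary a reflect pass might produce, the sketch below describes a dataclass by its name and properties. The key names and `typeinfo_sketch` are assumptions made for this example, with a dataclass standing in for a Record type; the actual output of the default implementation may differ.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def typeinfo_sketch(record_type):
    """Describe a type as a dict: its name and its properties."""
    return {
        "name": record_type.__name__,
        "properties": [
            {"name": f.name, "type": f.type.__name__}
            for f in fields(record_type)
        ],
    }


info = typeinfo_sketch(Person)
```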
This sentinel value may be returned by a custom implementation of unpack (or grok, or scantypes) to indicate that the descent should be stopped immediately, instead of proceeding to descend into sub-properties. Construct it with a single argument to supply a literal value to use as the mapped value, or return the class itself to indicate that there is no mapped value.
The common visitor API used by all three visitor implementations.
args:
- visitor=Visitor
- Visitor options instance: contains the callbacks to use to implement the visiting, as well as traversal & filtering options.
- value=Object
- Object being visited
- value_type=RecordType
- The type object controlling the visiting.