Objects can be compared using the normalize.diff.diff() or normalize.diff.diff_iter() functions, or by calling the instance methods normalize.record.Record.diff() or normalize.record.Record.diff_iter().
The iterative versions yield DiffInfo records, and the eager versions return a Diff. These objects are instances of Record and RecordList, respectively.
All of the diff functions and methods take a single ‘other’ object, as well as keyword arguments to customize the diff operation. These keyword arguments are passed to the DiffOptions constructor, and the resulting options object is passed down through the recursive comparison so that nested values are compared with the same settings. The exception is the keyword argument options=, which supplies a pre-constructed (and possibly sub-classed) DiffOptions instance.
There are some examples of this in Comparing object structures, and more in the normalize test suite.
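As a rough illustration of the keyword-argument handling described above, the following standard-library sketch mimics the pattern with plain dicts. The names SketchOptions and sketch_diff_iter are hypothetical and are not part of normalize; rejecting a mix of options= and keyword arguments is a choice made for this sketch, not documented library behaviour.

```python
class SketchOptions:
    """Hypothetical stand-in for an options object (not normalize's DiffOptions)."""

    def __init__(self, ignore_ws=True, ignore_case=False):
        self.ignore_ws = ignore_ws
        self.ignore_case = ignore_case

    def same(self, a, b):
        # Only strings are scrubbed before comparison in this sketch.
        if isinstance(a, str) and isinstance(b, str):
            if self.ignore_ws:
                a, b = " ".join(a.split()), " ".join(b.split())
            if self.ignore_case:
                a, b = a.lower(), b.lower()
        return a == b


def sketch_diff_iter(base, other, path=(), options=None, **kwargs):
    """Yield (kind, path) tuples for differences between nested dicts."""
    if options is None:
        # Keyword arguments are used to build the options object once...
        options = SketchOptions(**kwargs)
    elif kwargs:
        # Assumption of this sketch: mixing both styles is an error.
        raise TypeError("pass either options= or keyword arguments, not both")
    if isinstance(base, dict) and isinstance(other, dict):
        for key in sorted(set(base) | set(other)):
            if key not in other:
                yield ("removed", path + (key,))
            elif key not in base:
                yield ("added", path + (key,))
            else:
                # ...and the same object is reused at every level of recursion.
                yield from sketch_diff_iter(
                    base[key], other[key], path + (key,), options=options
                )
    elif not options.same(base, other):
        yield ("modified", path)
```

Wrapping the generator in list() gives the eager form, in the same way that the eager diff functions relate to their iterative counterparts.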
Eager version of diff_iter(), which takes all the same options and returns a Diff instance.
Compare a Record with another object (usually a record of the same type), and yield differences as DiffInfo instances.
Bases: normalize.coll.ListCollection
Container for a list of differences.
Type name of the source object
Type name of the compared object; normally the same, unless the duck_type option was specified.
alias of DiffInfo
Container for storing diff information that can be used to reconstruct the values diffed.
Enumeration describing the type of difference; a DiffTypes value.
A FieldSelector object referring to the location within the base object where the changed field was found. If the diff_type is DiffTypes.ADDED, then this will be the location of the record the field was added in, not the (non-existent) field itself.
A FieldSelector object referring to the location within the ‘other’ object where the changed field was found. If the diff_type is DiffTypes.REMOVED, then this will be the location of the record the field was removed from, not the (non-existent) field itself.
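The asymmetry between the base and other locations for added and removed fields can be sketched as follows. This is a standard-library illustration only: SketchDiffType, SketchDiffInfo and flat_diff are hypothetical names, and plain tuples stand in for FieldSelector objects.

```python
from dataclasses import dataclass
from enum import Enum


class SketchDiffType(Enum):
    ADDED = "added"
    REMOVED = "removed"
    MODIFIED = "modified"


@dataclass(frozen=True)
class SketchDiffInfo:
    diff_type: SketchDiffType
    base: tuple   # path into the base object (stand-in for a FieldSelector)
    other: tuple  # path into the 'other' object


def flat_diff(base, other, path=()):
    """Compare two flat dicts that both live at `path` within their parents."""
    for key in sorted(set(base) | set(other)):
        here = path + (key,)
        if key not in base:
            # Added: the field does not exist in base, so `base` points at
            # the containing record rather than at the field itself.
            yield SketchDiffInfo(SketchDiffType.ADDED, base=path, other=here)
        elif key not in other:
            # Removed: the mirror image, `other` points at the container.
            yield SketchDiffInfo(SketchDiffType.REMOVED, base=here, other=path)
        elif base[key] != other[key]:
            yield SketchDiffInfo(SketchDiffType.MODIFIED, base=here, other=here)
```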
A richenum.OrderedRichEnum type to denote the type of an individual difference.
Optional data structure to pass diff options down. Some functions are delegated to this object, allowing for further customization of operation, forming the DiffOptions sub-class API.
Create a new DiffOptions instance.
args:
- ignore_ws=BOOL
- Ignore whitespace in strings (beginning, end and middle). True by default.
- ignore_case=BOOL
- Ignore case differences in strings. False by default.
- unicode_normal=BOOL
- Ignore unicode normal form differences in strings by normalizing to NFC before comparison. True by default.
- unchanged=BOOL
- Yields DiffInfo objects for every comparison, not just those which found a difference. Defaults to False. Useful for testing.
- ignore_empty_slots=BOOL
- If true, slots containing typical ‘empty’ values (by default, just '' and None) are treated as if they were not set. False by default.
- ignore_empty_items=BOOL
- If true, items are considered to be absent from collections if all of their primary key fields are None, not set, or '' (in the absence of a primary key definition, all compared fields are checked instead). False by default.
- duck_type=BOOL
- Normally, types must match or the result will always be normalize.diff.DiffTypes.MODIFIED and the comparison will not descend further. However, setting this option bypasses this check, and just checks that the ‘other’ object has all of the properties defined on the ‘base’ type. This can be used to check progress when porting from other object systems to normalize.
- fuzzy_match=BOOL
- Enable approximate matching of items in collections, so that changes are reported at a finer granularity.
- compare_filter=MULTIFIELDSELECTOR|LIST_OF_LISTS
- Restrict comparison to the fields described by the passed MultiFieldSelector (or list of FieldSelector lists/objects).
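The type check that duck_type relaxes can be pictured with a small standard-library sketch. Point, LegacyPoint, the properties tuple and can_descend are all hypothetical names invented for illustration; the use of isinstance for "types must match" is also an assumption, and the library's actual check may differ.

```python
class Point:
    # Hypothetical stand-in for a record type; `properties` lists its fields.
    properties = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class LegacyPoint:
    # An unrelated class from an "old" object system with the same fields.
    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_descend(base, other, duck_type=False):
    """Decide whether a field-by-field comparison should be attempted."""
    if isinstance(other, type(base)):
        return True
    if not duck_type:
        # Mismatched types: report one 'modified' difference, do not descend.
        return False
    # Duck typing: only require that every property of the base type exists.
    return all(hasattr(other, name) for name in type(base).properties)
```

With duck_type left off, comparing a Point with a LegacyPoint stops at the type mismatch; with it on, the comparison can proceed property by property.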
Sub-class hook which performs value comparison. Only called when the values being compared are not Records.
Normalizes Unicode Normal Form (to NFC); called if unicode_normal is true.
This method decides whether the value is ‘empty’, and hence the same as not specified. Called if ignore_empty_slots is true. Checking the value for emptiness happens after all other normalization.
This hook is called by DiffOptions.normalize_val() if the value (after slot/item normalization) is a string, and is responsible for calling the various normalize_foo methods which act on text.
Hook which is called on every value before comparison, and should return the scrubbed value or self._nothing to indicate that the value is not set.
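The order of operations described by these hooks (text clean-up first, emptiness check last) can be sketched with the standard library alone. The function names mirror the hooks for readability, but this is not the library's implementation: NOTHING is a hypothetical sentinel standing in for self._nothing, and collapsing whitespace runs to a single space is an assumption about what "ignoring" whitespace means.

```python
import unicodedata

NOTHING = object()  # hypothetical sentinel meaning "value not set"


def normalize_text(text, ignore_ws=True, ignore_case=False, unicode_normal=True):
    """Apply the string clean-ups selected by the options."""
    if unicode_normal:
        # Compose characters so that 'e' + combining accent equals 'é'.
        text = unicodedata.normalize("NFC", text)
    if ignore_case:
        text = text.lower()
    if ignore_ws:
        # Assumption: trim the ends and collapse inner runs to one space.
        text = " ".join(text.split())
    return text


def normalize_val(value, ignore_empty_slots=False, **text_options):
    """Scrub a value, returning NOTHING if it counts as 'not set'."""
    if isinstance(value, str):
        value = normalize_text(value, **text_options)
    # The emptiness check runs after all other normalization, so a string
    # made only of whitespace is treated as empty when ignore_ws is on.
    if ignore_empty_slots and (value is None or value == ""):
        return NOTHING
    return value
```

Because emptiness is tested last, a whitespace-only string and a missing value compare as equal when both ignore_ws and ignore_empty_slots are enabled.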
Hook which is called on every record slot; this is a way to perform context-aware clean-ups.
args:
- value=nothing|anything
- The value in the slot. nothing can be detected in sub-class methods as self._nothing.
- prop=PROPERTY
- The slot’s normalize.property.Property instance. If this instance has a compare_as method, then that method is called to perform a clean-up before the value is passed to normalize_val.
Hook which is called on every collection item; this is a way to perform context-aware clean-ups.
args:
- value=nothing|anything
- The value in the collection slot. nothing can be detected in sub-class methods as self._nothing.
- coll=COLLECTION
- The parent normalize.coll.Collection instance. If this instance has a compare_item_as method, then that method is called to perform a clean-up before the value is passed to normalize_val.
- index=HASHABLE
- The key of the item in the collection.
These functions call each other recursively, depending on the values encountered while walking the base object. They are documented here to give insight into the workings of more user-facing APIs such as diff, but should be considered to be implementation details.
This generator function compares a record, slot by slot, and yields differences found as DiffInfo objects.
args:
- a=Record
- The base object
- b=Record|object
- The ‘other’ object, which must be the same type as a, unless options.duck_type is set.
- fs_a=*FieldSelector*
- The current diff context, prefixed to any returned base field in yielded DiffInfo objects. Defaults to an empty FieldSelector.
- fs_b=*FieldSelector*
- The other object context. This will differ from fs_a in the case of collections, where a value has moved slots. Defaults to an empty FieldSelector.
- options=*DiffOptions*
- A constructed DiffOptions object; a default one is created if not passed in.
Generator function to compare two collections, and yield differences. This function does not currently report moved items in collections. It uses the DiffOptions.record_id() method to decide whether two items are to be considered the same; differences found within matched items are then yielded.
Arguments are the same as compare_record_iter().
Note that diff_iter and compare_record_iter will call both this function and compare_record_iter on RecordList types. However, as most RecordList types have no extra properties, no differences are yielded by the compare_record_iter method.
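Matching collection items by identity rather than by position can be sketched as below. This is a simplified standard-library illustration, not the library's algorithm: compare_by_id is a hypothetical name, record_id is passed in as a plain callable, and matched items are compared with != instead of being compared recursively.

```python
def compare_by_id(base_items, other_items, record_id):
    """Yield (kind, base_index, other_index) for two lists of items."""
    # Index each side by identity, remembering the original position.
    base_index = {record_id(item): (i, item) for i, item in enumerate(base_items)}
    other_index = {record_id(item): (j, item) for j, item in enumerate(other_items)}

    for ident, (i, item) in base_index.items():
        if ident not in other_index:
            yield ("removed", i, None)
            continue
        j, other_item = other_index[ident]
        # A matched pair that merely changed position yields nothing:
        # moves are not reported, only differences within the pair.
        if item != other_item:
            yield ("modified", i, j)

    for ident, (j, _item) in other_index.items():
        if ident not in base_index:
            yield ("added", None, j)
```

In this sketch an item that keeps its identity but changes position produces no output, which mirrors the note above that moved items are not reported.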
Generator for comparing ‘simple’ lists when they are encountered. This does not currently recurse further. Arguments are as per other compare_X functions.
Generator for comparing ‘simple’ dicts when they are encountered. This does not currently recurse further. Arguments are as per other compare_X functions.
This function returns a generator which iterates over the collection, similar to Collection.itertuples(). Regardless of type, this module views every collection as a mapping from an index to a value. For sets, the “index” is the value itself (i.e., (V, V)). For dicts, it’s a string, and for lists, it’s an int.
In general, this function defers to itertuples and/or iteritems methods defined on the instances; however, when duck typing, this function typically provides the generator.
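The uniform "index to value" view of collections can be sketched for built-in containers as follows. iter_index_value is a hypothetical name, and the fallback order shown here (defer to an itertuples method if present, otherwise inspect the built-in type) is an assumption for illustration.

```python
def iter_index_value(coll):
    """Yield (index, value) pairs for several kinds of collection."""
    if hasattr(coll, "itertuples"):
        # Defer to the collection's own method when it provides one.
        yield from coll.itertuples()
    elif isinstance(coll, dict):
        # Dicts: the index is the key.
        yield from coll.items()
    elif isinstance(coll, (set, frozenset)):
        # Sets: the index is the value itself, giving (V, V) pairs.
        for value in coll:
            yield value, value
    elif isinstance(coll, (list, tuple)):
        # Lists: the index is the integer position.
        yield from enumerate(coll)
    else:
        raise TypeError("unsupported collection type: %r" % type(coll))
```

Treating every container this way lets one comparison routine walk lists, dicts and sets without special cases for each.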