DictReader
-
class fuzzyfields.DictReader(iterable: Iterable, fields: Dict[str, fuzzyfields.fuzzyfield.FuzzyField] = None, *, errors: Union[str, Callable[Exception, Any]] = None, name_map: Dict[str, str] = None)
Generic iterable that takes an iterable of dicts as input, e.g. csv.DictReader, and for every input line yields a line that is filtered, validated and processed according to the input parameters.
- Parameters
iterable – an iterable object, e.g. csv.DictReader, that yields dicts of {field: value}.
fields – dict of instance-specific FuzzyField objects. Do not use this parameter to set fields that are known at the time of writing the code, which is the most common use case. Instead, create a subclass of DictReader and override the DictReader.fields class attribute.
errors –
One of:
- 'raise' (default): raise a ValidationError on the first invalid line
- 'critical', 'error', 'warning', 'info', 'debug': log the error with the matching function in logging and continue
- callable(ValidationError): invoke a custom callable and continue (unless it itself raises an Exception)
If errors != 'raise' and a FuzzyField raises an exception:
- if the field is required, the entire line is discarded
- otherwise, the field is replaced with its default value
As an alternative to passing this parameter, you may create a subclass of DictReader and override the DictReader.errors class attribute.
name_map (dict) – optional dict of {from name: to name} renames, where each pair performs a key replacement. As an alternative to passing this parameter, you may create a subclass of DictReader and override the DictReader.name_map class attribute.
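The three kinds of errors setting described above can be pictured as a small dispatch function. The following is a minimal standard-library sketch, not the library's implementation: handle_error, LOG_LEVELS and the local ValidationError class are hypothetical stand-ins introduced only for illustration.

```python
import logging

# Names of the logging functions accepted as an errors setting
LOG_LEVELS = ("critical", "error", "warning", "info", "debug")


class ValidationError(Exception):
    """Hypothetical stand-in for the library's ValidationError."""


def handle_error(errors, exc):
    """Apply an errors policy to a validation failure (illustrative only)."""
    if errors == "raise":
        # Default policy: stop at the first failure
        raise exc
    if errors in LOG_LEVELS:
        # Log with the matching function in the logging module, then continue
        getattr(logging, errors)("%s", exc)
        return
    if callable(errors):
        # Custom handler; processing continues unless the handler itself raises
        errors(exc)
        return
    raise ValueError(f"unsupported errors setting: {errors!r}")


collected = []
handle_error(collected.append, ValidationError("bad value"))
handle_error("warning", ValidationError("logged, then processing continues"))
```

Passing a list's append method as the callable is a convenient way to collect every failure for later inspection instead of stopping at the first one.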
-
__init__(iterable: Iterable, fields: Dict[str, fuzzyfields.fuzzyfield.FuzzyField] = None, *, errors: Union[str, Callable[Exception, Any]] = None, name_map: Dict[str, str] = None)
Build a new object.
-
classmethod __init_subclass__()
Executed whenever a subclass of the current class is defined. Sets FuzzyField.name and enriches the docstring of the subclass with the documentation of the fields.
-
__iter__()
Draw dicts from the underlying iterable and yield dicts of {field name: parsed value}.
-
__weakref__
list of weak references to the object (if defined)
-
errors = 'raise'
Class-level error handling setting. Can be overridden with an instance-specific value through the matching __init__ parameter.
-
fields = {}
Class-level map of {field name: FuzzyField}. Overriding this dict is the preferred way to add fields, as they will dynamically build Sphinx documentation. You may add instance-specific fields with the matching __init__ parameter. Override with an OrderedDict if you need the fields to be parsed in order (this is generally only necessary when one field defines the domain of another).
-
property line_num
Return the line number of the underlying file.
- Raises
AttributeError – if the underlying iterator is not a csv.reader(), csv.DictReader, or another duck-type compatible class
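The duck-typing behaviour described above can be illustrated with the standard csv module alone. This is a minimal sketch under the assumption that the property simply delegates to the wrapped object; LineNumProxy is a hypothetical class written for this example and is not part of the library.

```python
import csv
import io


class LineNumProxy:
    """Hypothetical wrapper used only for illustration."""

    def __init__(self, iterable):
        self.iterable = iterable

    @property
    def line_num(self):
        # Delegate to the wrapped object; an object without a line_num
        # attribute makes this lookup raise AttributeError
        return self.iterable.line_num


# csv.DictReader exposes line_num, so delegation works
reader = csv.DictReader(io.StringIO("name,qty\napple,3\n"))
proxy = LineNumProxy(reader)
first_row = next(reader)
line_after_first_row = proxy.line_num

# A plain list of dicts has no line_num attribute
try:
    LineNumProxy([{"name": "apple"}]).line_num
    list_has_line_num = True
except AttributeError:
    list_has_line_num = False
```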
-
name_map = {}
Class-level map of field renames. The keys in this dict must be a subset of the keys in the fields dict. You can add to this dict in an instance-specific way by setting the matching __init__ parameter.
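A key replacement driven by a {from name: to name} map can be sketched in a few lines. The helper rename_keys and the sample field names are hypothetical and only show the general idea of renaming dict keys; they do not reproduce the library's code or the exact point in the pipeline where renaming happens.

```python
def rename_keys(row, name_map):
    """Return a copy of row with keys renamed according to name_map.

    Keys that do not appear in name_map are kept unchanged.
    """
    return {name_map.get(key, key): value for key, value in row.items()}


row = {"Cust ID": "42", "amount": "3.5"}
renamed = rename_keys(row, {"Cust ID": "customer_id"})
```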
-
postprocess_row(row: Dict[str, Any]) → Dict[str, Any]
Give child classes an opportunity to post-process every row after it’s been parsed by the FuzzyFields. This allows handling special cases and performing cross-field validation.
- Parameters
row – The row as composed by the fields, after name mapping
- Returns
Modified row, or None if the row should be skipped
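As an illustration of cross-field validation, the sketch below follows the documented contract (take the parsed row, return a modified row or None to skip it). It is written as a plain function so that it runs without the library installed; in practice the same body would presumably live in a method override on a DictReader subclass. The field names start, end and duration are hypothetical.

```python
from typing import Any, Dict, Optional


def postprocess_row(row: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Skip rows whose end precedes their start; otherwise add a derived field."""
    if row["end"] < row["start"]:
        # Returning None signals that the row should be skipped
        return None
    row = dict(row)  # work on a copy so the caller's dict is left untouched
    row["duration"] = row["end"] - row["start"]
    return row


kept = postprocess_row({"start": 2, "end": 5})
skipped = postprocess_row({"start": 5, "end": 2})
```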
-
preprocess_row(row: Any) → Dict[str, Any]
Give child classes an opportunity to pre-process every row before feeding it to the FuzzyFields. This allows handling special cases.
You must use this method to manipulate the row if the underlying iterator does not natively yield dicts, e.g. a csv.reader() object.
- Parameters
row – The row as read by self.iterable, with all names and before name mapping
- Returns
modified row, or None if the row should be skipped
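When the underlying iterator yields lists, as csv.reader() does, the pre-processing step has to turn each list into a dict. The sketch below shows one way this could look, again as a plain function rather than a method override so that it runs with the standard library only. The HEADER tuple and the choice to skip blank lines are assumptions made for this example.

```python
import csv
import io

# Hypothetical column names for the positional values yielded by csv.reader
HEADER = ("name", "qty")


def preprocess_row(row):
    """Convert a list of values into a dict, or return None to skip the row."""
    if not row:
        # csv.reader yields an empty list for a blank line; skip it
        return None
    return dict(zip(HEADER, row))


source = io.StringIO("apple,3\n\npear,5\n")
rows = [preprocess_row(raw) for raw in csv.reader(source)]
```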
-
record_num = None
Current record (counting from 0), or -1 if the iteration hasn’t started yet.