Fields

class fuzzyfields.String(*, required: bool = True, default: Any = None, description: str = None, unique: bool = False)

Any string value

class fuzzyfields.RegEx(pattern: str, **kwargs)

Validate an input string against a regular expression

Parameters
  • pattern (str) – regular expression pattern string

  • kwargs – parameters to be passed to FuzzyField
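The text above does not say whether the pattern must match the whole input or only its beginning, so it may be safest to anchor patterns explicitly. The sketch below uses only the standard `re` module, not fuzzyfields, to show why this matters; the pattern is an arbitrary illustration.

```python
import re

pattern = r"[A-Z]{3}"

# re.match only anchors at the start, so trailing characters are tolerated
prefix_ok = re.match(pattern, "ABCD") is not None

# re.fullmatch requires the whole string to match the pattern
whole_ok = re.fullmatch(pattern, "ABCD") is not None

# Explicit anchors give the same result whichever function is used
anchored_ok = re.match(r"^[A-Z]{3}$", "ABCD") is not None
```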

class fuzzyfields.ISOCodeAlpha(chars: int = 3, **kwargs)

Letters-only ISO code, e.g. for country or currency. Case insensitive (it will be converted to uppercase).

Parameters
  • chars (int) – Number of characters of the code (default: 3)

  • kwargs – parameters to be passed to FuzzyField
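A minimal sketch of the rule described above, written with the standard library only. It is not the library's implementation: the helper name `normalise_iso_code` is hypothetical, and it raises a plain `ValueError` rather than a fuzzyfields exception.

```python
def normalise_iso_code(value: str, chars: int = 3) -> str:
    """Return the code in uppercase, or raise ValueError if it is malformed."""
    code = value.upper()
    # Accept ASCII letters only, and exactly `chars` of them
    if len(code) != chars or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected {chars} letters, got {value!r}")
    return code
```

For example, `normalise_iso_code("usd")` returns `"USD"`, while `"US1"` is rejected.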

class fuzzyfields.Boolean(*, required: bool = True, default: Any = None, description: str = None, unique: bool = False)

A boolean, any string representation of false/true or no/yes, or 0/1.
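The kind of conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper `parse_bool` is hypothetical, and the exact set of strings the library accepts (and its handling of case and whitespace) is assumed here.

```python
TRUE_STRINGS = {"true", "yes", "1"}
FALSE_STRINGS = {"false", "no", "0"}

def parse_bool(value) -> bool:
    """Convert a bool, 0/1, or a true/false/yes/no string to bool."""
    if isinstance(value, bool):
        return value
    # Assumption: matching ignores case and surrounding whitespace
    text = str(value).strip().lower()
    if text in TRUE_STRINGS:
        return True
    if text in FALSE_STRINGS:
        return False
    raise ValueError(f"not a boolean: {value!r}")
```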

class fuzzyfields.Domain(choices: Iterable, *, case_sensitive: bool = True, passthrough: bool = False, **kwargs)

A field which can only accept a specific set of values

Parameters
  • choices – collection of acceptable values. The default value need not be included.

  • case_sensitive (bool) – If False, ignore case when validating string input; the output will be converted to the case listed in choices.

  • passthrough (bool) –

    If True, store the choices object by reference and assume it may change after this field has been initialised. Any change will be reflected the next time a value is parsed.

    Example:

    v1 = String(unique=True)
    v2 = Domain(v1.seen_values, passthrough=True)

    In the above example, the field v2 only accepts values that have already appeared for the field v1.

    passthrough comes with a performance cost; set it to False (the default) to allow for optimisations. This assumes that neither the choices collection nor the objects it contains will change in the future.

  • kwargs – extra parameters for FuzzyField
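The difference between the two passthrough modes comes down to holding a reference versus taking a snapshot. The sketch below illustrates that idea with plain Python sets; it does not use fuzzyfields, and the use of `frozenset` for the snapshot is an assumption about what an optimisation might look like, not a description of the library's internals.

```python
seen_ids = set()

# passthrough=False, sketched: copy the choices once, at initialisation
snapshot = frozenset(seen_ids)

# passthrough=True, sketched: keep a reference to the live collection
live = seen_ids

# The collection changes after initialisation
seen_ids.add("A1")

in_snapshot = "A1" in snapshot  # the snapshot never sees the new value
in_live = "A1" in live          # the reference does
```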

class fuzzyfields.Float(*, min_value: Union[int, float] = -inf, max_value: Union[int, float] = inf, allow_min: bool = True, allow_max: bool = True, allow_zero: bool = True, default: Any = nan, **kwargs)

Convert a string representing a number, an int, or other numeric types (e.g. numpy.float64) to float.

Parameters
  • default – Default value. Unlike in all other FuzzyFields, if omitted it is NaN instead of None.

  • min_value – Minimum allowable value. Omit for no minimum.

  • max_value – Maximum allowable value. Omit for no maximum.

  • allow_min (bool) – If True, test that value >= min_value, otherwise value > min_value

  • allow_max (bool) – If True, test that value <= max_value, otherwise value < max_value

  • allow_zero (bool) – If False, test that value != 0

  • kwargs (dict) – parameters to be passed to FuzzyField
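The boundary rules listed above can be sketched as a standalone function. This is an illustration only: `check_range` is a hypothetical helper, it raises `ValueError` rather than a fuzzyfields exception, and it ignores the handling of empty or NaN input.

```python
import math

def check_range(value, *, min_value=-math.inf, max_value=math.inf,
                allow_min=True, allow_max=True, allow_zero=True) -> float:
    """Convert to float and apply the min/max/zero tests."""
    number = float(value)
    # allow_min=True means value >= min_value, otherwise value > min_value
    if number < min_value or (number == min_value and not allow_min):
        raise ValueError(f"{number} is below the minimum")
    # allow_max=True means value <= max_value, otherwise value < max_value
    if number > max_value or (number == max_value and not allow_max):
        raise ValueError(f"{number} is above the maximum")
    if number == 0 and not allow_zero:
        raise ValueError("zero is not allowed")
    return number
```

With the default bounds, `allow_max=False` rejects `inf` because the test becomes value < inf.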

class fuzzyfields.Decimal(*, default: Any = Decimal('NaN'), **kwargs)

Convert a number or a string representation of a number to Decimal, which is much slower and heavier than float but avoids binary floating-point representation errors (e.g. 3.1 becoming 3.0999999).
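The trade-off can be seen with the standard `decimal` module alone. The snippet below does not use fuzzyfields; it only shows the floating-point behaviour that motivates a Decimal field.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts
float_sum = 0.1 + 0.2

# Decimals built from strings keep the written digits exactly
decimal_sum = Decimal("0.1") + Decimal("0.2")

float_is_exact = float_sum == 0.3
decimal_is_exact = decimal_sum == Decimal("0.3")
```

Note that the Decimal values are built from strings; building them from floats would carry the binary approximation over.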

class fuzzyfields.Integer(*, min_value: Union[int, float] = -inf, max_value: Union[int, float] = inf, allow_min: bool = True, allow_max: bool = True, allow_zero: bool = True, default: Any = nan, **kwargs)

Whole number.

Valid values are:

  • anything that is parsed by the int constructor.

  • floats whose decimal part consists only of zeros (e.g. 1.0000)

  • scientific notation, as long as there are no digits below 10^0 (e.g. 1.23e2)

Note

inf and -inf are valid inputs, but in these cases the output will be of type float. To reject them, use:

  • min_value=-math.inf, allow_min=False

  • max_value=math.inf, allow_max=False

NaN is treated as an empty cell, so it is accepted if required=False. In that case validation returns the default, which is math.nan unless overridden; this is a third case where the output is a float rather than an int.

Raises

MalformedFieldError – if the number can’t be cast to int without losing precision
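The acceptance rules above can be sketched with the standard `decimal` module. This is not the library's implementation: `parse_whole_number` is a hypothetical helper, it raises `ValueError` in place of MalformedFieldError, and it rejects NaN outright instead of applying the required/default logic.

```python
from decimal import Decimal, InvalidOperation

def parse_whole_number(value):
    """Return an int, or a float for inf/-inf; raise ValueError otherwise."""
    if isinstance(value, int):
        return value
    try:
        number = Decimal(str(value).strip())
    except InvalidOperation:
        raise ValueError(f"not a number: {value!r}") from None
    if number.is_infinite():
        # inf and -inf are accepted, but can only be returned as float
        return float(number)
    if number.is_nan():
        raise ValueError("NaN handling is out of scope for this sketch")
    # Reject anything that would lose precision when cast to int
    if number != number.to_integral_value():
        raise ValueError(f"not a whole number: {value!r}")
    return int(number)
```

Here `"1.0000"` and `"1.23e2"` yield 1 and 123, while `"1.5"` and `"1.234e2"` are rejected.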

class fuzzyfields.Percentage(*, min_value: Union[int, float] = -inf, max_value: Union[int, float] = inf, allow_min: bool = True, allow_max: bool = True, allow_zero: bool = True, default: Any = nan, **kwargs)

Percentage, e.g. 5% or .05

Warning

There’s nothing stopping somebody from writing “35” where it should have been either “35%” or “0.35”. If this field receives “35”, it will return 35.0, i.e. 3500%. Use the min_value and max_value parameters (see Float) to prevent this kind of incident. Still, nothing will protect you from a “1”, which will be converted to 1.00 when the author of the input may have meant 0.01.
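The guard suggested in the warning can be sketched as below. This is a standalone illustration, not fuzzyfields code: `parse_percentage` is a hypothetical helper, the 0 to 1 bounds are an example choice, and errors are plain `ValueError`.

```python
def parse_percentage(text: str, *, min_value: float = 0.0,
                     max_value: float = 1.0) -> float:
    """Parse '5%' or '.05' into 0.05 and reject out-of-range values."""
    text = text.strip()
    if text.endswith("%"):
        value = float(text[:-1]) / 100
    else:
        value = float(text)
    # A bare "35" becomes 35.0 (3500%) and is caught by the upper bound
    if not min_value <= value <= max_value:
        raise ValueError(f"{text!r} is outside [{min_value}, {max_value}]")
    return value
```

As the warning says, a bare `"1"` still passes this check and is returned as 1.0.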

class fuzzyfields.Timestamp(*, output: str = 'pandas', required: bool = True, default=None, description: str = None, unique: bool = False, **kwargs)

Parse and check various date and time formats

Note

This field requires pandas.

Parameters
  • output (str) –

    Format of the output value. Possible values are:

    'pandas' (default)

    return type is pandas.Timestamp

    Warning

    This format is limited to the period between 1677-09-22 and 2262-04-11; see the pandas documentation. Timestamps outside this range are automatically coerced to the nearest edge of the range.

    'datetime'

    return type is datetime.datetime

    'numpy'

    return type is numpy.datetime64

    any other string

    interpreted as a format string for pandas.Timestamp.strftime(); e.g. %Y/%m/%d will produce a string in the form YYYY/MM/DD.

  • required (bool) – See FuzzyField

  • default – See FuzzyField

  • description (str) – See FuzzyField

  • unique (bool) – See FuzzyField

  • kwargs

    Parameters to be passed to pandas.to_datetime().

    Note

    The default is to set dayfirst=True, meaning that in case of ambiguity this function will choose the European format DD/MM/YYYY, whereas the default for pandas.to_datetime() is dayfirst=False (American format MM/DD/YYYY).
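The ambiguity that dayfirst resolves can be illustrated with the standard library alone. The sketch below uses `datetime.strptime` with explicit formats instead of pandas.to_datetime(), so it only mimics the two interpretations; the final line shows the kind of string a format-style output value would produce.

```python
from datetime import datetime

ambiguous = "01/02/2020"

# European reading (what dayfirst=True selects): 1 February 2020
day_first = datetime.strptime(ambiguous, "%d/%m/%Y")

# American reading (what dayfirst=False selects): 2 January 2020
month_first = datetime.strptime(ambiguous, "%m/%d/%Y")

# strftime-style output format, as in output='%Y/%m/%d'
formatted = day_first.strftime("%Y/%m/%d")
```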