unary_operations
Unary (one reference argument) transformations.
This module contains the base class and concrete classes for unary transformations, those that take a single reference argument (i.e. a column or transformation name).
Classes
InclusionTransformation
class InclusionTransformation(*, name: str = None, output: bool = False, arg: str, in_str: str):
Represents the test for substring inclusion in a column's entries.
Checks whether in_str (the test string) is in the elements of arg (the column).
Arguments
arg
: The argument to the transformation as a string.
in_str
: The string to test for inclusion.
name
: The name of the transformation. If not provided a unique name will be generated from the class name.
output
: Whether or not this transformation should be included in the final output. Defaults to False.
Raises
TransformationRegistryError
: If the transformation name is already in use.
TransformationRegistryError
: If the transformation name hasn't been provided and the transformation is not registered.
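The inclusion test described above can be illustrated without the library. The following is a minimal sketch in plain Python, assuming the column is a sequence of strings; `inclusion_flags` and `city_column` are hypothetical names used only for illustration and are not part of this module.

```python
from typing import List, Sequence


def inclusion_flags(column: Sequence[str], in_str: str) -> List[bool]:
    """Return one boolean per entry: True where `in_str` occurs in the entry."""
    # Hypothetical helper for illustration; not part of the documented module.
    return [in_str in entry for entry in column]


# Hypothetical column of string entries.
city_column = ["New York", "Newcastle", "London"]
flags = inclusion_flags(city_column, in_str="New")
print(flags)  # [True, True, False]
```

The sketch only shows the per-entry substring check; how the real transformation is applied to a dataset is not covered here.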
Method generated by attrs for class InclusionTransformation.
Variables
- static
in_str : str
Static methods
schema
def schema() ‑> marshmallow.schema.Schema:
Gets an instance of the Schema associated with this Transformation.
Raises
TypeError
: If the transformation doesn't have a TransformationSchema as the schema.
OneHotEncodingTransformation
class OneHotEncodingTransformation(*, name: str = None, output: bool = False, arg: str, unknown_suffix: str = 'UNKNOWN', raw_values: Union[list[Any], dict[Any, Optional[str]]]):
One hot encoding transformation.
Represents the transformation of a column into a series of one-hot encoded columns.
Arguments
arg
: Column or transformation reference to one-hot encode.
name
: The name of the transformation. If not provided a unique name will be generated from the class name.
output
: Whether or not this transformation should be included in the final output. Defaults to False.
unknown_suffix
: The suffix to use to create a column for encoding unknown values. The column will be created as {name}_{unknown_suffix}. Default is "UNKNOWN".
values
: Column values that should be one-hot encoded. This can either be a list of values, in which case the one-hot encoding will produce columns named {name}_{value}, or a dictionary of values to desired column suffixes, in which case the encoding will use those suffixes (if an entry in the dictionary maps to None, the column name will be generated in the same way as described above).
  If name is not set, the column or transformation reference from arg will be used instead.
  Any value found in the column which is not enumerated in this argument will be encoded in an {name}_{unknown_suffix} column. This column is therefore protected and any value or value-column mapping that could clash will raise ValueError. If you need to encode such a value, unknown_suffix must be changed.
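The column naming rules described for values can be sketched as follows. This is an illustrative reimplementation in plain Python, not the library's code: `one_hot_column_names` and its parameters are hypothetical, and the error messages are placeholders.

```python
from typing import Any, Dict, List, Optional, Union


def one_hot_column_names(
    prefix: str,
    values: Union[List[Any], Dict[Any, Optional[str]]],
    unknown_suffix: str = "UNKNOWN",
) -> Dict[Any, str]:
    """Map each listed value to its output column name (hypothetical helper)."""
    if not values:
        raise ValueError("No values were provided.")
    # Normalise a plain list into a value -> suffix mapping with no custom suffixes.
    if not isinstance(values, dict):
        values = {value: None for value in values}
    unknown_col = f"{prefix}_{unknown_suffix}"
    columns: Dict[Any, str] = {}
    for value, suffix in values.items():
        # A None suffix falls back to the value itself, as for list input.
        column = f"{prefix}_{value if suffix is None else suffix}"
        # The unknown column is protected, and generated names must be unique.
        if column == unknown_col or column in columns.values():
            raise ValueError(f"Column name clash: {column}")
        columns[value] = column
    return columns


print(one_hot_column_names("colour", ["red", "blue"]))
# {'red': 'colour_red', 'blue': 'colour_blue'}
print(one_hot_column_names("colour", {"red": "R", "blue": None}))
# {'red': 'colour_R', 'blue': 'colour_blue'}
```

Under these assumptions, passing a value such as "UNKNOWN" with the default suffix raises ValueError, which matches the documented advice to change unknown_suffix when such a value must be encoded.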
Raises
TransformationRegistryError
: If the transformation name is already in use.
TransformationRegistryError
: If the transformation name hasn't been provided and the transformation is not registered.
ValueError
: If any name in values would cause a clash with the unknown value column created by unknown_suffix or with another generated column.
ValueError
: If no values were provided.
ValueError
: If no name is provided and the reference in arg cannot be found.
Method generated by attrs for class OneHotEncodingTransformation.
Variables
- static
unknown_suffix : str
- static
values : dict[typing.Any, str]
columns : list[str]
- Lists the columns that will be output.
prefix
- Uses name as the prefix, or extracts it from arg (which should be a column or transformation reference).
unknown_col : str
- Returns the name of the column that unknown values are encoded to.
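To show how a value-to-column mapping and the unknown column fit together, here is a minimal sketch of the encoding step itself. It is written in plain Python under stated assumptions: `one_hot_encode` is a hypothetical helper, rows are represented as dictionaries, and placing the unknown column last is an arbitrary choice made for the example.

```python
from typing import Any, Dict, List


def one_hot_encode(
    entries: List[Any], value_columns: Dict[Any, str], unknown_col: str
) -> List[Dict[str, int]]:
    """One-hot encode `entries`, sending unlisted values to `unknown_col`."""
    all_columns = list(value_columns.values()) + [unknown_col]
    rows = []
    for entry in entries:
        row = {column: 0 for column in all_columns}
        # Values not enumerated in the mapping fall through to the unknown column.
        row[value_columns.get(entry, unknown_col)] = 1
        rows.append(row)
    return rows


# Hypothetical mapping from values to generated column names.
value_columns = {"red": "colour_red", "blue": "colour_blue"}
rows = one_hot_encode(["red", "green"], value_columns, "colour_UNKNOWN")
print(rows[0])  # {'colour_red': 1, 'colour_blue': 0, 'colour_UNKNOWN': 0}
print(rows[1])  # {'colour_red': 0, 'colour_blue': 0, 'colour_UNKNOWN': 1}
```

Each row has exactly one column set to 1, and "green", which is not enumerated, lands in the unknown column.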
Static methods
schema
def schema() ‑> marshmallow.schema.Schema:
Gets an instance of the Schema associated with this Transformation.
Raises
TypeError
: If the transformation doesn't have a TransformationSchema as the schema.
StringUnaryOperation
class StringUnaryOperation(*, name: str = None, output: bool = False, arg: str):
This class represents any UnaryOperation where arg can only be a string.
Arguments
arg
: The argument to the transformation as a string.
name
: The name of the transformation. If not provided a unique name will be generated from the class name.
output
: Whether or not this transformation should be included in the final output. Defaults to False.
Raises
TransformationRegistryError
: If the transformation name is already in use.
TransformationRegistryError
: If the transformation name hasn't been provided and the transformation is not registered.
Method generated by attrs for class StringUnaryOperation.
Variables
- static
arg : str
Static methods
schema
def schema() ‑> marshmallow.schema.Schema:
Gets an instance of the Schema associated with this Transformation.
Raises
TypeError
: If the transformation doesn't have a TransformationSchema as the schema.
UnaryOperation
class UnaryOperation(*, name: str = None, output: bool = False, arg: Any):
The base abstract class for all Unary Operation Transformations.
Arguments
arg
: The argument to the transformation.
name
: The name of the transformation. If not provided a unique name will be generated from the class name.
output
: Whether or not this transformation should be included in the final output. Defaults to False.
Raises
TransformationRegistryError
: If the transformation name is already in use.
TransformationRegistryError
: If the transformation name hasn't been provided and the transformation is not registered.
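The naming behaviour described above (a unique name generated from the class name when none is given, and an error when a name is already in use) can be sketched as follows. This is an assumption-laden illustration, not the library's registry: the `register_name` helper, the module-level set, the counter-based naming scheme, and the stand-in exception class are all hypothetical.

```python
import itertools
from typing import Optional, Set


class TransformationRegistryError(Exception):
    """Stand-in for the library's error of the same name."""


_registered_names: Set[str] = set()
_counter = itertools.count(1)


def register_name(class_name: str, name: Optional[str] = None) -> str:
    """Reserve `name`, or generate a unique one from `class_name`."""
    if name is None:
        # Assumed scheme: lower-cased class name plus a running counter.
        name = f"{class_name.lower()}_{next(_counter)}"
        while name in _registered_names:
            name = f"{class_name.lower()}_{next(_counter)}"
    elif name in _registered_names:
        raise TransformationRegistryError(f"Name already in use: {name}")
    _registered_names.add(name)
    return name
```

With this sketch, calling `register_name("UnaryOperation", "my_op")` twice raises on the second call, while repeated calls without a name each return a distinct generated name.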
Method generated by attrs for class UnaryOperation.
Variables
- static
arg : Any
Static methods
schema
def schema() ‑> marshmallow.schema.Schema:
Gets an instance of the Schema associated with this Transformation.
Raises
TypeError
: If the transformation doesn't have a TransformationSchema as the schema.