utils

Utility functions for HuggingFace data.

Module

Functions

get_data_factory_dataset

def get_data_factory_dataset(
    datasource: BaseSource,
    data_split: DataSplit,
    selected_cols: list[str],
    selected_cols_semantic_types: Mapping[_SemanticTypeValue, list[str]],
    batch_transforms: Optional[list[dict[str, _JSONDict]]],
    labels2id: Optional[dict[str, int]] = None,
    target: Optional[Union[str, list[str]]] = None,
) -> tuple[_BaseHuggingFaceDataFactory, Union[_IterableHuggingFaceDataset, _HuggingFaceDataset]]:

Get the HuggingFace data factory and dataset for the given datasource.
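
The signature suggests that `selected_cols_semantic_types` groups column names by semantic type and that `labels2id` maps label strings to integer ids. The sketch below only illustrates how such arguments might be prepared before the call. It does not call the library: the helper names `build_labels2id` and `find_untyped_cols`, the column names, and the semantic-type keys (`"continuous"`, `"categorical"`, `"image"`) are assumptions made for illustration and are not taken from this page.

```python
from typing import Mapping


def build_labels2id(labels: list[str]) -> dict[str, int]:
    """Hypothetical helper: give each distinct label an integer id in first-seen order."""
    labels2id: dict[str, int] = {}
    for label in labels:
        # setdefault leaves an existing id untouched, so repeats are ignored.
        labels2id.setdefault(label, len(labels2id))
    return labels2id


def find_untyped_cols(
    selected_cols: list[str],
    selected_cols_semantic_types: Mapping[str, list[str]],
) -> list[str]:
    """Hypothetical helper: list selected columns that appear under no semantic type."""
    typed = {col for cols in selected_cols_semantic_types.values() for col in cols}
    return [col for col in selected_cols if col not in typed]


# Illustrative values only; the real column names depend on the datasource.
selected_cols = ["age", "diagnosis", "scan_path"]
selected_cols_semantic_types = {
    "continuous": ["age"],
    "categorical": ["diagnosis"],
    "image": ["scan_path"],
}
labels2id = build_labels2id(["healthy", "disease", "healthy"])
untyped = find_untyped_cols(selected_cols, selected_cols_semantic_types)
```

With arguments prepared along these lines, the two returned values would be unpacked as a `(data_factory, dataset)` pair, matching the tuple in the return annotation.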