
hugging_face_text_classification

Hugging Face Text Classification Algorithm.

Module

Functions

sigmoid

def sigmoid(_outputs: np.ndarray) -> torch.Tensor:

Sigmoid function for output postprocessing.

softmax

def softmax(_outputs: np.ndarray) -> torch.Tensor:

Softmax function for output postprocessing.
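For orientation, the two postprocessing functions can be sketched with the standard library alone. This is a minimal illustration of the textbook definitions, not the module's implementation: the names `sigmoid_sketch` and `softmax_sketch` are hypothetical, and the sketch works on plain lists of floats rather than the `np.ndarray` input and `torch.Tensor` output shown in the signatures above.

```python
import math
from typing import List


def sigmoid_sketch(outputs: List[float]) -> List[float]:
    """Element-wise logistic function, 1 / (1 + exp(-x)).

    Fine for moderate logits; extremely negative inputs would overflow exp().
    """
    return [1.0 / (1.0 + math.exp(-x)) for x in outputs]


def softmax_sketch(outputs: List[float]) -> List[float]:
    """Softmax over one row of logits.

    Subtracting the largest logit first keeps exp() from overflowing
    and does not change the result.
    """
    peak = max(outputs)
    exps = [math.exp(x - peak) for x in outputs]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]
probs = softmax_sketch(logits)  # scores sum to 1, largest logit ranks first
```

Sigmoid scores each output independently, which suits a single-label model, while softmax normalises across outputs so the scores of several labels sum to 1.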

Classes

HuggingFaceTextClassificationInference

class HuggingFaceTextClassificationInference(
    model_id: str,
    target_column_name: str,
    batch_size: int = 1,
    function_to_apply: Optional[_FunctionToApply] = None,
    seed: int = 42,
    top_k: int = 1,
):

Inference for pre-trained Hugging Face text classification models.

Arguments

  • batch_size: The batch size for inference. Defaults to 1.
  • function_to_apply: The function applied to the model outputs to retrieve the scores. Defaults to None, in which case the function is chosen according to the number of labels: sigmoid if the model has a single label, softmax if it has several. Otherwise, one of the following values:
      • "sigmoid": Applies the sigmoid function to the output.
      • "softmax": Applies the softmax function to the output.
      • "none": Does not apply any function to the output.
  • model_id: The model id to use for text classification inference. The model id is of a pretrained model hosted inside a model repo on huggingface.co. Accepts resnet models.
  • seed: The random seed used by the algorithm. Defaults to 42 for reproducible behavior.
  • target_column_name: The target column on which the inference should be done.
  • top_k: The number of top labels that will be returned by the pipeline. Defaults to 1.
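The interaction between `function_to_apply` and `top_k` described above can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: `postprocess_sketch` and its `labels` parameter are hypothetical, and the `{"label": ..., "score": ...}` result shape is assumed for the example.

```python
import math
from typing import Dict, List, Optional, Union


def postprocess_sketch(
    logits: List[float],
    labels: List[str],
    function_to_apply: Optional[str] = None,
    top_k: int = 1,
) -> List[Dict[str, Union[str, float]]]:
    """Turn raw logits into the top_k (label, score) results."""
    # Default rule from the argument description: one label -> sigmoid,
    # several labels -> softmax.
    if function_to_apply is None:
        function_to_apply = "sigmoid" if len(labels) == 1 else "softmax"

    if function_to_apply == "sigmoid":
        scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    elif function_to_apply == "softmax":
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        scores = [e / total for e in exps]
    elif function_to_apply == "none":
        scores = list(logits)  # raw outputs, left untouched
    else:
        raise ValueError(f"Unsupported function_to_apply: {function_to_apply!r}")

    # Highest score first, then keep only the top_k entries.
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "score": score} for label, score in ranked[:top_k]]


top_two = postprocess_sketch([0.2, 1.5, -0.3], ["neg", "pos", "neu"], top_k=2)
single = postprocess_sketch([0.0], ["spam"])
raw = postprocess_sketch([0.2, 1.5], ["a", "b"], function_to_apply="none")
```

With the default `top_k=1`, only the best-scoring label is returned for each input.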

Attributes

  • batch_size: The batch size for inference. Defaults to 1.
  • class_name: The name of the algorithm class.
  • fields_dict: A dictionary mapping each attribute that will be serialized in the class to its marshmallow field type (e.g. fields_dict = {"class_name": fields.Str()}).
  • function_to_apply: The function applied to the model outputs to retrieve the scores. Defaults to None, in which case the function is chosen according to the number of labels: sigmoid if the model has a single label, softmax if it has several. Otherwise, one of the following values:
      • "sigmoid": Applies the sigmoid function to the output.
      • "softmax": Applies the softmax function to the output.
      • "none": Does not apply any function to the output.
  • model_id: The model id to use for text classification inference. The model id is of a pretrained model hosted inside a model repo on huggingface.co. Accepts resnet models.
  • nested_fields: A dictionary mapping each nested attribute to a registry that maps class names to the respective classes (e.g. nested_fields = {"datastructure": datastructure.registry}).
  • seed: The random seed used by the algorithm. Defaults to 42 for reproducible behavior.
  • target_column_name: The target column on which the inference should be done.
  • top_k: The number of top labels that will be returned by the pipeline. Defaults to 1.

Ancestors

Variables

Methods


create

def create(self, role: Union[str, Role], **kwargs: Any) -> Any:

Create an instance representing the role specified.

modeller

def modeller(
    self,
    **kwargs: Any,
) -> bitfount.federated.algorithms.hugging_face_algorithms.base._HFModellerSide:

Returns the modeller side of the HuggingFaceTextClassificationInference algorithm.

worker

def worker(
    self,
    **kwargs: Any,
) -> bitfount.federated.algorithms.hugging_face_algorithms.hugging_face_text_classification._WorkerSide:

Returns the worker side of the HuggingFaceTextClassificationInference algorithm.
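One way to picture how `create`, `modeller` and `worker` relate is a small role-dispatch sketch. Everything here is a hypothetical stand-in: the `Role` enum, its values and `AlgorithmFactorySketch` are defined only for this example, and the assumption that `create` simply forwards to `modeller` or `worker` is not confirmed by this page.

```python
from enum import Enum
from typing import Any, Tuple, Union


class Role(Enum):
    """Hypothetical stand-in for the library's Role type."""

    MODELLER = "modeller"
    WORKER = "worker"


class AlgorithmFactorySketch:
    """Toy factory; the real methods return role-specific side objects."""

    def modeller(self, **kwargs: Any) -> Tuple[str, dict]:
        return ("modeller-side", kwargs)

    def worker(self, **kwargs: Any) -> Tuple[str, dict]:
        return ("worker-side", kwargs)

    def create(self, role: Union[str, Role], **kwargs: Any) -> Any:
        # Accept either the enum member or its string value.
        resolved = Role(role) if isinstance(role, str) else role
        if resolved is Role.MODELLER:
            return self.modeller(**kwargs)
        return self.worker(**kwargs)


factory = AlgorithmFactorySketch()
worker_side = factory.create("worker")
modeller_side = factory.create(Role.MODELLER, example_option=1)
```

The point of such a pattern is that calling code can hold one algorithm object and ask it for whichever side matches the role it is running as.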