algorithms
Algorithms for remote processing of data.
Federated algorithm plugins can also be imported from this package.
Module
Submodules
- bitfount.federated.algorithms.base - Base classes for all algorithms.
- bitfount.federated.algorithms.csv_report_algorithm - Algorithm for outputting results to CSV on the pod-side.
- bitfount.federated.algorithms.hugging_face_algorithms - Algorithms for remote Hugging Face models.
- bitfount.federated.algorithms.model_algorithms - Algorithms for remote/federated model training on data.
- bitfount.federated.algorithms.ophthalmology - Algorithm plugins in this package.
- bitfount.federated.algorithms.private_sql_query - Private SQL query algorithm.
- bitfount.federated.algorithms.sql_query - SQL query algorithm.
Classes
BaseAlgorithmFactory
class BaseAlgorithmFactory(**kwargs: Any):
Base algorithm factory from which all other algorithms must inherit.
Attributes
class_name
: The name of the algorithm class.
Subclasses
- BaseNonModelAlgorithmFactory
- bitfount.federated.algorithms.model_algorithms.base._BaseModelAlgorithmFactory
Variables
- static
fields_dict : ClassVar[dict[str, marshmallow.fields.Field]]
- static
nested_fields : ClassVar[dict[str, collections.abc.Mapping[str, Any]]]
BaseNonModelAlgorithmFactory
class BaseNonModelAlgorithmFactory(*, datastructure: DataStructure, **kwargs: Any):
Base factory for algorithms not involving an underlying model.
Arguments
datastructure
: The data structure to use for the algorithm.
**kwargs
: Additional keyword arguments.
Attributes
datastructure
: The data structure to use for the algorithm.
Ancestors
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Subclasses
- CSVReportAlgorithm
- HuggingFaceImageClassificationInference
- HuggingFaceImageSegmentationInference
- HuggingFacePerplexityEvaluation
- HuggingFaceTextClassificationInference
- HuggingFaceTextGenerationInference
- TIMMFineTuning
- TIMMInference
- CSVReportGeneratorOphthalmologyAlgorithm
- ETDRSAlgorithm
- FoveaCoordinatesAlgorithm
- GATrialCalculationAlgorithmBronze
- GATrialCalculationAlgorithmJade
- TrialInclusionCriteriaMatchAlgorithmAmethyst
- TrialInclusionCriteriaMatchAlgorithmBronze
- TrialInclusionCriteriaMatchAlgorithmJade
- GATrialPDFGeneratorAlgorithmAmethyst
- GATrialPDFGeneratorAlgorithmJade
- bitfount.federated.algorithms.ophthalmology.simple_csv_algorithm._SimpleCSVAlgorithm
- PrivateSqlQuery
- SqlQuery
Variables
- static
fields_dict : ClassVar[dict[str, marshmallow.fields.Field]]
- static
nested_fields : ClassVar[dict[str, collections.abc.Mapping[str, Any]]]
CSVReportAlgorithm
class CSVReportAlgorithm( datastructure: DataStructure, save_path: Optional[Union[str, os.PathLike]] = None, original_cols: Optional[list[str]] = None, filter: Optional[list[ColumnFilter]] = None, **kwargs: Any,):
Algorithm for generating the CSV results reports.
Arguments
datastructure
: The data structure to use for the algorithm.
save_path
: The folder path where the CSV report should be saved. The CSV report will have the same name as the task ID.
original_cols
: The tabular columns from the datasource to include in the report. If not specified, all tabular columns from the datasource are included.
filter
: A list of ColumnFilter instances on which to filter the data. Defaults to None. If supplied, columns are added to the output CSV indicating the records that match the specified criteria. If more than one ColumnFilter is given, an additional column is added to the output CSV indicating the datapoints that match all given criteria (as well as the individual matches).
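The effect of the filter argument can be sketched in plain Python. This is an illustration only, not Bitfount's implementation: the SimpleFilter dataclass, the add_match_columns helper, and the output column names (including matches_all_criteria) are hypothetical stand-ins for ColumnFilter and the worker-side logic.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical stand-in for ColumnFilter; the real class may differ.
@dataclass
class SimpleFilter:
    column: str
    op: Callable[[Any, Any], bool]
    value: Any

    @property
    def label(self) -> str:
        # Name of the match column added to the report (assumed format).
        return f"{self.column} {self.op.__name__} {self.value}"


def add_match_columns(
    rows: list[dict[str, Any]], filters: list[SimpleFilter]
) -> list[dict[str, Any]]:
    """Return copies of the rows with one boolean column per filter.

    When more than one filter is given, an extra column records whether
    the row satisfies every filter.
    """
    out = []
    for row in rows:
        new_row = dict(row)
        matches = [f.op(row[f.column], f.value) for f in filters]
        for f, matched in zip(filters, matches):
            new_row[f.label] = matched
        if len(filters) > 1:
            new_row["matches_all_criteria"] = all(matches)
        out.append(new_row)
    return out


rows = [{"age": 72, "ga_area": 3.1}, {"age": 48, "ga_area": 5.0}]
filters = [
    SimpleFilter("age", operator.ge, 50),
    SimpleFilter("ga_area", operator.gt, 2.5),
]
report = add_match_columns(rows, filters)
```

With a single filter, only that filter's match column is added; the combined column appears only when two or more filters are supplied.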
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.csv_report_algorithm._WorkerSide:
Worker-side of the algorithm.
CSVReportGeneratorOphthalmologyAlgorithm
class CSVReportGeneratorOphthalmologyAlgorithm( datastructure: DataStructure, save_path: Optional[Union[str, os.PathLike]] = None, trial_name: Optional[str] = None, rename_columns: Optional[Mapping[str, str]] = None, original_cols: Optional[list[str]] = None, filter: Optional[list[ColumnFilter]] = None, match_patient_visit: Optional[MatchPatientVisit] = None, matched_csv_path: Optional[Union[str, os.PathLike]] = None, produce_matched_only: bool = True, csv_extensions: Optional[list[str]] = None, produce_trial_notes_csv: bool = False, sorting_columns: Optional[dict[str, DFSortType]] = None, **kwargs: Any,):
Algorithm for generating the CSV results reports.
Arguments
datastructure
: The data structure to use for the algorithm.
save_path
: The folder path where the CSV report should be saved.
trial_name
: The name of the trial for the CSV report. If provided, the CSV will be saved as "trial_name"-prescreening-patients-"date".csv. Defaults to None.
original_cols
: The tabular columns from the datasource to include in the report. If not specified, all tabular columns from the datasource are included.
rename_columns
: A dictionary of columns to rename. Defaults to None.
filter
: A list of ColumnFilter instances on which to filter the data. Defaults to None. If supplied, columns are added to the output CSV indicating the records that match the specified criteria. If more than one ColumnFilter is given, an additional column is added to the output CSV indicating the datapoints that match all given criteria (as well as the individual matches).
match_patient_visit
: Used for matching the same patient visit.
matched_csv_path
: Path to save the matched patients CSV to, if requested. Defaults to save_path (i.e. overwrites the non-matched CSV) if produce_matched_only is True. Otherwise, a file is created based on the save_path argument.
produce_matched_only
: If True, only the matched CSV will be generated at the end of the run. If False, both the non-matched and matched CSVs will be generated.
produce_trial_notes_csv
: If True, a CSV file containing the trial notes will be generated at the end of the run. Defaults to False.
csv_extensions
: List of named CSV extension functions that will be applied to the output CSV just before saving to file.
sorting_columns
: A dictionary of columns to sort the output CSV by. The keys are column names; the values are either 'asc' or 'desc'. Defaults to None.
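A sorting_columns mapping of column names to 'asc' or 'desc' can be read as a multi-key sort. The sketch below shows one way to apply such a mapping to a list of rows in plain Python. The sort_rows helper is hypothetical, and treating earlier dictionary keys as more significant is an assumption; the library's actual behaviour may differ.

```python
from typing import Any, Mapping


def sort_rows(
    rows: list[dict[str, Any]], sorting_columns: Mapping[str, str]
) -> list[dict[str, Any]]:
    """Sort rows by several columns, each either 'asc' or 'desc'.

    Assumption: earlier keys in the mapping take priority over later ones.
    """
    for column, direction in sorting_columns.items():
        if direction not in ("asc", "desc"):
            raise ValueError(f"Unknown sort direction for {column!r}: {direction!r}")

    result = list(rows)
    # list.sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a correct multi-key sort.
    for column, direction in reversed(list(sorting_columns.items())):
        result.sort(key=lambda row: row[column], reverse=(direction == "desc"))
    return result


rows = [
    {"patient": "b", "age": 70},
    {"patient": "a", "age": 55},
    {"patient": "c", "age": 70},
]
sorted_rows = sort_rows(rows, {"age": "desc", "patient": "asc"})
```

Here the oldest patients come first, and patients of equal age are ordered alphabetically.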
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.ophthalmology.csv_report_generation_ophth_algorithm._WorkerSide:
Worker-side of the algorithm.
ETDRSAlgorithm
class ETDRSAlgorithm( datastructure: DataStructure, laterality: str, slo_photo_location_prefixes: Optional[SLOSegmentationLocationPrefix] = None, slo_image_metadata_columns: Optional[SLOImageMetadataColumns] = None, oct_image_metadata_columns: Optional[OCTImageMetadataColumns] = None, threshold: float = 0.7, calculate_on_oct: bool = False, slo_mm_width: float = 8.8, slo_mm_height: float = 8.8, **kwargs: Any,):
Algorithm for computing ETDRS subfields.
Arguments
datastructure
: The data structure to use for the algorithm.
laterality
: The name of the column that contains the laterality of the scans.
oct_image_metadata_columns
: A list of column names for the OCT image. Should include the width and depth size in mm. Defaults to None.
slo_photo_location_prefixes
: The list of column names for the locations of the OCT segmentation on the SLO. Should include the location and end of the first image on both the x- and y-axis, as well as the start location of the last image on both the x- and y-axis. Defaults to None.
slo_image_metadata_columns
: A list of column names for the SLO image. Should include the width and height in mm. Defaults to None.
threshold
: The threshold for the segmentation. Defaults to 0.7.
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.ophthalmology.etdrs_calculation_algorithm._WorkerSide:
Worker-side of the algorithm.
FederatedModelTraining
class FederatedModelTraining( *, model: _DistributedModelTypeOrReference, modeller_checkpointing: bool = True, checkpoint_filename: Optional[str] = None, pretrained_file: Optional[Union[str, os.PathLike]] = None, project_id: Optional[str] = None,):
Algorithm for training a model remotely and returning its updated parameters.
This algorithm is designed to be compatible with the FederatedAveraging
protocol.
Arguments
model
: The model to train on remote data.
pretrained_file
: A file path or a string containing a pre-trained model. Defaults to None.
Attributes
checkpoint_filename
: The filename for the last checkpoint. Defaults to the task id and the last iteration number, i.e., {taskid}-iteration-{iteration_number}.pt.
class_name
: The name of the algorithm class.
fields_dict
: A dictionary mapping all attributes that will be serialized in the class to their marshmallow field type (e.g. fields_dict = {"class_name": fields.Str()}).
model
: The model to train on remote data.
modeller_checkpointing
: Whether to save the last checkpoint on the modeller side. Defaults to True.
nested_fields
: A dictionary mapping all nested attributes to a registry that contains class names mapped to the respective classes (e.g. nested_fields = {"datastructure": datastructure.registry}).
pretrained_file
: A file path or a string containing a pre-trained model. Defaults to None.
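The default checkpoint filename pattern described for checkpoint_filename can be illustrated with a small helper. This is a sketch under assumptions: resolve_checkpoint_filename is a hypothetical function, not part of the library, and using an explicitly supplied filename verbatim is assumed rather than documented.

```python
from typing import Optional


def resolve_checkpoint_filename(
    task_id: str,
    iteration_number: int,
    checkpoint_filename: Optional[str] = None,
) -> str:
    """Return the checkpoint filename to use for the last iteration."""
    if checkpoint_filename is not None:
        # Assumption: an explicit filename is used as given.
        return checkpoint_filename
    # Documented default pattern: {taskid}-iteration-{iteration_number}.pt
    return f"{task_id}-iteration-{iteration_number}.pt"


default_name = resolve_checkpoint_filename("task123", 5)
custom_name = resolve_checkpoint_filename("task123", 5, "final.pt")
```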
Ancestors
- bitfount.federated.algorithms.model_algorithms.base._BaseModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
nested_fields : ClassVar[dict[str, collections.abc.Mapping[str, Any]]]
Methods
create
def create(self, role: Union[str, Role], **kwargs: Any) ‑> Any:
Create an instance representing the role specified.
modeller
def modeller( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.model_algorithms.federated_training._ModellerSide:
Returns the modeller side of the FederatedModelTraining algorithm.
worker
def worker( self, hub: BitfountHub, **kwargs: Any,) ‑> bitfount.federated.algorithms.model_algorithms.federated_training._WorkerSide:
Returns the worker side of the FederatedModelTraining algorithm.
Arguments
hub
: BitfountHub object to use for communication with the hub.
**kwargs
: Additional keyword arguments to pass to the worker side.
Returns
Worker side of the FederatedModelTraining algorithm.
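The relationship between create, modeller, and worker can be pictured as a simple role dispatch. The sketch below is a hypothetical stand-in, not the library's code: the Role enum values, the SketchAlgorithmFactory class, and the tuples it returns are all assumptions made for illustration.

```python
from enum import Enum
from typing import Any, Union


class Role(Enum):
    # Hypothetical role values; the library's Role enum may differ.
    MODELLER = "modeller"
    WORKER = "worker"


class SketchAlgorithmFactory:
    """Toy factory showing how create() could dispatch on the role."""

    def modeller(self, **kwargs: Any) -> tuple:
        return ("modeller-side", kwargs)

    def worker(self, **kwargs: Any) -> tuple:
        return ("worker-side", kwargs)

    def create(self, role: Union[str, Role], **kwargs: Any) -> Any:
        # Role(role) accepts either an enum member or its string value and
        # raises ValueError for anything else.
        builders = {Role.MODELLER: self.modeller, Role.WORKER: self.worker}
        return builders[Role(role)](**kwargs)


factory = SketchAlgorithmFactory()
worker_side = factory.create("worker", hub="placeholder-hub")
modeller_side = factory.create(Role.MODELLER)
```

Keyword arguments passed to create are forwarded unchanged to the role-specific method, which is how a value such as hub would reach the worker side in this sketch.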
FoveaCoordinatesAlgorithm
class FoveaCoordinatesAlgorithm( datastructure: DataStructure, bscan_width_col: str = 'size_width', location_prefixes: Optional[SLOSegmentationLocationPrefix] = None, **kwargs: Any,):
Computes the Fovea coordinates from the Fovea detection model predictions.
Arguments
datastructure
: The data structure to use for the algorithm.
bscan_width_col
: The name of the column that contains the B-scan width. Defaults to "size_width".
location_prefixes
: A dataclass that contains the prefixes for the start and end of the images along both the X and Y axes.
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.ophthalmology.fovea_coordinates_algorithm._WorkerSide:
Worker-side of the algorithm.
GATrialCalculationAlgorithmBronze
class GATrialCalculationAlgorithmBronze( datastructure: DataStructure, ga_area_include_segmentations: Optional[list[str]] = None, ga_area_exclude_segmentations: Optional[list[str]] = None, fovea_landmark_idx: int = 1, **kwargs: Any,):
Algorithm for calculating the GA Area and associated metrics.
Arguments
ga_area_include_segmentations
: List of segmentation labels to be used for calculating the GA area. The logical AND of the masks for these labels will be used to calculate the GA area. If not provided, the default inclusion labels for the GA area will be used.
ga_area_exclude_segmentations
: List of segmentation labels to be excluded from calculating the GA area. If any of these segmentations are present in the axial segmentation masks, that axis will be excluded from the GA area calculation. If not provided, the default exclusion labels for the GA area will be used.
fovea_landmark_idx
: Index of the fovea landmark in the tuple: 0 for fovea start, 1 for fovea middle, 2 for fovea end. Defaults to 1.
Raises
ValueError
: If an invalid segmentation label is provided.
ValueError
: If a segmentation label is provided in both the include and exclude lists.
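The two ValueError conditions and the fovea_landmark_idx convention can be sketched as follows. Everything here is illustrative: validate_segmentation_labels, pick_fovea_landmark, and the label names are hypothetical, and the set of valid labels is defined by the library rather than by this example.

```python
from typing import Iterable, Sequence, Tuple


def validate_segmentation_labels(
    include: Sequence[str],
    exclude: Sequence[str],
    valid_labels: Iterable[str],
) -> None:
    """Raise ValueError for unknown labels or labels in both lists."""
    valid = set(valid_labels)
    invalid = sorted((set(include) | set(exclude)) - valid)
    if invalid:
        raise ValueError(f"Invalid segmentation labels: {invalid}")
    overlap = sorted(set(include) & set(exclude))
    if overlap:
        raise ValueError(f"Labels in both include and exclude lists: {overlap}")


def pick_fovea_landmark(landmarks: Tuple, fovea_landmark_idx: int = 1):
    """Select fovea start (0), middle (1) or end (2) from a 3-tuple."""
    # Assumption: indices outside 0..2 are rejected; the source does not say.
    if fovea_landmark_idx not in (0, 1, 2):
        raise ValueError("fovea_landmark_idx must be 0, 1 or 2")
    return landmarks[fovea_landmark_idx]


# Hypothetical label names, used only for this example.
HYPOTHETICAL_LABELS = {"label_a", "label_b", "label_c"}
validate_segmentation_labels(["label_a"], ["label_b"], HYPOTHETICAL_LABELS)
middle = pick_fovea_landmark(((10, 4), (14, 5), (18, 4)))
```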
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) ‑> bitfount.federated.algorithms.ophthalmology.ga_trial_calculation_algorithm_bronze._WorkerSide:
Worker-side of the algorithm.
GATrialCalculationAlgorithmJade
class GATrialCalculationAlgorithmJade( datastructure: DataStructure, ga_area_include_segmentations: Optional[list[str]] = None, ga_area_exclude_segmentations: Optional[list[str]] = None, **kwargs: Any,):
Algorithm for calculating the GA Area and associated metrics.
Arguments
datastructure
: The data structure to use for the algorithm.
ga_area_include_segmentations
: List of segmentation labels to be used for calculating the GA area. The logical AND of the masks for these labels will be used to calculate the GA area. If not provided, the default inclusion labels for the GA area will be used.
ga_area_exclude_segmentations
: List of segmentation labels to be excluded from calculating the GA area. If any of these segmentations are present in the axial segmentation masks, that axis will be excluded from the GA area calculation. If not provided, the default exclusion labels for the GA area will be used.
Raises
ValueError
: If an invalid segmentation label is provided.
ValueError
: If a segmentation label is provided in both the include and exclude lists.
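The "logical AND of the masks" used for the inclusion labels can be illustrated with small boolean grids. This is a plain-Python sketch, not the library's implementation: combine_masks and count_true are hypothetical helpers, the masks are toy data, and the handling of excluded segmentations is deliberately left out.

```python
from typing import List, Sequence

Mask = List[List[bool]]


def combine_masks(masks: Sequence[Mask]) -> Mask:
    """Element-wise logical AND of equally shaped 2D boolean masks."""
    if not masks:
        raise ValueError("At least one mask is required")
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    return [
        [all(mask[r][c] for mask in masks) for c in range(n_cols)]
        for r in range(n_rows)
    ]


def count_true(mask: Mask) -> int:
    """Number of positions where the combined mask is True."""
    return sum(1 for row in mask for value in row if value)


# Toy masks for two hypothetical inclusion labels.
mask_a = [[True, True, False], [True, False, False]]
mask_b = [[True, False, False], [True, True, False]]
combined = combine_masks([mask_a, mask_b])
overlap_pixels = count_true(combined)
```

Only positions marked in every inclusion mask survive the AND. Converting the resulting pixel count into a physical area would additionally need the pixel dimensions, which this sketch does not model.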
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
modeller
def modeller( self, **kwargs: Any,) ‑> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: