For Data Custodians
Once a project has been created, project owners will invite data custodians to join and connect their datasets. This page outlines the steps required to provision a dataset to a project and manage ongoing access.
Joining Projects
You will receive an email invitation from Bitfount when you are invited to a project. Before you join, you may want to consider the following:
- Review the project's description and additional metadata.
- Inspect the project’s task and any models by clicking the ‘View details’ link in the task card to ensure it matches your expectations.
- Carefully review the project's terms and conditions.
If you are unfamiliar with the scope of the project, you may wish to consult your legal team or the project's point of contact before proceeding.
Finally, when you’re happy to continue, tick the checkbox to agree to the terms (if present) and click "Join Project".
Connecting Datasets
Now that you've joined a project, you're ready to connect your dataset to the Bitfount platform.
Data Formatting
When preparing data for use with Bitfount Desktop, it's important to ensure the format of your dataset is compatible with the tasks assigned to the project. Please check in with your project contact to ensure your dataset meets the requirements for a successful task run.
Selecting a Data Source and Metadata
To connect a dataset, click the “Connect Dataset” button. You will be presented with a range of icons representing the different source formats Bitfount supports.
The application will prompt you to enter the required setup and metadata for your source type. For more advanced sources, such as connections to databases, you may wish to reach out to your IT department or the Bitfount team for support.
Once you’ve entered the required metadata fields, click "Connect Dataset". The system will then process the dataset and generate a schema. A schema lists the column names and the types of data contained in the dataset, and the system uses it to detect whether your dataset is compatible with a project’s task.
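To make the idea of a schema concrete, the following is a minimal sketch of how column names and coarse data types could be derived from a CSV file. It is an illustration only and not Bitfount's schema generator: the function names `infer_type` and `infer_schema`, the type labels and the sample columns are all assumptions made for this example.

```python
import csv
import io


def infer_type(values):
    """Return a coarse type label for a column's string values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "empty"
    # Try the strictest type first, then fall back to looser ones.
    for label, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "text"


def infer_schema(csv_text):
    """Map each column name to a coarse type label (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: infer_type([row[name] or "" for row in rows])
        for name in (reader.fieldnames or [])
    }


# Hypothetical sample data; real column names depend on your project.
sample = "patient_id,age,score,quality\nP001,54,0.82,good\nP002,61,0.47,poor\n"
print(infer_schema(sample))
```

Running a check like this on your own file before connecting it can give a rough preview of the columns and types a schema is likely to contain, although the schema Bitfount generates may differ in detail.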
Once connected, the dataset should appear with an “Online” status.
Datasets remain on the data custodian’s systems at all times and are not transferred to or stored on Bitfount's or any other collaborator’s systems unless explicitly agreed. Bitfount stores only the dataset description, schema and audit histories of tasks performed against the data.
Linking Datasets to Projects
Once you have connected a dataset, you need to link it to the project by navigating to the project you have joined and clicking the “Link Dataset” button.
After you select a dataset, Bitfount runs a check to confirm that its schema is compatible with the task. Once this is complete, you will see your linked dataset within the project on the right-hand side.
If the schema checker returns an error, please review the schema and ensure that the columns the task expects are present in the data and named exactly as required.
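If you want to check column names yourself before linking, a small script can compare a file's header against the list of columns your project contact has given you. The sketch below assumes a CSV source; the helper names `read_header` and `check_columns` and the example column names are hypothetical, and the check is not the one Bitfount itself performs.

```python
import csv
import io


def read_header(csv_text):
    """Return the first row of a CSV as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def check_columns(header, expected):
    """Report expected columns that are not present under their exact name.

    Each missing column maps to a near match from the header (same name
    ignoring case and surrounding whitespace), or None if nothing is close.
    """
    present = set(header)
    normalised = {name.strip().lower(): name for name in header}
    report = {}
    for name in expected:
        if name in present:
            continue
        report[name] = normalised.get(name.strip().lower())
    return report


# Hypothetical example: the expected names would come from your project contact.
data = "Patient_ID,age, scan_date\nP001,54,2023-01-05\n"
expected = ["patient_id", "age", "scan_date", "laterality"]
print(check_columns(read_header(data), expected))
```

A near match usually points to a naming problem, such as different capitalisation or a stray space, whereas `None` suggests the column is missing from the data altogether.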
Running Tasks
Once you've linked a dataset, you're ready to run the project's provisioned task(s).
On Bitfount, the local service that links to the dataset and manages the execution of analysis tasks is known as the Processor of Data (Pod). When an analysis is requested, the Pod communicates with the Access Manager to ensure that the requesting user is authenticated, and only allows analyses to run if the user has been authorised. This process occurs automatically, without the need for manual intervention.
To run a task:
- Click the "Run Task" button on the right-hand side of the project screen.
- Wait for the task to complete. You can see task progress in the status bar on the side of your screen. Task completion times will depend on the type and size of the dataset, the complexity of the task, and the compute resources available on your machine.
- Once the task is complete, if appropriate, you can retrieve the result by navigating to the “Activity History” tab and selecting the "Results" link. This will take you to the location of the task's output files.
Interpreting Results
The output generated from a successful task run will vary based on the algorithm used in the task. This could take the form of a PDF report, CSV file or other formats depending on the goal of running the task. For more details on how to interpret results, please reach out to your project contact.
Managing Datasets
Activity History
A full audit trail is available for datasets via the “Activity history” tab. To view only project-specific activity, navigate to the same tab on the project’s card.
Status
When Bitfount Desktop starts up, the system will attempt to bring all available datasets “Online”. Tasks can only be run once this is complete. Within a dataset’s “Settings” tab you can choose to take the dataset offline. When a dataset is offline, no tasks can be run against it.
Deletion and Archiving
From the “Settings” tab on a dataset page, the dataset can either be deleted or archived. Deletion is a permanent action which disconnects the dataset from the Bitfount network. It does not delete the raw data source. Archived datasets can be unarchived and reused in projects when appropriate.
Access
On the dataset card you can view all projects the dataset is currently linked to. Unlinking a dataset from a project can be completed at any time by clicking the “unlink dataset” button within the project.
If linking your dataset to projects doesn’t fit your use case, please see Managing Pod Access. This guide outlines how to manage direct access to datasets outside of the context of a project via the ‘Assigned roles’ tab.
Need Help?
If you have any questions after reviewing our Guides and Tutorials, visit our FAQs. If you can't find the answer you are looking for or would like to discuss anything further, please contact us at support@bitfount.com. We're here to help!