bacpipe.embedding_evaluation.probing.dataset_probe
Functions
| generate_annotations_for_probing_task | |
| probe_dataset_loader | Create dataset loader object for classification. |
Classes
| DataLoader | Data loader combines a dataset and a sampler, and provides an iterable over the given dataset. |
| Dataset | An abstract class representing a Dataset. |
| Path | PurePath subclass that can make system calls. |
| ProbeDatasetLoader | |
- class bacpipe.embedding_evaluation.probing.dataset_probe.ProbeDatasetLoader(class_df, embeds, label2index, set_name=None, **kwargs)[source]
Bases: Dataset
- __getitem__(idx)[source]
Return the dataset item at position idx.
- Parameters:
idx (int) – index of the item to retrieve
- Returns:
(embedding, true label)
- Return type:
tuple
- __init__(class_df, embeds, label2index, set_name=None, **kwargs)[source]
Class to initialize and iterate through classification dataset.
- Parameters:
class_df (pd.DataFrame) – classification dataframe
embeds (np.array) – embeddings
label2index (dict) – linking labels to integers
set_name (string, optional) – train, test or val set, by default None
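The indexing contract described above (an integer index in, an (embedding, true label) tuple out) can be illustrated with a minimal, dependency-free sketch. This is not the bacpipe implementation: `ToyProbeDataset` is a hypothetical stand-in, a list of dicts replaces the pandas DataFrame, the keys `"label"` and `"set"` are assumed names, and filtering rows by `set_name` is an assumption about what that argument does.

```python
class ToyProbeDataset:
    """Hypothetical stand-in for ProbeDatasetLoader (no torch, no pandas)."""

    def __init__(self, rows, embeds, label2index, set_name=None):
        # rows: list of dicts with the ASSUMED keys "label" and "set"
        # embeds: one embedding (list of floats) per row, in the same order
        # ASSUMPTION: set_name selects the rows belonging to that split
        keep = [
            i for i, row in enumerate(rows)
            if set_name is None or row["set"] == set_name
        ]
        self.embeds = [embeds[i] for i in keep]
        # label2index maps each label string to its integer class index
        self.labels = [label2index[rows[i]["label"]] for i in keep]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Same return shape as documented: (embedding, true label)
        return self.embeds[idx], self.labels[idx]


rows = [
    {"label": "bird", "set": "train"},
    {"label": "frog", "set": "train"},
    {"label": "bird", "set": "test"},
]
embeds = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
label2index = {"bird": 0, "frog": 1}

train_set = ToyProbeDataset(rows, embeds, label2index, set_name="train")
```

Here `train_set[1]` yields the second training embedding together with the integer index of its label, which is the pair a classifier would consume during probing.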
- bacpipe.embedding_evaluation.probing.dataset_probe.generate_annotations_for_probing_task(ground_truth, paths, label_column, dataset_csv_path='probe_annotations.csv', train_ratio=None, test_ratio=None, **kwargs)[source]
- bacpipe.embedding_evaluation.probing.dataset_probe.probe_dataset_loader(set_name, clean_df, embeds, label2index, batch_size=64, shuffle=False, **kwargs)[source]
Create dataset loader object for classification.
- Parameters:
set_name (string) – train, test or val set
clean_df (pd.DataFrame) – classification dataframe
embeds (np.array) – embeddings
label2index (dict) – link labels to ints
batch_size (int, optional) – number of embeddings per batch, by default 64
shuffle (bool, optional) – shuffle or not, by default False
- Returns:
dataset loader object to iterate over during training
- Return type:
DataLoader
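To make the roles of batch_size and shuffle concrete, the following is a rough, dependency-free sketch of what a data loader does with an indexable dataset. It is not the bacpipe or PyTorch code: `toy_loader` and its `seed` argument are hypothetical, and a plain list of (embedding, label) tuples stands in for the dataset.

```python
import random


def toy_loader(dataset, batch_size=64, shuffle=False, seed=0):
    """Yield (embeddings, labels) batches from an indexable dataset.

    Hypothetical illustration only; a real DataLoader also handles
    tensor collation, worker processes and samplers.
    """
    indices = list(range(len(dataset)))
    if shuffle:
        # Reorder the indices so each pass visits items in a different order
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        embeddings = [dataset[i][0] for i in chunk]
        labels = [dataset[i][1] for i in chunk]
        yield embeddings, labels


# Five toy items: a 2-dimensional embedding and an alternating label
dataset = [([float(i), float(i) + 0.5], i % 2) for i in range(5)]
batches = list(toy_loader(dataset, batch_size=2))
```

With five items and a batch size of 2, the final batch holds the single leftover item, so downstream code should not assume every batch has exactly batch_size entries.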