bacpipe.embedding_evaluation.probing package

Submodules

bacpipe.embedding_evaluation.probing.dataset_probe module

class bacpipe.embedding_evaluation.probing.dataset_probe.ProbeDatasetLoader(class_df, embeds, label2index, set_name=None, **kwargs)[source]

Bases: Dataset

__getitem__(idx)[source]

Return the sample stored at position idx.

Parameters:

idx (int) – index of the sample to retrieve

Returns:

(embedding, true label)

Return type:

tuple

__init__(class_df, embeds, label2index, set_name=None, **kwargs)[source]

Dataset class for initializing and iterating through a classification dataset.

Parameters:
  • class_df (pd.DataFrame) – classification dataframe

  • embeds (np.array) – embeddings

  • label2index (dict) – mapping from labels to integer indices

  • set_name (string, optional) – train, test or val set, by default None
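
The constructor arguments and the (embedding, true label) return value can be illustrated with a dependency-free stand-in. This is only a sketch, not bacpipe's implementation: ToyProbeDataset, the "label" and "set" keys, and the filtering by set_name are all assumptions made for illustration, and plain lists stand in for the pd.DataFrame and np.array inputs. The real class derives from Dataset (presumably PyTorch's) and may behave differently.

```python
# Hypothetical stand-in: NOT bacpipe's ProbeDatasetLoader. Plain lists are used
# instead of pd.DataFrame / np.array so the sketch runs without dependencies.


class ToyProbeDataset:
    """Map-style dataset pairing each embedding with an integer label."""

    def __init__(self, rows, embeds, label2index, set_name=None):
        # Assumed layout: rows[i] describes embeds[i]; each row has a "label"
        # key and a "set" key naming its split ("train", "test" or "val").
        keep = [
            i for i, row in enumerate(rows)
            if set_name is None or row["set"] == set_name
        ]
        self.embeds = [embeds[i] for i in keep]
        self.labels = [label2index[rows[i]["label"]] for i in keep]

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Returns (embedding, true label), mirroring the documented return value.
        return self.embeds[idx], self.labels[idx]


rows = [
    {"label": "frog", "set": "train"},
    {"label": "bird", "set": "train"},
    {"label": "frog", "set": "test"},
]
embeds = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# One common way to build label2index: enumerate the sorted unique labels.
label2index = {
    label: i for i, label in enumerate(sorted({r["label"] for r in rows}))
}

train_set = ToyProbeDataset(rows, embeds, label2index, set_name="train")
```

Sorting the labels before enumerating them keeps the label-to-integer mapping stable across runs, which matters when a trained probe is later reused for inference.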

bacpipe.embedding_evaluation.probing.dataset_probe.generate_annotations_for_probing_task(ground_truth, paths, label_column, dataset_csv_path='probe_annotations.csv', train_ratio=None, test_ratio=None, **kwargs)[source]

bacpipe.embedding_evaluation.probing.dataset_probe.probe_dataset_loader(set_name, clean_df, embeds, label2index, batch_size=64, shuffle=False, **kwargs)[source]

Create dataset loader object for classification.

Parameters:
  • set_name (string) – train, test or val set

  • clean_df (pd.DataFrame) – classification dataframe

  • embeds (np.array) – embeddings

  • label2index (dict) – mapping from labels to integer indices

  • batch_size (int, optional) – number of embeddings per batch, by default 64

  • shuffle (bool, optional) – whether to shuffle the samples, by default False

Returns:

dataset loader object to iterate over during training

Return type:

DataLoader
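
What batch_size and shuffle control can be shown with a small pure-Python sketch. iter_batches and its seed parameter are hypothetical helpers written for this illustration; they are not part of bacpipe and only approximate a data loader, which would typically also collate each batch into tensors.

```python
import random


def iter_batches(dataset, batch_size=64, shuffle=False, seed=0):
    """Yield lists of samples, batch_size at a time (hypothetical helper)."""
    order = list(range(len(dataset)))
    if shuffle:
        # Shuffle the visiting order only; the dataset itself is left unchanged.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        # The last batch may be smaller than batch_size.
        yield [dataset[i] for i in order[start:start + batch_size]]


# Five toy (embedding, label) samples, indexable like a map-style dataset.
samples = [([float(i)], i % 2) for i in range(5)]

batches = list(iter_batches(samples, batch_size=2))
shuffled = list(iter_batches(samples, batch_size=2, shuffle=True))
```

With shuffle=False the samples are visited in index order, which is the usual choice for validation and test sets; shuffling is generally reserved for the training set.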

bacpipe.embedding_evaluation.probing.evaluate_probe module

bacpipe.embedding_evaluation.probing.inference_probe module

bacpipe.embedding_evaluation.probing.probe module

bacpipe.embedding_evaluation.probing.train_probe module

Module contents