bacpipe.embedding_evaluation.probing.evaluate_probe

Functions

accuracy_per_class(y_true, y_pred, ...)

Accuracy per class

auc(y_true, probability_scores)

Compute the AUC

compute_task_metrics(y_pred, y_true, ...)

Compute the evaluation metrics

eval_probe(probe, embeds, df, label2index[, ...])

Perform inference using probe.

macro_accuracy(y_true, y_pred)

Compute macro accuracy.

macro_f1(y_true, y_pred)

Compute the macro f1 score

micro_accuracy(y_true, y_pred)

micro_f1(y_true, y_pred)

Compute the micro f1 score

plot_classification_results(task_name[, ...])

Save model-specific classification results to the model-specific plot path as a horizontal bar chart.

probe_dataset_loader(set_name, clean_df, ...)

Create dataset loader object for classification.

save_probe_results(paths, config, metrics, ...)

Save a dict with all performance metrics.

Classes

Path(*args, **kwargs)

PurePath subclass that can make system calls.

bacpipe.embedding_evaluation.probing.evaluate_probe.accuracy_per_class(y_true, y_pred, label2index, items_per_class)[source]

Accuracy per class

Parameters:
  • y_true (list) – ground truth

  • y_pred (list) – predictions

  • label2index (dict) – mapping from labels to integer class indices

  • items_per_class (list) – number of items per class

Returns:

classwise accuracy

Return type:

dict
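
As a rough illustration of what a classwise accuracy dict can look like, here is a minimal sketch in plain Python. It is not the bacpipe implementation: the helper name `per_class_accuracy_sketch` is hypothetical, and it assumes that `items_per_class` is ordered by the integer indices in `label2index`.

```python
def per_class_accuracy_sketch(y_true, y_pred, label2index, items_per_class):
    """Hypothetical helper: fraction of correctly predicted items per label."""
    # Count correct predictions per integer class index.
    hits = [0] * len(label2index)
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    # Assumption: items_per_class[idx] is the number of ground-truth items
    # belonging to class idx. Empty classes get an accuracy of 0.0.
    return {
        label: (hits[idx] / items_per_class[idx]) if items_per_class[idx] else 0.0
        for label, idx in label2index.items()
    }


label2index = {"bird": 0, "frog": 1}
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
result = per_class_accuracy_sketch(y_true, y_pred, label2index, [3, 1])
```

Here two of the three "bird" items and the single "frog" item are predicted correctly, so the dict holds 2/3 for "bird" and 1.0 for "frog".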

bacpipe.embedding_evaluation.probing.evaluate_probe.auc(y_true, probability_scores)[source]

Compute the AUC
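
The entry does not state whether the AUC is computed for binary or multiclass labels, or which averaging is used. For orientation only, the sketch below shows the binary ROC AUC as the probability that a randomly chosen positive item scores higher than a randomly chosen negative one. The helper name `binary_auc_sketch` is hypothetical and the code is not taken from bacpipe.

```python
def binary_auc_sketch(y_true, scores):
    """Hypothetical helper: binary ROC AUC via pairwise comparison."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative item")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
probability_scores = [0.1, 0.4, 0.35, 0.8]
auc_value = binary_auc_sketch(y_true, probability_scores)
```

Three of the four positive/negative pairs are ranked correctly in this example, giving an AUC of 0.75.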

bacpipe.embedding_evaluation.probing.evaluate_probe.compute_task_metrics(y_pred, y_true, probability_scores, label2index)[source]

Compute the evaluation metrics

bacpipe.embedding_evaluation.probing.evaluate_probe.eval_probe(probe, embeds, df, label2index, device='cuda:0', config='linear', paths=None, save_probe=False, **kwargs)[source]

Perform inference using probe.

Parameters:
  • probe (object) – trained classification object

  • test_dataloader (DataLoader object) – dataset iterator

  • device (str, optional) – ‘cpu’ or ‘cuda’, by default “cuda:0”

  • config (str, optional) – type of classification, by default “linear”

Returns:

  • list – prediction values in ints corresponding to labels

  • list – ground truth values in ints

  • np.array – probabilities for each class and each embedding
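
The relationship between the returned predictions and the returned class probabilities can be sketched as follows. This is a simplified, assumption-laden illustration of linear-probe inference (softmax over a linear layer, then argmax), written in plain Python rather than with a `probe` object or tensors. The names `linear_probe_inference_sketch`, `weights` and `biases` are hypothetical and not part of bacpipe.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_probe_inference_sketch(embeddings, weights, biases):
    """Hypothetical helper: class probabilities and argmax predictions."""
    predictions, probabilities = [], []
    for emb in embeddings:
        # One logit per class: dot product of the class weights and the embedding.
        logits = [
            sum(w * x for w, x in zip(row, emb)) + b
            for row, b in zip(weights, biases)
        ]
        probs = softmax(logits)
        probabilities.append(probs)
        # The predicted label is the index of the most probable class.
        predictions.append(max(range(len(probs)), key=probs.__getitem__))
    return predictions, probabilities


weights = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-class probe on 2-d embeddings
biases = [0.0, 0.0]
embeddings = [[2.0, 0.5], [0.1, 3.0]]
predictions, probabilities = linear_probe_inference_sketch(embeddings, weights, biases)
```

In this toy setup each embedding is assigned to the class whose weight vector it aligns with most, and each row of `probabilities` sums to one.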

bacpipe.embedding_evaluation.probing.evaluate_probe.macro_accuracy(y_true, y_pred)[source]

Compute macro accuracy.

Parameters:
  • y_true (list) – ground truth

  • y_pred (list) – predictions

Returns:

balanced accuracy score

Return type:

float
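
To make the difference between macro and micro accuracy concrete, the sketch below assumes the usual definitions: micro accuracy is the overall fraction of correct predictions, and macro (balanced) accuracy is the unweighted mean of per-class recall. Both helper names are hypothetical, and the code is an illustration rather than the bacpipe source.

```python
from collections import defaultdict


def micro_accuracy_sketch(y_true, y_pred):
    """Hypothetical helper: fraction of all items predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy_sketch(y_true, y_pred):
    """Hypothetical helper: mean of per-class recall (balanced accuracy)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
micro = micro_accuracy_sketch(y_true, y_pred)  # 3 of 4 items correct
macro = macro_accuracy_sketch(y_true, y_pred)  # mean of 2/3 and 1/1
```

With imbalanced classes the two values diverge: here the micro accuracy is 0.75, while the macro accuracy is 5/6 because the small class is weighted equally.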

bacpipe.embedding_evaluation.probing.evaluate_probe.macro_f1(y_true, y_pred)[source]

Compute the macro f1 score

bacpipe.embedding_evaluation.probing.evaluate_probe.micro_accuracy(y_true, y_pred)[source]

bacpipe.embedding_evaluation.probing.evaluate_probe.micro_f1(y_true, y_pred)[source]

Compute the micro f1 score
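
The macro and micro F1 scores differ only in where the averaging happens. The following sketch assumes the standard definitions (macro: mean of per-class F1; micro: F1 computed from pooled true positives, false positives and false negatives). The helper name `f1_scores_sketch` is hypothetical, and the code is not taken from bacpipe.

```python
def f1_scores_sketch(y_true, y_pred):
    """Hypothetical helper: return (macro_f1, micro_f1) for integer labels."""
    classes = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in classes}
    fp = {c: 0 for c in classes}
    fn = {c: 0 for c in classes}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, each class weighted equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
macro_f1_value, micro_f1_value = f1_scores_sketch(y_true, y_pred)
```

For this input the per-class F1 scores are 0.8 and 2/3, so the macro F1 is their mean, while the pooled counts give a micro F1 of 0.75.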

bacpipe.embedding_evaluation.probing.evaluate_probe.save_probe_results(paths, config, metrics, **kwargs)[source]

Save a dict with all performance metrics.

Parameters:
  • paths (SimpleNamespace object) – namespace whose attributes are the paths for loading and saving

  • config (str) – type of classification (linear or knn)

  • metrics (dict) – performance metrics to save
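
The entry does not specify the output format or which attribute of `paths` is used. The sketch below shows one plausible way to persist a metrics dict, under loudly stated assumptions: the attribute name `results_dir`, the file name pattern and the JSON format are all hypothetical and chosen for illustration only.

```python
import json
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_metrics_sketch(paths, config, metrics):
    """Hypothetical helper: write a metrics dict to a JSON file."""
    out_dir = Path(paths.results_dir)              # assumed attribute name
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"metrics_{config}.json"  # assumed file name pattern
    with open(out_file, "w") as f:
        json.dump(metrics, f, indent=2)
    return out_file


# Usage with a temporary directory standing in for the real results folder.
tmp_dir = tempfile.mkdtemp()
paths = SimpleNamespace(results_dir=tmp_dir)
saved = save_metrics_sketch(paths, "linear", {"macro_f1": 0.73, "micro_f1": 0.75})
reloaded = json.loads(saved.read_text())
```

Keeping the probe type in the file name means results from linear and knn runs can sit in the same directory without overwriting each other.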