Submodules


core.workflows

core.experiment_manager

core.audio_processor

class bacpipe.core.audio_processor.AudioHandler(model, padding, audio_dir, bool_slowdown=False, slowdown_rate=None, **kwargs)[source]

Bases: object

Helper class for all methods related to loading and padding audio.

__init__(model, padding, audio_dir, bool_slowdown=False, slowdown_rate=None, **kwargs)[source]

Helper class for all methods related to loading and padding audio.

Parameters:
  • model (Model object) – exposes the model's characteristics (sample rate, segment length, etc.) as well as the methods to run the model

  • padding (str) – padding function to use where padding is necessary

  • audio_dir (pathlib.Path object) – path to the audio directory

prepare_audio(sample)[source]

Use the bacpipe pipeline to load an audio file, window it according to the model-specific window length, and preprocess the data so that it is ready for batch inference. The file length and shape are also logged for the metadata files.

Parameters:

sample (pathlib.Path or str) – path to audio file

Returns:

audio frames after model-specific preprocessing

Return type:

torch.Tensor
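
The windowing step described above can be illustrated with a minimal NumPy sketch. This is not bacpipe's implementation: the function name `window_audio`, its parameters, and the use of `np.pad` for padding are all assumptions made for illustration, and the sketch returns a NumPy array rather than a `torch.Tensor`.

```python
import numpy as np


def window_audio(audio, sample_rate, segment_length_s, pad_mode="constant"):
    """Split a 1-D signal into equal-length frames, padding the last one.

    Hypothetical helper for illustration only; not part of bacpipe.
    """
    frame_len = int(sample_rate * segment_length_s)
    # Number of frames needed to cover the whole signal (at least one).
    n_frames = max(1, -(-len(audio) // frame_len))
    pad_amount = n_frames * frame_len - len(audio)
    # np.pad accepts mode names such as "constant" (zeros) or "reflect".
    padded = np.pad(audio, (0, pad_amount), mode=pad_mode)
    return padded.reshape(n_frames, frame_len)


# 2.5 s of a synthetic signal at 4 kHz, split into 1 s frames.
sr = 4000
signal = np.arange(int(2.5 * sr), dtype=np.float32)
frames = window_audio(signal, sample_rate=sr, segment_length_s=1.0)
print(frames.shape)  # (3, 4000)
```

The last frame holds the remaining 0.5 s of signal followed by zero padding, so every frame has the same shape and the frames can be stacked into a batch.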

model_pipelines.runner

model_pipelines.model_utils

embedding_evaluation

probing

class bacpipe.embedding_evaluation.probing.dataset_probe.ProbeDatasetLoader(class_df, embeds, label2index, set_name=None, **kwargs)[source]

Bases: Dataset

__getitem__(idx)[source]

Return one item of the dataset by index.

Parameters:

idx (int) – index of training step

Returns:

(embedding, true label)

Return type:

tuple

__init__(class_df, embeds, label2index, set_name=None, **kwargs)[source]

Class to initialize and iterate through classification dataset.

Parameters:
  • class_df (pd.DataFrame) – classification dataframe

  • embeds (np.array) – embeddings

  • label2index (dict) – linking labels to integers

  • set_name (string, optional) – train, test or val set, by default None
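
As a rough sketch of the pattern this class follows, the example below pairs embeddings with integer labels and returns one (embedding, true label) tuple per index. It is a hypothetical stand-in, not bacpipe's code: the class name `TinyProbeDataset` is invented, a plain list of labels replaces the classification dataframe, and the class does not subclass `Dataset` so that it runs with NumPy alone.

```python
import numpy as np


class TinyProbeDataset:
    """Hypothetical stand-in for a probing dataset (not bacpipe's class)."""

    def __init__(self, labels, embeds, label2index):
        # `labels` is a plain list standing in for the label column
        # of the classification dataframe.
        self.embeds = np.asarray(embeds, dtype=np.float32)
        self.targets = np.array([label2index[label] for label in labels])

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        # Return one (embedding, true label) pair.
        return self.embeds[idx], int(self.targets[idx])


labels = ["frog", "bird", "frog"]
label2index = {"bird": 0, "frog": 1}
embeds = np.random.default_rng(0).normal(size=(3, 8))

dataset = TinyProbeDataset(labels, embeds, label2index)
embedding, target = dataset[1]
print(len(dataset), embedding.shape, target)  # 3 (8,) 0
```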

bacpipe.embedding_evaluation.probing.dataset_probe.generate_annotations_for_probing_task(ground_truth, paths, label_column, dataset_csv_path='probe_annotations.csv', train_ratio=None, test_ratio=None, **kwargs)[source]

bacpipe.embedding_evaluation.probing.dataset_probe.probe_dataset_loader(set_name, clean_df, embeds, label2index, batch_size=64, shuffle=False, **kwargs)[source]

Create dataset loader object for classification.

Parameters:
  • set_name (string) – train, test or val set

  • clean_df (pd.DataFrame) – classification dataframe

  • embeds (np.array) – embeddings

  • label2index (dict) – link labels to ints

  • batch_size (int, optional) – number of embeddings per batch, by default 64

  • shuffle (bool, optional) – shuffle or not, by default False

Returns:

dataset loader object to iterate over during training

Return type:

DataLoader object
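
To show what `batch_size` and `shuffle` control, here is a small NumPy stand-in for a batching loader. It is a sketch under stated assumptions, not the function documented above: `iterate_batches` and its `seed` parameter are hypothetical, and a plain generator is used in place of a `DataLoader`.

```python
import numpy as np


def iterate_batches(embeds, targets, batch_size=64, shuffle=False, seed=0):
    """Yield (embeddings, labels) batches; hypothetical stand-in for a loader."""
    order = np.arange(len(targets))
    if shuffle:
        # Shuffle the indices so embeddings and labels stay aligned.
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield embeds[idx], targets[idx]


embeds = np.zeros((150, 16), dtype=np.float32)
targets = np.arange(150) % 3
batch_sizes = [len(y) for _, y in iterate_batches(embeds, targets, batch_size=64)]
print(batch_sizes)  # [64, 64, 22]
```

With 150 embeddings and the default batch size of 64, the final batch is smaller than the others. Shuffling changes the order in which items are visited but not which items are covered.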

cluster

benchmarking

label_embeddings

visualization

test_embedding_creation