bacpipe.model_pipelines package

Subpackages

Submodules

bacpipe.model_pipelines.model_utils module

class bacpipe.model_pipelines.model_utils.ModelBaseClass(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]

Bases: object

__init__(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]

This base class defines key methods and attributes for all feature extractors to ensure that we can use the same processing pipeline to generate embeddings. The idea is to

1. Initialize the model with prepare_inference, which loads the model and moves it onto the selected device.

2. Load and resample audio to the sample rate required by the model.

3. Window the audio into segments corresponding to the required input segment length.

4. Calculate spectrograms (if the model architecture is accessible) to batch-preprocess the audio and, potentially, to be able to rebuild the spectrograms afterwards for investigation.

5. Initialize a torch dataloader object based on the model-specific audio loading characteristics to speed up inference and looping through the segments.

6. Perform batch inference.

If ‘cuda’ has been selected as the device, a threading approach is used to load data in parallel while performing inference. The return value is the embeddings.
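The windowing step listed above can be illustrated with a short sketch. This is not bacpipe's implementation: the function name `window_audio` is hypothetical, and zero-padding the final partial segment (rather than dropping it) is an assumption made only for this example.

```python
def window_audio(samples, segment_length):
    """Split a 1-D sequence of samples into fixed-length segments.

    Assumption for this sketch: the last, incomplete segment is
    zero-padded instead of being discarded.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    segments = []
    for start in range(0, len(samples), segment_length):
        segment = list(samples[start:start + segment_length])
        # Pad the trailing segment so every segment has the same length.
        segment += [0.0] * (segment_length - len(segment))
        segments.append(segment)
    return segments


# 10 samples windowed into segments of 4 samples each
frames = window_audio([float(i) for i in range(10)], segment_length=4)
print(len(frames))   # 3
print(frames[-1])    # [8.0, 9.0, 0.0, 0.0]
```

Every segment has the same length, so the segments can be stacked into a batch for the model.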

Parameters:
  • sr (int) – sample rate

  • segment_length (int) – segment length in samples

  • device (str) – ‘cpu’ or ‘cuda’

  • model_base_path (pathlib.Path) – path to main model checkpoint dir

  • global_batch_size (int) – global batch size that is then used in conjunction with the segment length to calculate a model-specific batch size that results in approximately equal batches for different models
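The documentation does not give the formula that links global_batch_size to the model-specific batch size. The sketch below shows one plausible reading, in which the batch size is scaled so that each batch covers a similar duration of audio regardless of the model's segment length. The function name `model_batch_size` and the formula itself are assumptions for illustration; the source code holds the actual calculation.

```python
def model_batch_size(global_batch_size, segment_length, sr):
    """Hypothetical scaling: fewer segments per batch for longer segments.

    segment_length is given in samples and sr in Hz, matching the
    parameter descriptions above. The formula is an assumption.
    """
    segment_seconds = segment_length / sr
    # Never return less than one segment per batch.
    return max(1, int(global_batch_size / segment_seconds))


# A model with 3 s segments at 48 kHz and one with 1 s segments at 16 kHz
long_segments = model_batch_size(16, segment_length=144000, sr=48000)
short_segments = model_batch_size(16, segment_length=16000, sr=16000)
print(long_segments, short_segments)  # 5 16
```

Under this assumed scaling, both models process roughly 15 to 16 seconds of audio per batch.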

prepare_inference()[source]
preprocessing(audio)[source]
bacpipe.model_pipelines.model_utils.check_if_cudnn_tensorflow_compatible()[source]

bacpipe.model_pipelines.runner module

class bacpipe.model_pipelines.runner.Classifier(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]

Bases: object

__init__(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]

Class to handle all tasks surrounding classification: generating the classifications from embeddings, managing them, collecting them in arrays, and creating dataframes and annotation tables from them.

Parameters:
  • model (Model object) – has attributes for all the model characteristics like sample rate, segment length etc. as well as the methods to run the model

  • model_name (str) – name of the model

  • classifier_threshold (float, optional) – Value under which class predictions are discarded, by default None

classify(embeddings)[source]
static filter_top_k_classifications(probabilities, class_names, class_indices, class_time_bins, k=50)[source]

Generate a dictionary with the top k classes. Limiting the number of classes to k keeps this step from taking too long, while still producing a dictionary that can be saved as a .json file to give a quick overview of the species that are well represented within an audio file.

Parameters:
  • probabilities (np.array) – Probabilities for each class

  • class_names (list) – class names

  • class_indices (np.array) – class indices exceeding the threshold

  • class_time_bins (np.array) – time bin indices exceeding the threshold

  • k (int, optional) – number of classes to save in the dict. Keep this below 100, otherwise the operation will slow the process down considerably, by default 50

Returns:

dictionary of top k classes with time bin indices exceeding threshold

Return type:

dict
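A minimal sketch of how such a top-k dictionary could be assembled is shown below. It is not the library's code: the function name `top_k_classes` is hypothetical, plain lists stand in for the np.array arguments, and ranking classes by the number of time bins above the threshold is an assumption.

```python
import json
from collections import Counter, defaultdict


def top_k_classes(class_names, class_indices, class_time_bins, k=50):
    """Map the k most frequently detected class names to their time bins.

    class_indices[i] and class_time_bins[i] together describe one
    detection that exceeded the threshold.
    """
    bins_per_class = defaultdict(list)
    for cls_idx, time_bin in zip(class_indices, class_time_bins):
        bins_per_class[class_names[cls_idx]].append(time_bin)
    # Assumed ranking criterion: number of above-threshold time bins.
    counts = Counter({name: len(b) for name, b in bins_per_class.items()})
    return {name: bins_per_class[name] for name, _ in counts.most_common(k)}


summary = top_k_classes(
    class_names=["robin", "wren", "owl"],
    class_indices=[0, 2, 0, 1, 0, 2],
    class_time_bins=[0, 1, 2, 2, 5, 7],
    k=2,
)
print(json.dumps(summary))  # {"robin": [0, 2, 5], "owl": [1, 7]}
```

Because the result contains only strings, lists, and integers, it can be written to a .json file directly.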

static make_classification_dict(probabilities, classes, threshold)[source]
run_default_classifier(loader)[source]
save_Raven_table(file, relative_parent_path)[source]
save_annotation_table(loader_obj, **kwargs)[source]
save_classifier_outputs(fileloader_obj, file)[source]
class bacpipe.model_pipelines.runner.Embedder(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]

Bases: AudioHandler

This class takes care of loading the specified model and using it to process the audio data to create embeddings. This class is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed will be taken from bacpipe.config and bacpipe.settings.

Parameters:

AudioHandler (class) – Helper class that handles loading of audio

__init__(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]

This class takes care of loading the specified model and using it to process the audio data to create embeddings. This class is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed will be taken from bacpipe.config and bacpipe.settings.

Parameters:
  • model_name (str) – name of selected embedding model

  • loader (Loader object) – Object that has all the necessary path information and methods to load and save all the processed data

  • CustomModel (class, optional) – custom model class to use for processing, by default None

  • dim_reduction_model (bool, optional) – Can be bool or the string corresponding to the dimensionality reduction model, by default False

batch_inference(batched_samples, callback=None)[source]
embeddings_using_multithreading(array_of_audios)[source]

Generate embeddings for all files in a pipelined manner:

  • Producer thread loads and preprocesses audio

  • Consumer (main thread) embeds audio while producer prepares next batch

Ensures metadata and embeddings are written exactly like in the sequential version.

Parameters:

fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session

Returns:

updated object with metadata on embedding creation session

Return type:

Loader object
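The producer/consumer arrangement described for this method can be sketched with a bounded queue from the standard library. This is an illustration of the general pattern, not bacpipe's code: `run_pipelined`, `load_fn`, `embed_fn`, and `max_prefetch` are hypothetical names, and the two lambda functions stand in for audio loading and model inference.

```python
import queue
import threading


def run_pipelined(files, load_fn, embed_fn, max_prefetch=2):
    """Load items in a background thread while the main thread embeds them."""
    work = queue.Queue(maxsize=max_prefetch)  # bounds memory used by prefetching
    done = object()                           # sentinel marking the end of input

    def producer():
        try:
            for name in files:
                work.put((name, load_fn(name)))  # blocks when the queue is full
        finally:
            work.put(done)                       # always signal completion

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()

    results = {}
    while True:
        item = work.get()
        if item is done:
            break
        name, audio = item
        results[name] = embed_fn(audio)  # consumer work runs on the main thread
    thread.join()
    return results


# Stand-ins: "loading" builds a dummy signal, "embedding" sums it.
embeddings = run_pipelined(
    ["a.wav", "bb.wav"],
    load_fn=lambda name: [len(name)] * 3,
    embed_fn=sum,
)
print(embeddings)  # {'a.wav': 15, 'bb.wav': 18}
```

With a single producer and a FIFO queue, results are written in the same order as in a sequential loop, which matches the ordering guarantee stated above.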

get_embeddings_for_audio(sample)[source]

Create a dataloader for the processed audio frames and run batch inference. Both are methods of the self.model class, which can be found in the utils.py file.

Parameters:

sample (torch.Tensor) – preprocessed audio frames

Returns:

embeddings from model

Return type:

np.array
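The two stages named here, batching the preprocessed frames and running inference per batch, can be sketched as follows. A plain generator stands in for the torch dataloader and a simple averaging function stands in for the model; `iter_batches`, `embed_frames`, and `model_fn` are hypothetical names used only in this example.

```python
def iter_batches(frames, batch_size):
    """Yield consecutive slices of frames with at most batch_size items."""
    for start in range(0, len(frames), batch_size):
        yield frames[start:start + batch_size]


def embed_frames(frames, model_fn, batch_size):
    """Run model_fn on each batch and concatenate the per-frame outputs."""
    embeddings = []
    for batch in iter_batches(frames, batch_size):
        embeddings.extend(model_fn(batch))
    return embeddings


def mean_model(batch):
    # Stand-in model: a one-dimensional "embedding" holding the frame mean.
    return [[sum(frame) / len(frame)] for frame in batch]


result = embed_frames(
    [[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]],
    model_fn=mean_model,
    batch_size=2,
)
print(result)  # [[2.0], [3.0], [0.0]]
```

The final batch may be smaller than batch_size, so the stand-in model is written to accept batches of any length.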

get_embeddings_from_model(sample)[source]

Run the full embedding generation pipeline, either generating embeddings from audio data or generating dimensionality reductions from embedding data. Depending on which of the two is being run, sample is either an embedding array or an audio file path.

Parameters:

sample (np.array or string-like) – embedding array or path to audio file

Returns:

embeddings

Return type:

np.array

get_reduced_dimensionality_embeddings(embeds)[source]
init_dataloader(audio)[source]
run_dimensionality_reduction_pipeline()[source]
run_inference_pipeline_sequentially()[source]
run_inference_pipeline_using_multithreading()[source]

Generate embeddings for all files in a pipelined manner:

  • Producer thread loads and preprocesses audio

  • Consumer (main thread) embeds audio while producer prepares next batch

Ensures metadata and embeddings are written exactly like in the sequential version.

Parameters:

fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session

Returns:

updated object with metadata on embedding creation session

Return type:

Loader object

Module contents