bacpipe.model_pipelines package

Subpackages

bacpipe.model_pipelines.feature_extractors package

Submodules

bacpipe.model_pipelines.model_utils module

class bacpipe.model_pipelines.model_utils.ModelBaseClass(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]

Bases: object

__init__(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]

This base class defines the key methods and attributes shared by all feature extractors, so that the same processing pipeline can be used to generate embeddings with any model. The pipeline is:

1. Initialize the model with prepare_inference, which loads the model and moves it onto the selected device.

2. Load the audio and resample it to the sample rate required by the model.

3. Window the audio into segments of the input segment length required by the model.

4. Calculate spectrograms (if the model architecture is accessible) to preprocess the audio in batches and, potentially, to allow the spectrograms to be rebuilt later for inspection.

5. Initialize a torch dataloader object based on the model-specific audio loading characteristics to speed up inference and iteration over the segments.

6. Perform batch inference.

If ‘cuda’ has been selected as the device, a threading approach is used to load data in parallel while performing inference. The embeddings are returned.
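The windowing step can be illustrated with a minimal NumPy sketch. This is not bacpipe's implementation: the helper name `window_audio` and the zero-padding of the final, incomplete segment are assumptions made purely for illustration.

```python
import numpy as np


def window_audio(audio: np.ndarray, segment_length: int) -> np.ndarray:
    """Split a 1-D signal into non-overlapping segments.

    Hypothetical helper: the last segment is zero-padded to full length,
    which is an assumption and may differ from what a given model expects.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    # Ceiling division; always produce at least one segment.
    n_segments = max(1, -(-len(audio) // segment_length))
    padded = np.zeros(n_segments * segment_length, dtype=audio.dtype)
    padded[: len(audio)] = audio
    return padded.reshape(n_segments, segment_length)


sr = 16_000                                     # example sample rate in Hz
audio = np.ones(int(2.5 * sr), dtype=np.float32)  # 2.5 s of dummy audio
segments = window_audio(audio, segment_length=sr)
print(segments.shape)  # (3, 16000)
```

Each row of `segments` is one model input; the third row holds the remaining half second of audio followed by zeros.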

Parameters:

sr (int) – sample rate
segment_length (int) – segment length in samples
device (str) – ‘cpu’ or ‘cuda’
model_base_path (pathlib.Path) – path to main model checkpoint dir
global_batch_size (int) – global batch size that is used in conjunction with the segment length to calculate a model-specific batch size, resulting in approximately equal batches for different models
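The exact formula that links global_batch_size and segment_length is not given here. As a purely hypothetical illustration of the idea, the sketch below scales the global batch size by the ratio of a reference segment length to the model's segment length, so that every batch holds roughly the same number of audio samples. The function name, the `reference_length` parameter and the formula itself are assumptions, not bacpipe's code.

```python
def model_batch_size(global_batch_size: int,
                     segment_length: int,
                     reference_length: int = 16_000) -> int:
    """Hypothetical: keep the number of samples per batch roughly constant."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    scaled = global_batch_size * reference_length / segment_length
    return max(1, round(scaled))  # never return a batch size below 1


print(model_batch_size(16, 16_000))   # 16
print(model_batch_size(16, 48_000))   # 5
print(model_batch_size(16, 160_000))  # 2
```

Under this assumption, models with longer input segments get proportionally smaller batches, which keeps memory use per batch comparable across models.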

prepare_inference()[source]

preprocessing(audio)[source]

bacpipe.model_pipelines.model_utils.check_if_cudnn_tensorflow_compatible()[source]

bacpipe.model_pipelines.runner module

class bacpipe.model_pipelines.runner.Classifier(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]

Bases: object

__init__(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]

Class to handle all tasks surrounding classification: generating the classifications from embeddings, managing them, collecting them in arrays, and creating dataframes and annotation tables from them.

Parameters:

model (Model object) – has attributes for all the model characteristics like sample rate, segment length etc. as well as the methods to run the model
model_name (str) – name of the model
classifier_threshold (float, optional) – Value under which class predictions are discarded, by default None
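How a threshold of this kind turns a probability matrix into class and time-bin indices can be sketched with NumPy. The array layout (one row per class, one column per time bin) and the variable names are assumptions chosen to match the parameter names used on this page; they are not confirmed to reflect bacpipe's internals.

```python
import numpy as np

# Assumed layout: one row per class, one column per time bin.
probabilities = np.array([
    [0.10, 0.80, 0.95],   # class 0
    [0.60, 0.20, 0.05],   # class 1
    [0.30, 0.40, 0.10],   # class 2
])
classifier_threshold = 0.5

# Row and column indices of every entry above the threshold.
class_indices, class_time_bins = np.where(probabilities > classifier_threshold)

print(class_indices)    # [0 0 1]
print(class_time_bins)  # [1 2 0]
```

Class 2 never exceeds the threshold, so it does not appear in `class_indices` at all.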

classify(embeddings)[source]

static filter_top_k_classifications(probabilities, class_names, class_indices, class_time_bins, k=50)[source]

Generate a dictionary with the top k classes. Limiting the number of classes to k keeps this step from taking too long, and the resulting dictionary can be saved as a .json file to give a quick overview of the species that are well represented within an audio file.

Parameters:

probabilities (np.array) – Probabilities for each class
class_names (list) – class names
class_indices (np.array) – class indices exceeding the threshold
class_time_bins (np.array) – time bin indices exceeding the threshold
k (int, optional) – number of classes to save in the dict. Keep this below 100, otherwise the operation will slow the process down considerably, by default 50

Returns:

dictionary of top k classes with time bin indices exceeding threshold

Return type:

dict
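A minimal sketch of the top-k idea follows. It is not the library's implementation: the function name `top_k_summary`, the ranking criterion (number of time bins above the threshold) and the layout of the returned dictionary are assumptions, and the `probabilities` argument of the real method is left out for brevity.

```python
import json
from collections import Counter

import numpy as np


def top_k_summary(class_names, class_indices, class_time_bins, k=50):
    """Hypothetical: keep the k classes with the most above-threshold bins."""
    counts = Counter(int(i) for i in class_indices)
    summary = {}
    for idx, _ in counts.most_common(k):
        # Time bins in which this class exceeded the threshold.
        bins = class_time_bins[class_indices == idx]
        summary[class_names[idx]] = [int(b) for b in bins]  # plain ints for JSON
    return summary


class_names = ["robin", "wren", "owl"]
class_indices = np.array([0, 0, 1, 2, 2, 2])
class_time_bins = np.array([1, 2, 0, 3, 4, 5])

result = top_k_summary(class_names, class_indices, class_time_bins, k=2)
print(json.dumps(result))
```

With k=2, only the two most frequently detected classes ("owl" and "robin" in this toy input) are kept, and the values are plain Python integers so that the dictionary can be written to a .json file directly.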

static make_classification_dict(probabilities, classes, threshold)[source]

run_default_classifier(loader)[source]

save_Raven_table(file, relative_parent_path)[source]

save_annotation_table(loader_obj, **kwargs)[source]

save_classifier_outputs(fileloader_obj, file)[source]

class bacpipe.model_pipelines.runner.Embedder(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]

Bases: AudioHandler

This class takes care of loading the specified model and using it to process the audio data to create embeddings. It is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed will be taken from bacpipe.config and bacpipe.settings.

Parameters:

AudioHandler (class) – Helper class that handles loading of audio

__init__(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]

This class takes care of loading the specified model and using it to process the audio data to create embeddings. It is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed will be taken from bacpipe.config and bacpipe.settings.

Parameters:

model_name (str) – name of selected embedding model
loader (Loader object) – Object that has all the necessary path information and methods to load and save all the processed data
CustomModel (class, optional) – custom model class to use for processing, by default None
dim_reduction_model (bool, optional) – Can be bool or the string corresponding to the dimensionality reduction model, by default False

batch_inference(batched_samples, callback=None)[source]

embeddings_using_multithreading(array_of_audios)[source]

Generate embeddings for all files in a pipelined manner:

- Producer thread loads and preprocesses audio
- Consumer (main thread) embeds audio while producer prepares next batch

Ensures metadata and embeddings are written exactly like in the sequential version.

Parameters:

fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session

Returns:

updated object with metadata on embedding creation session

Return type:

Loader object
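The producer/consumer pattern described for this method can be sketched with the standard library alone. The helpers `load_and_preprocess` and `embed` are hypothetical stand-ins for audio loading and model inference, and `run_pipelined` is not the bacpipe method; the sketch only shows how a bounded queue lets one thread prepare the next input while the main thread processes the current one.

```python
import queue
import threading


def load_and_preprocess(path):
    """Stand-in for audio loading and preprocessing (hypothetical)."""
    return f"frames:{path}"


def embed(frames):
    """Stand-in for model inference (hypothetical)."""
    return f"embedding({frames})"


def run_pipelined(paths, max_prefetch=2):
    """Producer thread prepares inputs while the main thread embeds them."""
    q = queue.Queue(maxsize=max_prefetch)  # bounded: caps prefetched items
    sentinel = object()                    # marks the end of the stream

    def producer():
        try:
            for path in paths:
                q.put((path, load_and_preprocess(path)))
        finally:
            q.put(sentinel)  # always signal completion, even on error

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()

    results = []
    while True:
        item = q.get()
        if item is sentinel:
            break
        path, frames = item
        results.append((path, embed(frames)))
    thread.join()
    return results


results = run_pipelined(["a.wav", "b.wav", "c.wav"])
print([path for path, _ in results])  # ['a.wav', 'b.wav', 'c.wav']
```

Because a single producer feeds a first-in, first-out queue, results come out in the same order as in a sequential loop. The bounded queue keeps the producer from loading arbitrarily far ahead of inference.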

get_embeddings_for_audio(sample)[source]

Create a dataloader for the processed audio frames and run batch inference. Both are methods of the self.model class, which can be found in the utils.py file.

Parameters:

sample (torch.Tensor) – preprocessed audio frames

Returns:

embeddings from model

Return type:

np.array
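The general pattern of iterating over frames in batches and concatenating the per-batch outputs can be sketched as follows. To stay dependency-light, the sketch uses NumPy and a toy model in place of a torch dataloader and a real network; `iter_batches`, `toy_model` and `embed_frames` are hypothetical names, not part of bacpipe.

```python
import numpy as np


def iter_batches(frames: np.ndarray, batch_size: int):
    """Yield consecutive slices of at most `batch_size` frames."""
    for start in range(0, len(frames), batch_size):
        yield frames[start:start + batch_size]


def toy_model(batch: np.ndarray) -> np.ndarray:
    """Stand-in model: maps each frame to a 2-value 'embedding' (mean, std)."""
    return np.stack([batch.mean(axis=1), batch.std(axis=1)], axis=1)


def embed_frames(frames: np.ndarray, batch_size: int = 4) -> np.ndarray:
    """Run the toy model batch by batch and stack the results."""
    outputs = [toy_model(batch) for batch in iter_batches(frames, batch_size)]
    if not outputs:  # no frames: return an empty embedding array
        return np.empty((0, 2), dtype=frames.dtype)
    return np.concatenate(outputs, axis=0)


rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 16_000)).astype(np.float32)  # 10 dummy frames
embeddings = embed_frames(frames, batch_size=4)
print(embeddings.shape)  # (10, 2)
```

The ten frames are processed in batches of 4, 4 and 2, and the output has one embedding row per input frame regardless of the batch size.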

get_embeddings_from_model(sample)[source]

Run the full embedding generation pipeline, either generating embeddings from audio data or generating dimensionality reductions from embedding data. Accordingly, sample can be either an audio file path or an embedding array.

Parameters:

sample (np.array or string-like) – embedding array or path to audio file

Returns:

embeddings

Return type:

np.array

get_reduced_dimensionality_embeddings(embeds)[source]

init_dataloader(audio)[source]

run_dimensionality_reduction_pipeline()[source]

run_inference_pipeline_sequentially()[source]

run_inference_pipeline_using_multithreading()[source]

Generate embeddings for all files in a pipelined manner:

- Producer thread loads and preprocesses audio
- Consumer (main thread) embeds audio while producer prepares next batch

Ensures metadata and embeddings are written exactly like in the sequential version.

Parameters:

fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session

Returns:

updated object with metadata on embedding creation session

Return type:

Loader object
