bacpipe.model_pipelines package
Subpackages
- bacpipe.model_pipelines.feature_extractors package
- Submodules
- bacpipe.model_pipelines.feature_extractors.audiomae module
- bacpipe.model_pipelines.feature_extractors.audioprotopnet module
- bacpipe.model_pipelines.feature_extractors.aves_especies module
- bacpipe.model_pipelines.feature_extractors.avesecho_passt module
- bacpipe.model_pipelines.feature_extractors.bat module
- bacpipe.model_pipelines.feature_extractors.batdetect2_clip_avg module
- bacpipe.model_pipelines.feature_extractors.batdetect2_dets_avg module
- bacpipe.model_pipelines.feature_extractors.beats module
- bacpipe.model_pipelines.feature_extractors.biolingual module
- bacpipe.model_pipelines.feature_extractors.birdaves_especies module
- bacpipe.model_pipelines.feature_extractors.birdmae module
- bacpipe.model_pipelines.feature_extractors.birdnet module
- bacpipe.model_pipelines.feature_extractors.convnext_birdset module
- bacpipe.model_pipelines.feature_extractors.google_whale module
- bacpipe.model_pipelines.feature_extractors.hbdet module
- bacpipe.model_pipelines.feature_extractors.insect459 module
- bacpipe.model_pipelines.feature_extractors.insect66 module
- bacpipe.model_pipelines.feature_extractors.mix2 module
- bacpipe.model_pipelines.feature_extractors.naturebeats module
- bacpipe.model_pipelines.feature_extractors.perch_bird module
- bacpipe.model_pipelines.feature_extractors.perch_v2 module
- bacpipe.model_pipelines.feature_extractors.protoclr module
- bacpipe.model_pipelines.feature_extractors.rcl_fs_bsed module
- bacpipe.model_pipelines.feature_extractors.surfperch module
- bacpipe.model_pipelines.feature_extractors.vggish module
- Module contents
Submodules
bacpipe.model_pipelines.model_utils module
- class bacpipe.model_pipelines.model_utils.ModelBaseClass(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]
Bases: object
- __init__(sr, segment_length, model_name, device=None, model_base_path=None, global_batch_size=None, dim_reduction_model=False, **kwargs)[source]
This base class defines the key methods and attributes shared by all feature extractors, so that the same processing pipeline can generate embeddings for every model. The pipeline is to:
1. Initialize the model with prepare_inference, which loads the model and moves it onto the selected device.
2. Load the audio and resample it to the sample rate required by the model.
3. Window the audio into segments of the required input segment length.
4. Calculate spectrograms (if the model architecture is accessible) so that the audio can be preprocessed in batches and the spectrograms can be rebuilt later for inspection.
5. Initialize a torch dataloader object based on the model-specific audio loading characteristics, to speed up inference and the loop over the segments.
6. Perform batch inference.
If ‘cuda’ has been selected as the device, a threading approach is used to load data in parallel while performing inference. The return value is the embeddings.
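The windowing and batching steps can be pictured with a small NumPy-only sketch. This is not bacpipe's implementation: the helper names `window_audio` and `iter_batches` are hypothetical, zero-padding the last segment is an assumption, and resampling, spectrogram calculation and the model call are left out.

```python
import numpy as np


def window_audio(audio, segment_length):
    """Split a 1-D signal into non-overlapping segments of `segment_length` samples.

    Assumption: the final, shorter segment is zero-padded rather than dropped.
    """
    audio = np.asarray(audio, dtype=np.float32)
    n_segments = max(1, int(np.ceil(len(audio) / segment_length)))
    padded = np.zeros(n_segments * segment_length, dtype=np.float32)
    padded[: len(audio)] = audio
    return padded.reshape(n_segments, segment_length)


def iter_batches(segments, batch_size):
    """Yield consecutive batches of segments (the last batch may be smaller)."""
    for start in range(0, len(segments), batch_size):
        yield segments[start : start + batch_size]


# Example: 2.5 s of audio at 16 kHz, cut into 1 s segments
sr = 16_000
audio = np.random.default_rng(0).standard_normal(int(2.5 * sr))
segments = window_audio(audio, segment_length=sr)
batches = list(iter_batches(segments, batch_size=2))
```

Here `segments` has one row per model input, and each element of `batches` is what a single inference call would receive.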
- Parameters:
sr (int) – sample rate
segment_length (int) – segment length in samples
device (str) – ‘cpu’ or ‘cuda’
model_base_path (pathlib.Path) – path to main model checkpoint dir
global_batch_size (int) – global batch size that is used in conjunction with the segment length to calculate a model-specific batch size, so that batches are approximately equal in size across different models
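How global_batch_size is converted into a model-specific batch size is not spelled out here. One plausible reading, shown purely as an assumption, is that global_batch_size acts as a budget of audio seconds per batch, so models with longer segments receive fewer segments per batch. The function name `model_batch_size` and the formula are hypothetical.

```python
def model_batch_size(global_batch_size, segment_length, sr):
    """Hypothetical rule: treat `global_batch_size` as seconds of audio per batch.

    `segment_length` is in samples and `sr` in Hz, as in the parameters above.
    """
    segment_seconds = segment_length / sr
    # At least one segment per batch, even for very long segments
    return max(1, int(global_batch_size // segment_seconds))


# A 3 s model at 48 kHz and a 5 s model at 32 kHz share the same budget
batch_a = model_batch_size(global_batch_size=60, segment_length=144_000, sr=48_000)
batch_b = model_batch_size(global_batch_size=60, segment_length=160_000, sr=32_000)
```

Under this assumed rule both models process the same amount of audio per batch, even though their batch sizes differ.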
bacpipe.model_pipelines.runner module
- class bacpipe.model_pipelines.runner.Classifier(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]
Bases: object
- __init__(model, model_name, audio_dir, main_results_dir, classifier_threshold, use_folder_structure=True, save_raven_tables=False, **kwargs)[source]
Class that handles all tasks surrounding classification: generating classifications from embeddings, managing them, collecting them in arrays, and creating dataframes and annotation tables from them.
- Parameters:
model (Model object) – has attributes for all the model characteristics like sample rate, segment length etc. as well as the methods to run the model
model_name (str) – name of the model
classifier_threshold (float, optional) – Value under which class predictions are discarded, by default None
- static filter_top_k_classifications(probabilities, class_names, class_indices, class_time_bins, k=50)[source]
Generate a dictionary with the top k classes. Limiting the number of classes to k keeps this step from taking too long, while still producing a dictionary that can be saved as a .json file to give a quick overview of the species that are well represented within an audio file.
- Parameters:
probabilities (np.array) – Probabilities for each class
class_names (list) – class names
class_indices (np.array) – class indices exceeding the threshold
class_time_bins (np.array) – time bin indices exceeding the threshold
k (int, optional) – number of classes to save in the dict. Keep this below 100, otherwise the operation slows the process down considerably. By default 50
- Returns:
dictionary of top k classes with time bin indices exceeding threshold
- Return type:
dict
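A minimal sketch of such a top-k filter follows. It is not the library's code: it assumes probabilities has shape (n_classes, n_time_bins), that class_indices and class_time_bins are paired index arrays as returned by np.where, and that classes are ranked by how many time bins exceed the threshold. The function name `top_k_classes` and the dictionary layout are hypothetical.

```python
import json

import numpy as np


def top_k_classes(probabilities, class_names, class_indices, class_time_bins, k=50):
    """Keep the k classes with the most time bins above the threshold (assumed ranking)."""
    unique_classes, counts = np.unique(class_indices, return_counts=True)
    # Descending count; the stable sort keeps the lower class index first on ties
    order = np.argsort(-counts, kind="stable")[:k]
    result = {}
    for cls in unique_classes[order]:
        bins = class_time_bins[class_indices == cls]
        result[class_names[cls]] = {
            "time_bins": bins.tolist(),
            "probabilities": probabilities[cls, bins].round(3).tolist(),
        }
    return result


probabilities = np.array([
    [0.9, 0.8, 0.1],  # class "robin"
    [0.2, 0.7, 0.1],  # class "wren"
    [0.1, 0.1, 0.1],  # class "owl"
])
class_names = ["robin", "wren", "owl"]
class_indices, class_time_bins = np.where(probabilities > 0.5)
summary = top_k_classes(probabilities, class_names, class_indices, class_time_bins, k=1)
summary_json = json.dumps(summary)  # plain lists and floats serialize directly
```

Converting the arrays to lists inside the loop is what makes the result JSON-serializable without a custom encoder.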
- class bacpipe.model_pipelines.runner.Embedder(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]
Bases: AudioHandler
This class takes care of loading the specified model and using it to process the audio data to create embeddings. It is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed are taken from bacpipe.config and bacpipe.settings.
- Parameters:
AudioHandler (class) – Helper class that handles loading of audio
- __init__(model_name, loader=None, CustomModel=None, dim_reduction_model=False, **kwargs)[source]
This class takes care of loading the specified model and using it to process the audio data to create embeddings. It is also used to create dimensionality reductions from embeddings. At the end of instantiation, the selected model is loaded and associated with the specified device. kwargs that are not explicitly passed are taken from bacpipe.config and bacpipe.settings.
- Parameters:
model_name (str) – name of selected embedding model
loader (Loader object) – Object that has all the necessary path information and methods to load and save all the processed data
CustomModel (class, optional) – custom model class to use for processing, by default None
dim_reduction_model (bool, optional) – Can be bool or the string corresponding to the dimensionality reduction model, by default False
- embeddings_using_multithreading(array_of_audios)[source]
Generate embeddings for all files in a pipelined manner:
- Producer thread loads and preprocesses audio
- Consumer (main thread) embeds audio while producer prepares next batch
Ensures metadata and embeddings are written exactly like in the sequential version.
- Parameters:
fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session
- Returns:
updated object with metadata on embedding creation session
- Return type:
Loader object
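The producer/consumer arrangement described for this method can be sketched with the standard library alone. This is an illustration under stated assumptions, not bacpipe's code: `load_and_preprocess`, `embed` and `run_pipelined` are hypothetical stand-ins, and the bounded queue size is an arbitrary choice.

```python
import queue
import threading


def load_and_preprocess(path):
    """Stand-in for audio loading and preprocessing (hypothetical)."""
    return [len(path)] * 4


def embed(frames):
    """Stand-in for model inference (hypothetical)."""
    return sum(frames)


def run_pipelined(paths, max_prefetch=2):
    """Embed files in the main thread while a background thread prepares the next ones."""
    q = queue.Queue(maxsize=max_prefetch)  # bounded, so the producer cannot run far ahead
    sentinel = object()

    def producer():
        try:
            for path in paths:
                q.put((path, load_and_preprocess(path)))
        finally:
            q.put(sentinel)  # always signal the end, even after an error

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()

    results = {}
    while True:
        item = q.get()
        if item is sentinel:
            break
        path, frames = item
        results[path] = embed(frames)
    thread.join()
    return results


embeddings = run_pipelined(["a.wav", "bb.wav", "ccc.wav"])
```

Because a single producer feeds a FIFO queue, results arrive in the same order as in a sequential loop, which is consistent with the requirement that output matches the sequential version.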
- get_embeddings_for_audio(sample)[source]
Create a dataloader for the processed audio frames and run batch inference. Both are methods of the self.model class, which can be found in the utils.py file.
- Parameters:
sample (torch.Tensor) – preprocessed audio frames
- Returns:
embeddings from model
- Return type:
np.array
- get_embeddings_from_model(sample)[source]
Run the full embedding generation pipeline, either to generate embeddings from audio data or to generate dimensionality reductions from embedding data. Accordingly, sample can be an embedding array or an audio file path.
- Parameters:
sample (np.array or string-like) – embedding array or path to audio file
- Returns:
embeddings
- Return type:
np.array
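Since sample may be either an embedding array or an audio file path, a caller-side check along the following lines can make the two branches explicit. The helper `describe_sample` is hypothetical and only illustrates the type distinction; it does not reflect how the method decides internally.

```python
from pathlib import Path

import numpy as np


def describe_sample(sample):
    """Return which branch a sample would take (hypothetical helper)."""
    if isinstance(sample, np.ndarray):
        # Embedding array: input for a dimensionality reduction model
        return "dimensionality reduction"
    if isinstance(sample, (str, Path)):
        # String-like path: audio file to be embedded
        return "audio embedding"
    raise TypeError(f"unsupported sample type: {type(sample).__name__}")


branch_for_array = describe_sample(np.zeros((10, 128)))
branch_for_path = describe_sample(Path("recording.wav"))
```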
- run_inference_pipeline_using_multithreading()[source]
Generate embeddings for all files in a pipelined manner:
- Producer thread loads and preprocesses audio
- Consumer (main thread) embeds audio while producer prepares next batch
Ensures metadata and embeddings are written exactly like in the sequential version.
- Parameters:
fileloader_obj (Loader object) – contains all metadata of a model specific embedding creation session
- Returns:
updated object with metadata on embedding creation session
- Return type:
Loader object