Command line interface

birdnet_analyzer.analyze

Run birdnet_analyzer.analyze to analyze an audio file or a folder of audio files. You need to set paths for the audio input and the selection table output. Here is an example:

python -m birdnet_analyzer.analyze /path/to/audio/folder -o /path/to/output/folder

Here are two more example commands:

python3 -m birdnet_analyzer.analyze example/ --slist example/ --min_conf 0.5 --threads 4

python3 -m birdnet_analyzer.analyze example/ --lat 42.5 --lon -76.45 --week 4 --sensitivity 1.0
usage: birdnet_analyzer.analyze [-h] [-o OUTPUT] [--fmin FMIN] [--fmax FMAX]
                                [--lat LAT] [--lon LON] [--week WEEK]
                                [--slist SLIST] [--sf_thresh SF_THRESH]
                                [--sensitivity SENSITIVITY]
                                [--overlap OVERLAP]
                                [--audio_speed AUDIO_SPEED] [-t THREADS]
                                [--min_conf MIN_CONF] [-l LOCALE]
                                [-b BATCHSIZE]
                                [--rtype {table,audacity,kaleidoscope,csv} [{table,audacity,kaleidoscope,csv} ...]]
                                [--combine_results] [-c CLASSIFIER]
                                [--skip_existing_results] [--top_n TOP_N]
                                [--merge_consecutive MERGE_CONSECUTIVE]
                                INPUT

Positional Arguments

INPUT

Path to input file or folder.

Named Arguments

-o, --output

Path to output folder. Defaults to the input path.

--fmin

Minimum frequency for bandpass filter in Hz.

Default: 0

--fmax

Maximum frequency for bandpass filter in Hz.

Default: 15000

--lat

Recording location latitude. Set -1 to ignore.

Default: -1

--lon

Recording location longitude. Set -1 to ignore.

Default: -1

--week

Week of the year when the recording was made. Values in [1, 48] (4 weeks per month). Set -1 for year-round species list.

Default: -1
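The documentation does not spell out how to map a calendar date onto this 48-week scheme. One plausible convention, shown here as a hypothetical helper (the exact mapping is an assumption, not taken from the project), treats days 1-7 as the month's first week, 8-14 as the second, and folds days 29-31 into the fourth:

```python
def week_48(month: int, day: int) -> int:
    """Map a calendar date to the 48-week scheme (4 weeks per month).

    Hypothetical helper -- the exact mapping is an assumption: days 1-7
    fall in the month's first week, 8-14 in the second, and so on, with
    days 29-31 folded into the fourth week.
    """
    week_in_month = min((day - 1) // 7, 3) + 1  # clamp to 4 weeks per month
    return (month - 1) * 4 + week_in_month
```

For example, February 10 falls in week 6 under this convention.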

--slist

Path to species list file or folder. If folder is provided, species list needs to be named “species_list.txt”. If lat and lon are provided, this list will be ignored.

--sf_thresh

Minimum species occurrence frequency threshold for location filter. Values in [0.0001, 0.99].

Default: 0.03

--sensitivity

Detection sensitivity; higher values result in higher sensitivity. Values in [0.75, 1.25]. Values other than 1.0 will shift the sigmoid function on the x-axis. Use in combination with the cut-off threshold.

Default: 1.0

--overlap

Overlap of prediction segments. Values in [0.0, 2.9].

Default: 0
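BirdNET analyzes audio in 3-second windows, and the overlap value shortens the hop between consecutive windows. A sketch of how the window start times could be derived (the 3-second window length comes from the docs; the exact boundary handling is an assumption):

```python
def window_starts(duration: float, overlap: float = 0.0, window: float = 3.0):
    """Start times of analysis windows, assuming a fixed hop of
    (window - overlap) seconds. Sketch only -- the analyzer's actual
    segmentation at the end of a file may differ."""
    hop = window - overlap  # overlap in [0.0, 2.9] keeps the hop positive
    starts, t = [], 0.0
    while t + window <= duration + 1e-9:
        starts.append(round(t, 6))
        t += hop
    return starts
```

For a 9-second file with no overlap this yields windows at 0, 3, and 6 seconds; with an overlap of 1.5 seconds the hop drops to 1.5 seconds and the window count roughly doubles.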

--audio_speed

Speed factor for audio playback. Values < 1.0 will slow down the audio, values > 1.0 will speed it up. At a 10x decrease (audio speed 0.1), a 384 kHz recording becomes a 38.4 kHz recording.

Default: 1.0

-t, --threads

Number of CPU threads.

Default: 2

--min_conf

Minimum confidence threshold. Values in [0.00001, 0.99].

Default: 0.25

-l, --locale

Locale for translated species common names. Values in [‘af’, ‘en_UK’, ‘de’, ‘it’, …].

Default: 'en'

-b, --batchsize

Number of samples to process at the same time.

Default: 1

--rtype

Possible choices: table, audacity, kaleidoscope, csv

Specifies output format. Values in [‘table’, ‘audacity’, ‘kaleidoscope’, ‘csv’].

Default: {'table'}

--combine_results

Also outputs a combined file for all the selected result types. If not set, no combined tables will be generated.

Default: False

-c, --classifier

Path to custom trained classifier. If set, –lat, –lon and –locale are ignored.

--skip_existing_results

Skip files that have already been analyzed.

Default: False

--top_n

Saves only the top N predictions for each segment, regardless of their score. The confidence threshold will be ignored.

--merge_consecutive

Maximum number of consecutive detections above MIN_CONF to merge for each detected species. This will result in fewer entries in the result file with segments longer than 3 seconds. Set to 0 or 1 to disable merging. Set to None to include all consecutive detections. We use the mean of the top 3 scores from all consecutive detections for merging.

Default: 1
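The merging rule described above can be sketched as follows (a simplified illustration, not the project's actual implementation):

```python
def merge_run(detections):
    """Merge a run of consecutive detections of one species into a single
    entry. Each detection is a (start, end, score) tuple; the merged score
    is the mean of the top 3 scores, as described above. Sketch only."""
    start = min(d[0] for d in detections)
    end = max(d[1] for d in detections)
    top3 = sorted((d[2] for d in detections), reverse=True)[:3]
    return (start, end, sum(top3) / len(top3))
```

Four consecutive 3-second detections with scores 0.9, 0.6, 0.8, and 0.3 would collapse into one 12-second entry scored with the mean of 0.9, 0.8, and 0.6.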

birdnet_analyzer.client

This script will read an audio file, generate metadata from command line arguments and send it to the server. The server will then analyze the audio file and send back the detection results which will be stored as a JSON file.

usage: birdnet_analyzer.client [-h] [-o OUTPUT] [--lat LAT] [--lon LON]
                               [--week WEEK] [--slist SLIST]
                               [--sf_thresh SF_THRESH]
                               [--sensitivity SENSITIVITY] [--overlap OVERLAP]
                               [--host HOST] [-p PORT] [--pmode PMODE]
                               [--num_results NUM_RESULTS] [--save]
                               INPUT

Positional Arguments

INPUT

Path to input file or folder.

Named Arguments

-o, --output

Path to output folder. Defaults to the input path.

--lat

Recording location latitude. Set -1 to ignore.

Default: -1

--lon

Recording location longitude. Set -1 to ignore.

Default: -1

--week

Week of the year when the recording was made. Values in [1, 48] (4 weeks per month). Set -1 for year-round species list.

Default: -1

--slist

Path to species list file or folder. If folder is provided, species list needs to be named “species_list.txt”. If lat and lon are provided, this list will be ignored.

--sf_thresh

Minimum species occurrence frequency threshold for location filter. Values in [0.0001, 0.99].

Default: 0.03

--sensitivity

Detection sensitivity; higher values result in higher sensitivity. Values in [0.75, 1.25]. Values other than 1.0 will shift the sigmoid function on the x-axis. Use in combination with the cut-off threshold.

Default: 1.0

--overlap

Overlap of prediction segments. Values in [0.0, 2.9].

Default: 0

--host

Host name or IP address of API endpoint server.

Default: 'localhost'

-p, --port

Port of API endpoint server.

Default: 8080

--pmode

Score pooling mode. Values in [‘avg’, ‘max’].

Default: 'avg'

--num_results

Number of results per request.

Default: 5

--save

Whether submitted files should be stored on the server.

Default: False

birdnet_analyzer.embeddings

Run birdnet_analyzer.embeddings to extract feature embeddings instead of class predictions. The result file will contain timestamps and lists of float values representing the embedding for a particular 3-second segment. Embeddings can be used for clustering or similarity analysis. Here is an example:

python -m birdnet_analyzer.embeddings example/ --threads 4 --batchsize 16
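The extracted embedding vectors can be compared with a similarity measure. A minimal cosine-similarity sketch (file parsing is not shown; any two equal-length float vectors work):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors; higher values
    indicate more similar audio content. Works on any equal-length
    float sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical vectors score 1.0, orthogonal vectors 0.0, which makes this a convenient basis for clustering or nearest-neighbor search over segments.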
usage: birdnet_analyzer.embeddings [-h] [-db DATABASE] [--fmin FMIN]
                                   [--fmax FMAX] [--audio_speed AUDIO_SPEED]
                                   [--overlap OVERLAP] [-t THREADS]
                                   [-b BATCHSIZE] [-i INPUT]

Named Arguments

-db, --database

Path to the database folder.

--fmin

Minimum frequency for bandpass filter in Hz.

Default: 0

--fmax

Maximum frequency for bandpass filter in Hz.

Default: 15000

--audio_speed

Speed factor for audio playback. Values < 1.0 will slow down the audio, values > 1.0 will speed it up. At a 10x decrease (audio speed 0.1), a 384 kHz recording becomes a 38.4 kHz recording.

Default: 1.0

--overlap

Overlap of prediction segments. Values in [0.0, 2.9].

Default: 0

-t, --threads

Number of CPU threads.

Default: 2

-b, --batchsize

Number of samples to process at the same time.

Default: 1

-i, --input

Path to input file or folder.

birdnet_analyzer.segments

After the analysis, run birdnet_analyzer.segments to extract short audio segments for species detections so that you can verify the results. Reviewing these segments is easier than manually opening hundreds of result files.

usage: birdnet_analyzer.segments [-h] [--audio_speed AUDIO_SPEED] [-t THREADS]
                                 [--min_conf MIN_CONF] [-r RESULTS]
                                 [-o OUTPUT] [--max_segments MAX_SEGMENTS]
                                 [--seg_length SEG_LENGTH]
                                 INPUT

Positional Arguments

INPUT

Path to folder containing audio files.

Named Arguments

--audio_speed

Speed factor for audio playback. Values < 1.0 will slow down the audio, values > 1.0 will speed it up. At a 10x decrease (audio speed 0.1), a 384 kHz recording becomes a 38.4 kHz recording.

Default: 1.0

-t, --threads

Number of CPU threads.

Default: 2

--min_conf

Minimum confidence threshold. Values in [0.00001, 0.99].

Default: 0.25

-r, --results

Path to folder containing result files. Defaults to the input path.

-o, --output

Output folder path for extracted segments. Defaults to the input path.

--max_segments

Number of randomly extracted segments per species.

Default: 100

--seg_length

Minimum length of extracted segments in seconds. If a segment is shorter than this value, it will be padded with audio from the source file.

Default: 3.0

birdnet_analyzer.species

Run birdnet_analyzer.species to generate a species list for a given location and time of year. The year-round list may contain some species that are not included in any list for a specific week. See kahst#211 for more details.

usage: birdnet_analyzer.species [-h] [--lat LAT] [--lon LON] [--week WEEK]
                                [--slist SLIST] [--sf_thresh SF_THRESH]
                                [--sortby {freq,alpha}]
                                OUTPUT

Positional Arguments

OUTPUT

Path to output file or folder. If this is a folder, file will be named ‘species_list.txt’.

Named Arguments

--lat

Recording location latitude. Set -1 to ignore.

Default: -1

--lon

Recording location longitude. Set -1 to ignore.

Default: -1

--week

Week of the year when the recording was made. Values in [1, 48] (4 weeks per month). Set -1 for year-round species list.

Default: -1

--slist

Path to species list file or folder. If folder is provided, species list needs to be named “species_list.txt”. If lat and lon are provided, this list will be ignored.

--sf_thresh

Minimum species occurrence frequency threshold for location filter. Values in [0.0001, 0.99].

Default: 0.03

--sortby

Possible choices: freq, alpha

Sort species by occurrence frequency or alphabetically. Values in [‘freq’, ‘alpha’].

Default: 'freq'

birdnet_analyzer.server

You can host your own analysis service and API by launching the birdnet_analyzer.server script. This will allow you to send files to this server, store submitted files, analyze them and send detection results back to a client. This could be a local service, running on a desktop PC, or a remote server. The API can be accessed locally or remotely through a browser or Python client (or any other client implementation).

Install one additional package with pip install bottle.

Start the server with python -m birdnet_analyzer.server. You can also specify a host name or IP and port number, e.g., python -m birdnet_analyzer.server --host localhost --port 8080.

The server is single-threaded, so you’ll need to start multiple instances for higher throughput. This service is intended for short audio files (e.g., 1-10 seconds).

Query the API with a client. You can use the provided Python client or any other client implementation. The request payload needs to be multipart/form-data with the following fields: audio for the raw audio data as bytes, and meta for additional information on the audio file. Take a look at our example client implementation in the client.py script.

Parse results from the server. The server will send back a JSON response with the detection results. The response also contains a msg field, indicating success or error. Results consist of a sorted list of (species, score) tuples.

This is an example response:

{
   "msg": "success",
   "results": [
      [
            "Poecile atricapillus_Black-capped Chickadee",
            0.7889
      ],
      [
            "Spinus tristis_American Goldfinch",
            0.5028
      ],
      [
            "Junco hyemalis_Dark-eyed Junco",
            0.4943
      ],
      [
            "Baeolophus bicolor_Tufted Titmouse",
            0.4345
      ],
      [
            "Haemorhous mexicanus_House Finch",
            0.2301
      ]
   ]
}
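Parsing such a response on the client side needs only the standard library. A sketch that assumes the response shape shown above (a msg field plus a sorted results list of [species, score] pairs):

```python
import json

def top_detection(response_text: str):
    """Return the (species, score) pair of the highest-scoring detection,
    or None on error. Assumes the response shape shown above: a 'msg'
    field and a sorted 'results' list of [species, score] pairs."""
    data = json.loads(response_text)
    if data.get("msg") != "success" or not data.get("results"):
        return None
    species, score = data["results"][0]
    return species, score
```

Applied to the example response above, this would return the Black-capped Chickadee entry with a score of 0.7889.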
usage: birdnet_analyzer.server [-h] [-t THREADS] [-l LOCALE] [--host HOST]
                               [-p PORT] [--spath SPATH]

Named Arguments

-t, --threads

Number of CPU threads.

Default: 2

-l, --locale

Locale for translated species common names. Values in [‘af’, ‘en_UK’, ‘de’, ‘it’, …].

Default: 'en'

--host

Host name or IP address of API endpoint server.

Default: '0.0.0.0'

-p, --port

Port of API endpoint server.

Default: 8080

--spath

Path to folder where uploaded files should be stored.

Default: 'uploads/'

birdnet_analyzer.train

You can train your own custom classifier on top of BirdNET. This is useful if you want to detect species that are not included in the default species list. You can also use this to train a classifier for a specific location or season.

All you need is a dataset of labeled audio files, organized in folders by species (we use folder names as labels). This also works for non-bird species, as long as you have a dataset of labeled audio files.

Audio files will be resampled to 48 kHz and converted into 3-second segments (we support different crop modes for files longer than 3 seconds; shorter files are padded with random noise). We recommend using at least 100 audio files per species (although training also works with less data).
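The padding step mentioned above can be illustrated like this (a sketch, not the project's actual implementation; the noise level and placement are assumptions, and samples are floats in [-1, 1]):

```python
import random

SAMPLE_RATE = 48_000               # training audio is resampled to 48 kHz
SEGMENT_SAMPLES = 3 * SAMPLE_RATE  # 3-second segments

def pad_with_noise(samples, noise_amplitude=0.001):
    """Pad a too-short audio segment to 3 s with low-level random noise;
    longer inputs are truncated. Sketch only -- the actual noise level
    and placement in BirdNET-Analyzer may differ."""
    missing = SEGMENT_SAMPLES - len(samples)
    if missing <= 0:
        return list(samples[:SEGMENT_SAMPLES])
    noise = [random.uniform(-noise_amplitude, noise_amplitude)
             for _ in range(missing)]
    return list(samples) + noise
```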

You can download a sample training data set here.

  1. Collect training data and organize in folders based on species names.

  2. Species labels should be in the format <scientific name>_<species common name> (e.g., Poecile atricapillus_Black-capped Chickadee), but other formats work as well.

  3. It can be helpful to include a non-event class. If you name a folder ‘Noise’, ‘Background’, ‘Other’ or ‘Silence’, it will be treated as a non-event class.

  4. Run the training script with python -m birdnet_analyzer.train <path to training data folder> -o <path to trained classifier model output>.

The script saves the trained classifier model based on the best validation loss achieved during training. This ensures that the model saved is optimized for performance according to the chosen metric.

After training, you can use the custom trained classifier with the --classifier argument of the analyze.py script. If you want to use the custom classifier in Raven, make sure to set --model_format raven.

Note

Adjusting hyperparameters (e.g., number of hidden units, learning rate, etc.) can have a big impact on the performance of the classifier. We recommend trying different hyperparameter settings. If you want to automate this process, you can use the --autotune argument (in that case, make sure to install keras_tuner with pip install keras-tuner).

Example usage (when downloading and unzipping the sample training data set):

python -m birdnet_analyzer.train train_data/ -o checkpoints/custom/Custom_Classifier.tflite
python -m birdnet_analyzer.analyze example/ --classifier checkpoints/custom/Custom_Classifier.tflite

Note

Setting a custom classifier will also set the new labels file. Due to these custom labels, the location filter and locale will be disabled.

Negative samples

You can include negative samples for classes by prefixing the folder names with a ‘-’ (e.g., -Poecile atricapillus_Black-capped Chickadee). Do this with samples that definitely do not contain the species. Negative samples will only be used for training and not for validation. Also keep in mind that negative samples will only be used when a corresponding folder with positive samples exists. Negative samples cannot be used for binary classification; instead, include these samples in the non-event folder.

Multi-label data

To train with multi-label data, separate the class labels with commas in the folder names (e.g., Poecile atricapillus_Black-capped Chickadee,Cardinalis cardinalis_Northern Cardinal). This can also be combined with negative samples as described above. The validation split will be performed per combination of classes, so you might want to ensure sufficient data for each combination. When using multi-label data, the upsampling mode will be limited to ‘repeat’.
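Taken together, the folder-name conventions described above (the ‘-’ prefix for negative samples, comma-separated multi-label names, and the reserved non-event folder names) can be parsed like this (illustrative sketch only):

```python
NON_EVENT = {"Noise", "Background", "Other", "Silence"}

def parse_folder_label(name: str):
    """Interpret a training folder name according to the conventions
    described above. Returns (classes, is_negative, is_non_event).
    Sketch only -- not the project's actual label parser."""
    is_negative = name.startswith("-")
    if is_negative:
        name = name[1:]
    classes = [c.strip() for c in name.split(",")]
    is_non_event = len(classes) == 1 and classes[0] in NON_EVENT
    return classes, is_negative, is_non_event
```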

Note

Custom classifiers trained with BirdNET-Analyzer are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

usage: birdnet_analyzer.train [-h] [--fmin FMIN] [--fmax FMAX]
                              [--audio_speed AUDIO_SPEED] [-t THREADS]
                              [--overlap OVERLAP] [--crop_mode CROP_MODE]
                              [-o OUTPUT] [--epochs EPOCHS]
                              [--batch_size BATCH_SIZE]
                              [--val_split VAL_SPLIT]
                              [--learning_rate LEARNING_RATE]
                              [--hidden_units HIDDEN_UNITS]
                              [--dropout DROPOUT] [--mixup]
                              [--upsampling_ratio UPSAMPLING_RATIO]
                              [--upsampling_mode {repeat,mean,smote}]
                              [--model_format {tflite,raven,both}]
                              [--model_save_mode {replace,append}]
                              [--cache_mode {load,save}]
                              [--cache_file CACHE_FILE] [--autotune]
                              [--autotune_trials AUTOTUNE_TRIALS]
                              [--autotune_executions_per_trial AUTOTUNE_EXECUTIONS_PER_TRIAL]
                              INPUT

Positional Arguments

INPUT

Path to training data folder. Subfolder names are used as labels.

Named Arguments

--fmin

Minimum frequency for bandpass filter in Hz.

Default: 0

--fmax

Maximum frequency for bandpass filter in Hz.

Default: 15000

--audio_speed

Speed factor for audio playback. Values < 1.0 will slow down the audio, values > 1.0 will speed it up. At a 10x decrease (audio speed 0.1), a 384 kHz recording becomes a 38.4 kHz recording.

Default: 1.0

-t, --threads

Number of CPU threads.

Default: 2

--overlap

Overlap of training data segments in seconds if crop_mode is ‘segments’.

Default: 0

--crop_mode

Crop mode for training data. Can be ‘center’, ‘first’ or ‘segments’.

Default: 'center'

-o, --output

Path to trained classifier model output.

Default: 'checkpoints/custom/Custom_Classifier'

--epochs

Number of training epochs.

Default: 50

--batch_size

Batch size.

Default: 32

--val_split

Validation split ratio.

Default: 0.2

--learning_rate

Learning rate.

Default: 0.001

--hidden_units

Number of hidden units. If set to >0, a two-layer classifier is used.

Default: 0

--dropout

Dropout rate.

Default: 0.0

--mixup

Whether to use mixup for training.

Default: False

--upsampling_ratio

Balance train data and upsample minority classes. Values between 0 and 1.

Default: 0.0

--upsampling_mode

Possible choices: repeat, mean, smote

Upsampling mode.

Default: 'repeat'

--model_format

Possible choices: tflite, raven, both

Model output format.

Default: 'tflite'

--model_save_mode

Possible choices: replace, append

Model save mode. ‘replace’ will overwrite the original classification layer and ‘append’ will combine the original classification layer with the new one.

Default: 'replace'

--cache_mode

Possible choices: load, save

Cache mode. Can be ‘load’ or ‘save’.

--cache_file

Path to cache file.

Default: 'train_cache.npz'

--autotune

Whether to use automatic hyperparameter tuning (this will execute multiple training runs to search for optimal hyperparameters).

Default: False

--autotune_trials

Number of training runs for hyperparameter tuning.

Default: 50

--autotune_executions_per_trial

The number of times a training run with a set of hyperparameters is repeated during hyperparameter tuning (this reduces the variance).

Default: 1