
WilData ROI Configuration

Reference for the create-roi-dataset and bulk-create-roi-datasets YAML configuration files.

Overview

ROI (Region of Interest) configs control how classification datasets are generated from detection annotations. The process extracts image crops around each annotated bounding box and generates random background crops from unannotated regions.
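
To make the cropping step concrete, here is a minimal sketch of how a fixed-size crop window might be placed around an annotated bounding box. The function name `centered_crop_window`, the `(x, y, w, h)` box layout, and the clamping behaviour at image borders are assumptions for illustration; WilData's actual implementation may differ.

```python
def centered_crop_window(bbox, image_size, box_size=384):
    """Return (left, top, right, bottom) of a box_size crop centered on bbox.

    bbox is (x, y, w, h) in pixels; image_size is (width, height).
    The window is shifted to stay inside the image (assumed behaviour).
    """
    x, y, w, h = bbox
    img_w, img_h = image_size
    # The crop cannot be larger than the image itself.
    crop_w = min(box_size, img_w)
    crop_h = min(box_size, img_h)
    # Center of the annotated box.
    cx = x + w / 2
    cy = y + h / 2
    # Top-left corner, clamped so the window stays within the image.
    left = int(round(cx - crop_w / 2))
    top = int(round(cy - crop_h / 2))
    left = max(0, min(left, img_w - crop_w))
    top = max(0, min(top, img_h - crop_h))
    return left, top, left + crop_w, top + crop_h


# A box in the image interior, then one near the top-left corner.
print(centered_crop_window((500, 400, 40, 60), (1920, 1080)))
print(centered_crop_window((10, 10, 40, 60), (1920, 1080)))
```

Shifting the window instead of padding keeps every crop at the same size, which is convenient for classification training, but it means boxes near the border are no longer centered in their crop.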

Usage:

# Single ROI dataset
wildata create-roi-dataset -c configs/roi-create-config.yaml

# Bulk ROI creation
wildata bulk-create-roi-datasets -c configs/bulk-roi-create-config.yaml

Single ROI Config

Source Fields

| Field | Type | Description |
|---|---|---|
| source_path | str | Path to source annotation file (YOLO data.yaml or COCO/LS JSON) |
| source_format | str | Source format: coco, yolo, or ls |
| dataset_name | str | Name for the ROI dataset |
| root | str | Root directory where data is stored |
| split_name | str | Dataset split: train, val, or test |

Label Studio Fields

| Field | Type | Default | Description |
|---|---|---|---|
| ls_xml_config | str | None | Path to Label Studio XML config |
| ls_parse_config | bool | true | Parse LS config dynamically |
| bbox_tolerance | int | 5 | Bbox validation tolerance |
| draw_original_bboxes | bool | false | Draw original bboxes on ROI crops |

ROI Parameters

| Field | Type | Default | Description |
|---|---|---|---|
| roi_config.random_roi_count | int | 1 | Number of random background ROIs per image |
| roi_config.roi_box_size | int | 384 | Size of extracted ROI crops in pixels |
| roi_config.min_roi_size | int | 32 | Minimum detection size for ROI extraction |
| roi_config.dark_threshold | float | 0.5 | Dark pixel ratio threshold for filtering crops |
| roi_config.background_class | str | background | Name for background class label |
| roi_config.save_format | str | jpg | Image format for saved crops |
| roi_config.quality | int | 95 | JPEG quality for saved crops |
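
The dark_threshold setting can be read as a cap on the share of dark pixels in a crop. The sketch below shows one way such a filter could work. The helper names, the grayscale cutoff of 30, and the direction of the comparison (rejecting crops whose dark ratio exceeds the threshold) are all assumptions, not confirmed WilData behaviour.

```python
def dark_pixel_ratio(gray_pixels, dark_cutoff=30):
    """Fraction of pixels whose grayscale value (0-255) is below dark_cutoff.

    gray_pixels is a 2D list of ints; dark_cutoff is a hypothetical value.
    """
    total = 0
    dark = 0
    for row in gray_pixels:
        for value in row:
            total += 1
            if value < dark_cutoff:
                dark += 1
    # An empty crop has no dark pixels by convention.
    return dark / total if total else 0.0


def keep_crop(gray_pixels, dark_threshold=0.5):
    """Keep a crop only if its dark-pixel ratio does not exceed the threshold."""
    return dark_pixel_ratio(gray_pixels) <= dark_threshold


mostly_dark = [[0, 5, 10, 200]] * 4    # 3 of every 4 pixels are dark
mostly_lit = [[0, 120, 180, 200]] * 4  # 1 of every 4 pixels is dark
print(keep_crop(mostly_dark), keep_crop(mostly_lit))
```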

Bulk ROI Config

For bulk-create-roi-datasets, replace source_path with source_paths, a list of directories:

source_paths:
- D:/annotations/batch1/
- D:/annotations/batch2/

source_format: ls
root: D:/data
split_name: val
# ... same roi_config fields

Complete Example

source_path: configs/yolo_data.yaml
source_format: yolo
dataset_name: wildlife_roi
root: D:/data
split_name: val
bbox_tolerance: 5
draw_original_bboxes: false
ls_xml_config: null
ls_parse_config: true

roi_config:
  random_roi_count: 1
  roi_box_size: 384
  min_roi_size: 32
  dark_threshold: 0.5
  background_class: "background"
  save_format: "jpg"
  quality: 95
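
Because every roi_config field has a documented default, a partial roi_config block can be thought of as overrides on top of those defaults. The following sketch illustrates that idea with a plain dictionary. `ROI_DEFAULTS` and `resolve_roi_config` are hypothetical names, and rejecting unknown keys is an assumed policy; how WilData itself merges and validates the config is not shown in this reference.

```python
# Defaults copied from the ROI Parameters table in this reference.
ROI_DEFAULTS = {
    "random_roi_count": 1,
    "roi_box_size": 384,
    "min_roi_size": 32,
    "dark_threshold": 0.5,
    "background_class": "background",
    "save_format": "jpg",
    "quality": 95,
}


def resolve_roi_config(user_roi_config=None):
    """Merge user-supplied roi_config values over the documented defaults."""
    user = dict(user_roi_config or {})
    # Flag misspelled keys early instead of silently ignoring them (assumed policy).
    unknown = sorted(set(user) - set(ROI_DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown roi_config fields: {unknown}")
    # Build a new dict so ROI_DEFAULTS is never mutated.
    return {**ROI_DEFAULTS, **user}


resolved = resolve_roi_config({"roi_box_size": 224})
print(resolved["roi_box_size"], resolved["min_roi_size"])
```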

Output Structure

The command generates a classification-ready directory structure:

data/<dataset_name>/roi/
├── <split>/
│ ├── <class_name>/
│ │ ├── image_001_roi_0.jpg
│ │ ├── image_001_roi_1.jpg
│ │ └── ...
│ ├── background/
│ │ ├── image_001_bg_0.jpg
│ │ └── ...
│ └── ...
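
Since each class gets its own folder, a quick sanity check after generation is to count the crops per class, for example to see how the background class compares with the annotated classes. This is a minimal sketch using only the standard library; `count_crops_per_class` is a hypothetical helper, not part of WilData.

```python
from pathlib import Path


def count_crops_per_class(split_dir):
    """Map each class folder under split_dir to the number of files it contains."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        # Each subdirectory of the split is treated as one class.
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for item in class_dir.iterdir() if item.is_file()
            )
    return counts
```

Pass it the path of one split folder from the layout above, that is, the `<split>/` directory under `data/<dataset_name>/roi/`.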
