WildTrain Architecture
WildTrain is a modular training framework that supports both object detection (YOLO, MMDetection) and classification (PyTorch Lightning) with integrated experiment tracking and model management.
Overview
Purpose: Flexible model training and evaluation framework
Key Responsibilities:
- Model training (detection and classification)
- Experiment tracking with MLflow
- Hyperparameter optimization
- Model evaluation and metrics
- Model registration and versioning
Core Components
1. Model Architectures
Supported Architectures
WildTrain supports several model families out of the box:
- YOLO Detection: Integrated Ultralytics YOLO models for fast and accurate object detection.
- MMDetection: Support for advanced detection architectures, including Faster R-CNN, Mask R-CNN, and transformer-based detectors.
- Classification Models: PyTorch Lightning-based classifiers for species identification using standard backbones like ResNet, EfficientNet, and Vision Transformers.
2. Data Modules
Standardized data loading components that integrate with WilData to feed training and validation batches into the models:
- Detection DataModule: Handles bounding box datasets and associated image transformations.
- Classification DataModule: Handles ROI-based classification datasets with standard image augmentations.
3. Training Orchestration
Main Trainer
The training orchestrator manages the lifecycle of a training run, including experiment initialization, model instantiation, training loop execution, and model serialization.
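The lifecycle described above can be sketched as a small function that runs the four stages in order. This is a minimal illustration rather than WildTrain's actual trainer: `run_training`, `RunResult`, and the injected callables are hypothetical names, and experiment tracking is reduced to a list of completed stages.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class RunResult:
    """Outcome of one training run (hypothetical container)."""
    experiment_name: str
    metrics: dict
    artifact_path: str
    stages: list = field(default_factory=list)


def run_training(
    config: dict,
    build_model: Callable[[dict], Any],
    fit: Callable[[Any, dict], dict],
    save: Callable[[Any, str], str],
) -> RunResult:
    """Run the four lifecycle stages in order and record each one."""
    stages = []

    # 1. Experiment initialization: name the run. A real trainer would
    #    also seed RNGs and open a tracking run at this point.
    name = config["experiment_name"]
    stages.append("init")

    # 2. Model instantiation from the model section of the config.
    model = build_model(config["model"])
    stages.append("build")

    # 3. Training loop execution; the callable returns final metrics.
    metrics = fit(model, config["training"])
    stages.append("fit")

    # 4. Model serialization to a path derived from the experiment name.
    path = save(model, f"{name}.ckpt")
    stages.append("save")

    return RunResult(name, metrics, path, stages)
```

Passing the stage implementations in as callables keeps the ordering logic independent of the framework (YOLO, MMDetection, or PyTorch Lightning) that performs each step.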
4. Evaluation System
Metrics Computation
Integrated tools for calculating standard performance metrics:
- Classification Metrics: Accuracy, Precision, Recall, and F1-Score.
- Detection Metrics: Mean Average Precision (mAP) at various IoU thresholds.
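For reference, the quantities behind these metrics can be computed with a few lines of plain Python. This is a sketch under stated assumptions, not WildTrain's evaluation code: the function names are illustrative, labels are treated as binary with a chosen positive class, and boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates. Full mAP additionally requires ranking predictions by confidence and integrating the precision-recall curve, which is omitted here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against empty denominators (no predictions or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection counts as a true positive at a given threshold when its IoU with a ground-truth box meets that threshold, which is why mAP is reported at one or more IoU values.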
5. Hyperparameter Optimization
Optuna Integration
WildTrain integrates with Optuna to automate the search for optimal training hyperparameters like learning rate, batch size, and architectural configurations.
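A search of this kind might look like the sketch below. It assumes Optuna is installed; the search ranges, the parameter names `lr` and `batch_size`, and the `train_and_evaluate` helper are hypothetical. The helper returns a synthetic score so the example runs without a real training loop, and would be replaced by an actual training and validation run.

```python
import math


def train_and_evaluate(lr: float, batch_size: int) -> float:
    """Placeholder for a real training run (hypothetical).

    Returns a synthetic score that peaks at lr=1e-3 and batch_size=32.
    """
    return -((math.log10(lr) + 3) ** 2) - abs(batch_size - 32) / 32


def objective(trial) -> float:
    """Sample one configuration and return the score to maximize."""
    # Learning rate is sampled on a log scale, as is common practice.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64])
    return train_and_evaluate(lr, batch_size)


def run_search(n_trials: int = 20) -> dict:
    """Run the Optuna study and return the best parameters found."""
    # Imported lazily so the objective can be exercised without Optuna.
    import optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params
```

Calling `run_search()` returns a dictionary with the best `lr` and `batch_size` found within the trial budget.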
6. Model Registration
MLflow Model Registry
Models are automatically registered in the MLflow Model Registry, allowing for versioning, lifecycle stage management (Staging/Production), and metadata tracking.
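Done by hand, registration and a stage transition could be sketched as follows. This assumes MLflow is installed and that the model was logged under the artifact path `model`; the helper names are hypothetical and this is not necessarily how WildTrain performs the step. Recent MLflow releases deprecate stages in favor of aliases, so check the installed version before relying on `transition_model_version_stage`.

```python
def model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the `runs:/` URI that points at a model logged under a run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_and_stage(run_id: str, name: str, stage: str = "Staging"):
    """Register a logged model and move the new version to `stage`."""
    # Imported lazily so the URI helper works without MLflow installed.
    import mlflow
    from mlflow.tracking import MlflowClient

    # Creates the registered model if needed and adds a new version.
    version = mlflow.register_model(model_uri(run_id), name)

    # Stage-based lifecycle management (Staging/Production).
    MlflowClient().transition_model_version_stage(
        name=name, version=version.version, stage=stage
    )
    return version
```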
Configuration System
Hydra Configuration
WildTrain uses Hydra for flexible configuration management, enabling hierarchical YAML configurations and CLI-based overrides.
Configuration Structure
# configs/main.yaml
defaults:
  - model: yolo
  - data: detection
  - training: default
  - _self_

experiment_name: wildlife_detection
seed: 42

# Override from CLI:
# python main.py model=custom data.batch_size=64
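To make the override syntax concrete, the sketch below shows how a dotted override such as `data.batch_size=64` maps onto a nested configuration. It is a plain-Python illustration of the idea, not Hydra's implementation, and `apply_overrides` is a hypothetical name. It covers value overrides only: in Hydra, `model=custom` selects a different file from the `model` config group instead of setting a value.

```python
import ast
import copy


def apply_overrides(cfg: dict, overrides: list) -> dict:
    """Return a copy of `cfg` with `a.b.c=value` overrides applied."""
    result = copy.deepcopy(cfg)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")

        # Parse numbers and other Python literals; keep the rest as strings.
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw

        # Walk down to the parent mapping, creating levels as needed.
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result
```

For example, `apply_overrides({"data": {"batch_size": 16}}, ["data.batch_size=64"])` leaves the input untouched and returns a configuration whose `data.batch_size` is the integer `64`.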
Model Configs
# configs/detection/yolo.yaml
model:
  framework: "yolo"
  size: "n"  # n, s, m, l, x
  pretrained: true

training:
  epochs: 100
  imgsz: 640
  batch: 16
  optimizer: "AdamW"
  lr0: 0.001
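The keys in the `training` section match argument names accepted by the Ultralytics `train` method, so one plausible way to consume this file is to forward them as keyword arguments. The sketch below assumes the `ultralytics` package is installed; `to_train_kwargs`, `train_from_config`, and the `weights` and `data_yaml` parameters are hypothetical and not taken from WildTrain's code.

```python
def to_train_kwargs(cfg: dict) -> dict:
    """Pick the training keys that are forwarded to Ultralytics."""
    allowed = ("epochs", "imgsz", "batch", "optimizer", "lr0")
    training = cfg.get("training", {})
    return {key: training[key] for key in allowed if key in training}


def train_from_config(cfg: dict, weights: str, data_yaml: str):
    """Train a YOLO model using the loaded configuration.

    `weights` is a checkpoint name or path chosen by the caller, and
    `data_yaml` is the path to an Ultralytics dataset description file.
    """
    # Imported lazily so the kwargs helper works without Ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **to_train_kwargs(cfg))
```

Filtering through an explicit allow-list keeps unrelated configuration keys from being passed to the training call by accident.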
CLI Interface
Training, evaluation, and tuning operations are all exposed via a CLI built with Typer. Complete documentation is available in the CLI Reference.