Usage¶

LitDet provides a suite of CLI tools for the complete machine learning lifecycle: training (light-train), evaluation (light-eval) and prediction (light-predict).

It leverages the Hydra configuration framework to facilitate automated, reproducible, and scalable workflows. Thanks to Hydra capabilities, execution can be parameterized in two ways:

Direct CLI Overrides: Ideal for rapid experimentation and debugging. You can modify parameters directly via the command line (e.g., trainer.max_epochs=100) without needing to edit the underlying files.
YAML Configuration Composition (Recommended): Best suited for tracking ML experiments and facilitating team collaboration. You can combine multiple YAML modules to build reproducible workflows, or swap parameters without altering source code (e.g., using model=faster-rcnn vs. model=ssd).
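
To make the dotted override syntax concrete, here is a minimal Python sketch of how an override such as trainer.max_epochs=100 maps onto a nested configuration. It only imitates the idea and is not Hydra's parser: the helper names `parse_value` and `apply_override` and the starting values are hypothetical.

```python
import ast


def parse_value(raw):
    """Best-effort conversion of an override value; falls back to the raw string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict, creating levels as needed."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = parse_value(raw_value)
    return config


# Hypothetical starting configuration, for illustration only.
config = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 8}}
for item in ["trainer.max_epochs=100", "data.batch_size=16", "task.optimizer.lr=1e-4"]:
    apply_override(config, item)

print(config)
# {'trainer': {'max_epochs': 100}, 'data': {'batch_size': 16}, 'task': {'optimizer': {'lr': 0.0001}}}
```

Only the targeted leaves change; every other value in the configuration is left untouched, which is what makes CLI overrides convenient for quick experiments.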

Note

This page covers the standard generic workflow. To learn how to fully customize the core modules, please refer to our Advanced Configuration Guide or the Python API.

The LitDet Pipeline¶

The typical LitDet workflow follows a standard ML pipeline: prepare your dataset, train a model to generate a checkpoint, and then pass that checkpoint to the evaluation or prediction scripts.

1. Dataset preparation¶

Before running any experiment, your dataset must follow a COCO [LMB+15] directory structure, separating raw images from their JSON annotation files across your training, validation, and testing splits.

Here is the required layout:

my-coco-dataset/
├── annotations/
│   ├── train/
│   │   └── instances.json
│   ├── valid/
│   │   └── instances.json
│   └── test/
│       └── instances.json
└── images/
    ├── train/
    │   ├── 000000000001.jpg
    │   └── ...
    ├── valid/
    │   ├── 000000000050.jpg
    │   └── ...
    └── test/
        ├── 000000000099.jpg
        └── ...

Where:

images/ contains the raw image files, divided into the different partitions.
annotations/ has the corresponding partition subdirectories. Each holds the master JSON file containing all bounding box coordinates, categories, and metadata for that specific split.
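
As a quick sanity check before training, the layout above can be verified with a few lines of Python. This is a sketch under the assumption that the split names are exactly train, valid and test; the helper `missing_paths` is hypothetical and not part of LitDet.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "valid", "test")


def missing_paths(dataset_root):
    """Return the expected annotation files and image folders that are absent."""
    root = Path(dataset_root)
    missing = []
    for split in SPLITS:
        annotation_file = root / "annotations" / split / "instances.json"
        image_dir = root / "images" / split
        if not annotation_file.is_file():
            missing.append(annotation_file)
        if not image_dir.is_dir():
            missing.append(image_dir)
    return missing


# Demo on a throwaway directory that deliberately lacks the test annotations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my-coco-dataset"
    for split in SPLITS:
        (root / "images" / split).mkdir(parents=True)
        (root / "annotations" / split).mkdir(parents=True)
        if split != "test":
            (root / "annotations" / split / "instances.json").write_text("{}")
    problems = missing_paths(root)
    print([p.relative_to(root).as_posix() for p in problems])
```

An empty list means every expected folder and annotation file is present. The check does not validate the JSON content itself.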

Make sure to quality-check your dataset with a tool such as coco-viewer.

Warning

Because of how torchvision handles dataset annotations, the background category id is reserved as 0. If your annotated dataset includes the __background__ label with id 0, set data.include_background: True in the dataloader configuration.

For the exact dictionary schema required inside those JSON files, refer to the official COCO Data Format documentation.

Note

We plan to support other popular detection formats such as VOC [EVGW+10] and YOLO darknet [BWL20].

2. Training (`light-train`)¶

The light-train command is the entry point of your experiment. By default, this script handles both training the model and automatically running a test evaluation at the end using the best weights found during training.

Basic Training Run: Assuming you have a COCO-formatted dataset at /path/to/dir/my-coco-dataset:

light-train task.model=ssd paths.data_dir=/path/to/dir data.data_name=my-coco-dataset

Update model architecture: Change the model architecture, and set the number of classes so that the model matches your dataset annotations.

light-train task.model=ssdlite task.model.num_classes=3 paths.data_dir=/path/to/dir data.data_name=my-coco-dataset

Warning

Torchvision pre-trained models always include the background category, so the number of classes must be increased by one. For example, set task.model.num_classes=3 if your COCO annotation file has:

"categories": [
    {
        "id": 1,
        "name": "cat"
    },
    {
        "id": 2,
        "name": "dog"
    }
]

More information in this issue.
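
The "plus one" rule can be computed directly from an annotation file. The sketch below assumes the categories list does not already contain a background entry; the function name `num_classes_for` is hypothetical.

```python
import json


def num_classes_for(annotation_json):
    """Count the annotated categories and add one slot for the background class."""
    categories = json.loads(annotation_json)["categories"]
    return len(categories) + 1


# Same two categories as in the example above.
example = '{"categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]}'
print(num_classes_for(example))  # 3
```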

Visualize training batches: To verify your data augmentation pipeline and inspect the exact images being fed into the model, you can append the batch_visualizer callback:

light-train paths.data_dir=/path/to/dir data.data_name=my-coco-dataset +callbacks=[batch_visualizer]

Hardware & ML Parameters Overrides: You can quickly switch hardware or adjust ML hyper-parameters directly from the CLI:

light-train trainer=gpu trainer.max_epochs=100 data.batch_size=16 task.optimizer.lr=1e-4

Resume your training: Restoring training state is essential for long-running experiments. Use the ckpt_path argument to initialize the trainer with a saved checkpoint:

light-train trainer=gpu ckpt_path="/path/to/name.ckpt"

Disabling the Test Phase: If you only want to train and skip the automatic evaluation phase at the end, set test=False:

light-train test=False

Once light-train completes, it will save the .ckpt file in your output directory (defined in paths.output_dir).

3. Evaluation (`light-eval`)¶

The light-eval script evaluates an existing model against the test partition of your dataset. It requires a checkpoint path to run:

light-eval ckpt_path=/path/to/best_model.ckpt

You can append any parameter just like in training (e.g., evaluating on specific hardware or modifying batch sizes).

We also provide default metrics built directly on the Lightning framework.

Detection metrics: Model evaluation relies on the Mean Average Precision, computed for different object sizes (small, medium, and large) with the pycocotools backend.

Classification metrics: Under certain circumstances, the detection task can be reformulated as a classification problem. Using a greedy matching algorithm, each prediction counts as a true positive if its Intersection over Union (IoU) with a ground-truth bounding box exceeds 0.5. Under this paradigm, LitDet computes the F1-score and plots confusion matrices and Precision-Recall curves using your active logger (e.g., Aim).
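
The matching idea can be sketched in plain Python. This is an illustration rather than LitDet's implementation. It assumes boxes in (x_min, y_min, x_max, y_max) format, ignores class labels, processes predictions by decreasing score, and lets each ground-truth box be matched at most once.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def greedy_match(predictions, ground_truths, iou_threshold=0.5):
    """Match (box, score) predictions to ground truths; return (tp, fp, fn)."""
    unmatched = list(range(len(ground_truths)))
    tp = fp = 0
    # Highest-scoring predictions pick their ground-truth box first.
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_index, best_iou = None, iou_threshold
        for index in unmatched:
            overlap = iou(box, ground_truths[index])
            if overlap > best_iou:  # must strictly exceed the threshold
                best_index, best_iou = index, overlap
        if best_index is None:
            fp += 1
        else:
            tp += 1
            unmatched.remove(best_index)
    return tp, fp, len(unmatched)


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), defined as 0 when there is nothing to score."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [((1, 1, 10, 10), 0.9), ((0, 0, 9, 9), 0.8), ((50, 50, 60, 60), 0.7)]

tp, fp, fn = greedy_match(predictions, ground_truths)
print(tp, fp, fn)            # 1 2 1
print(f1_score(tp, fp, fn))  # 0.4
```

In this toy case the second prediction also overlaps the first ground-truth box, but that box is already taken by a higher-scoring prediction, so it is counted as a false positive.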

4. Prediction (`light-predict`)¶

The light-predict script runs inference on a folder of unannotated images, so the data does not need to follow the COCO dataset format. Like evaluation, it requires a checkpoint path.

By default, it saves the inference results to disk in the selected dataset annotation format.

light-predict ckpt_path=/path/to/best_model.ckpt paths.data_dir=/path/to/dir paths.output_dir=/path/to/predictions/dir

Warning

Writing predictions to disk in an annotation format is available only during the predict stage, not during the other stages (i.e. fit, validate or test).

Visualizing Predictions: If you want to also visually inspect the predictions, you can append the prediction_visualizer callback to the list:

light-predict ckpt_path=/path/to/best_model.ckpt +callbacks=[prediction_visualizer]

Stop saving predictions: To stop writing predictions to disk when running an inference, remove the prediction_writer callback:

light-predict ckpt_path=/path/to/best_model.ckpt '~callbacks.prediction_writer'

Hydra configuration¶

LitDet relies on Hydra to manage complex, hierarchical configurations. Rather than maintaining a single file, parameters are organized into modular YAML components (e.g., models, datasets, hardware) that are dynamically composed at runtime.

Below you will find an example of a configuration file:

YAML configuration example¶

# @package _global_

# to execute this experiment run:
# light-train +experiment=detect_example

defaults:
  - override /data: coco
  - override /task: detect
  - override /task/model: faster_rcnn
  - override /callbacks: [default, prediction_visualizer] # Add a callback to the default one
  - override /trainer: default
  - override /logger: [aim, csv, tensorboard] # Choose your loggers

# all parameters below will be merged with parameters from default configurations set above
# this allows you to overwrite only specified parameters

tags: ["detection", "coco", "fasterrcnn_resnet50_fpn_v2"]

seed: 12345

trainer:
  max_epochs: 12

data:
  auto_download_dataset: true
  is_contiguous: False # this is specific to coco2017, since they included 10 never used categories
  include_background: False # coco2017 does not include a background label
  train_transforms:
    # Rewrite all your transformation
    - _target_: torchvision.transforms.v2.RandomPhotometricDistort
      hue: [-0.1, 0.1]

    - _target_: torchvision.transforms.v2.RandomHorizontalFlip

    - _target_: torchvision.transforms.v2.ScaleJitter
      target_size: [1024, 1024]

  # Add transformations to the testing transformation pipeline
  test_transforms:
    - _target_: torchvision.transforms.v2.RandomHorizontalFlip

task:
  model:
    # Use Faster R-CNN pre-trained weights
    weights:
      _target_: hydra.utils.get_object
      path: torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT

    # Change the number of classes according to your dataset
    num_classes: 81

  # use a lr scheduler like reduce_on_plateau
  lr_scheduler_config:
    monitor: "train/loss"
    interval: "step"
    frequency: 1
    strict: True
    name: null
    scheduler:
      _target_: torch.optim.lr_scheduler.ReduceLROnPlateau
      _partial_: true
      mode: min
      factor: 0.1
      patience: 10000

logger:
  aim:
    experiment: "coco"

Configuration Overview:

The file begins with # @package _global_, which tells Hydra that parameters defined in this file should be applied directly to the root of your configuration.
Then, the defaults block is the core of Hydra’s composable architecture. It tells the framework which base configuration files to load and override (e.g. data=coco).
The remaining sections explicitly set or overwrite parameters for those base configurations (e.g. task.model.num_classes=81).

Once the configuration file is defined, initiate the run:

light-train +folder=config.yaml

As in the previous section, additional parameters can also be appended via the CLI.

Also, if your folder is not in the current working directory, you must append 'hydra.searchpath=[file:///path/to/my_folder/location]' at the end of the call.

Warning

Be careful with Hydra parameter composition: it applies a “last argument takes precedence” logic and merges configurations rather than replacing them entirely.
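
The following Python sketch illustrates what "merge rather than replace" means in practice. It mimics the behavior with plain dictionaries and is not Hydra's or OmegaConf's actual code; the `merge` helper and the accelerator key are illustrative assumptions.

```python
def merge(base, override):
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Three hypothetical layers, applied in order: defaults, experiment file, CLI.
defaults = {"trainer": {"max_epochs": 10, "accelerator": "cpu"}}
experiment = {"trainer": {"max_epochs": 12}}
cli = {"trainer": {"max_epochs": 100}}

config = merge(merge(defaults, experiment), cli)
print(config)  # {'trainer': {'max_epochs': 100, 'accelerator': 'cpu'}}
```

The last layer decides the value of max_epochs, while accelerator survives from the defaults because no later layer mentions it. Keys you do not override are therefore still present in the final configuration.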

For more details, see the official Hydra documentation.

Kickstart your own configuration¶

Constructing a configuration file from scratch can be complex due to the extensive parameters exposed by LitDet. You are encouraged to check the configuration examples or the model zoo configurations.

To streamline project setup, we provide a Cookiecutter template. Execute the following command and follow the interactive prompts to instantiate a new project workspace:

cookiecutter https://gitlab.kitware.com/litdet/litdet.git --directory "cookiecutter-litdet"
