diff --git a/docs/trackers/cbiou.md b/docs/trackers/cbiou.md new file mode 100644 index 00000000..b7591a18 --- /dev/null +++ b/docs/trackers/cbiou.md @@ -0,0 +1,193 @@ +--- +title: C-BIoU — Cascaded-Buffered IoU Tracker | Trackers +comments: true +description: C-BIoU improves association under fast or irregular motion by matching with Buffered IoU instead of plain IoU, using a ByteTrack-style pipeline. +--- + +# C-BIoU (Cascaded-Buffered IoU) + +## What is C-BIoU? + +C-BIoU builds on the same tracking pipeline as [ByteTrack](bytetrack.md) but replaces plain IoU with **Buffered IoU (BIoU)**, expanding boxes before overlap is computed so tracks and detections can still match when objects move fast or boxes barely align. It runs two association passes, with a small buffer first and a larger buffer second (`buffer_ratio_first` and `buffer_ratio_second`). It does not use image pixels, so only bounding boxes are required. C-BIoU is a strong fit for sports and dance footage where objects move fast and change direction. + +## How does C-BIoU compare to other trackers? + +For comparisons with other trackers, plus default and tuned parameters, see the [tracker comparison](comparison.md) page. The scores below use default parameters. + +| Dataset | HOTA | IDF1 | MOTA | +| :--------: | :--: | :--: | :--: | +| MOT17 | 63.0 | 79.1 | 77.4 | +| SportsMOT | 73.1 | 72.6 | 96.7 | +| SoccerNet | 82.6 | 76.6 | 97.0 | +| DanceTrack | 53.8 | 53.8 | 90.1 | + +## How does C-BIoU work? + +C-BIoU keeps the [ByteTrack](bytetrack.md)-style association pipeline used in [BoT-SORT](botsort.md) but replaces plain IoU with **Cascaded Buffered IoU** at each association step. + +**First association (b1).** High-confidence detections are matched to confirmed and lost tracks using BIoU with `buffer_ratio_first` (paper **b1**, small buffer). The BIoU similarity is fused with detection confidence. + +**Second association (b2).** Remaining *tracked* tracks (not lost) are matched to low-confidence detections using BIoU with `buffer_ratio_second` (paper **b2**, large buffer). 
In this implementation, the larger-buffer pass corresponds to ByteTrack's recovery stage for unmatched tracks and low-confidence detections. + +**Unconfirmed association (b1).** Leftover high-confidence detections are matched to unconfirmed tracks using the same buffer as pass 1. Unmatched unconfirmed tracks are removed. This step is inherited from ByteTrack's lifecycle logic, not the paper's two-buffer cascade. + +**Track lifecycle.** New tracks are initiated and confirmed with a conservative policy (`minimum_consecutive_frames`) to reduce one-frame false positives. Existing tracks that remain unmatched for more than `lost_track_buffer` frames are removed. + +## Key Parameters + +| Parameter | Purpose | Tuning guidance | +| ----------------------------------------- | --------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `lost_track_buffer` | Number of frames to keep an unmatched track alive before deletion. | A higher value tolerates longer occlusions but risks false re-association. Use 10-30 for most scenes; up to 60 for very long occlusions. | +| `track_activation_threshold` | Minimum detection confidence required to start a new track. | A higher value reduces noisy track creation; a lower value retains harder objects. 0.5-0.9 is typical, depending on detector quality. This does not control low-confidence association, which always discards detections with confidence at or below a fixed `0.1` floor. | +| `minimum_consecutive_frames` | Number of consecutive matches required before confirming a new track. | 1 for immediate activation; 2-3 improves robustness against flicker and false positives. | +| `minimum_iou_threshold_first_assoc` | Minimum fused BIoU similarity for the first association pass. 
| Lower value helps maintain matches under fast motion; higher value is stricter. | +| `minimum_iou_threshold_second_assoc` | Minimum BIoU similarity for the second association pass. | Usually set to a lower value than the first-pass threshold to recover weak detections without over-matching. | +| `minimum_iou_threshold_unconfirmed_assoc` | Minimum fused BIoU similarity when associating unconfirmed tracks. | Higher value makes tentative tracks harder to confirm spuriously; lower value helps short-lived or noisy objects survive. | +| `high_conf_det_threshold` | Confidence split between stage-1 and stage-2 detections. | 0.5-0.7 common. Higher value shifts more detections to recovery stage; lower value gives stage-1 broader coverage. | +| `buffer_ratio_first` | Paper **b1**, small BIoU buffer for the first association pass. | Typical range 0.1-0.7. Should be **less than** `buffer_ratio_second`. | +| `buffer_ratio_second` | Paper **b2**, large BIoU buffer for the second association pass. | Typical range 0.2-1.0. Should be **greater than** `buffer_ratio_first`. | + +!!! warning "Buffer ordering (b1 < b2)" + + Always set `buffer_ratio_first` < `buffer_ratio_second`. The cascaded matcher applies the **smaller** buffer first, then the **larger** buffer only on pairs that remain unmatched. Reversing the order (b1 ≥ b2) is not consistent with the paper and usually hurts performance. + +!!! warning "Frame input is ignored by C-BIoU" + + `CBIoUTracker.update()` accepts `frame` for API consistency with other trackers, but C-BIoU does not use image/frame pixels. + If you pass `frame` with a non-`None` value, the tracker emits a `UserWarning` and ignores it. + +## Run on video, webcam, or RTSP stream + +These examples use `opencv-python` for decoding and display. Replace ``, ``, and `` with your inputs. `` is usually 0 for the default camera. 
+ +=== "Video" + + ```python + import cv2 + import supervision as sv + from rfdetr import RFDETRMedium + from trackers import CBIoUTracker + + tracker = CBIoUTracker() + model = RFDETRMedium() + + box_annotator = sv.BoxAnnotator() + label_annotator = sv.LabelAnnotator() + + video_capture = cv2.VideoCapture("") + if not video_capture.isOpened(): + raise RuntimeError("Failed to open video source") + + while True: + success, frame_bgr = video_capture.read() + if not success: + break + + frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB) + detections = model.predict(frame_rgb) + detections = tracker.update(detections) + + annotated_frame = box_annotator.annotate(frame_bgr, detections) + annotated_frame = label_annotator.annotate( + annotated_frame, + detections, + labels=detections.tracker_id, + ) + + cv2.imshow("RF-DETR + C-BIoU", annotated_frame) + if cv2.waitKey(1) & 0xFF == ord("q"): + break + + video_capture.release() + cv2.destroyAllWindows() + ``` + +=== "Webcam" + + ```python + import cv2 + import supervision as sv + from rfdetr import RFDETRMedium + from trackers import CBIoUTracker + + tracker = CBIoUTracker() + model = RFDETRMedium() + + box_annotator = sv.BoxAnnotator() + label_annotator = sv.LabelAnnotator() + + video_capture = cv2.VideoCapture("") + if not video_capture.isOpened(): + raise RuntimeError("Failed to open webcam") + + while True: + success, frame_bgr = video_capture.read() + if not success: + break + + frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB) + detections = model.predict(frame_rgb) + detections = tracker.update(detections) + + annotated_frame = box_annotator.annotate(frame_bgr, detections) + annotated_frame = label_annotator.annotate( + annotated_frame, + detections, + labels=detections.tracker_id, + ) + + cv2.imshow("RF-DETR + C-BIoU", annotated_frame) + if cv2.waitKey(1) & 0xFF == ord("q"): + break + + video_capture.release() + cv2.destroyAllWindows() + ``` + +=== "RTSP" + + ```python + import cv2 + import supervision as 
sv + from rfdetr import RFDETRMedium + from trackers import CBIoUTracker + + tracker = CBIoUTracker() + model = RFDETRMedium() + + box_annotator = sv.BoxAnnotator() + label_annotator = sv.LabelAnnotator() + + video_capture = cv2.VideoCapture("") + if not video_capture.isOpened(): + raise RuntimeError("Failed to open RTSP stream") + + while True: + success, frame_bgr = video_capture.read() + if not success: + break + + frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB) + detections = model.predict(frame_rgb) + detections = tracker.update(detections) + + annotated_frame = box_annotator.annotate(frame_bgr, detections) + annotated_frame = label_annotator.annotate( + annotated_frame, + detections, + labels=detections.tracker_id, + ) + + cv2.imshow("RF-DETR + C-BIoU", annotated_frame) + if cv2.waitKey(1) & 0xFF == ord("q"): + break + + video_capture.release() + cv2.destroyAllWindows() + ``` + +For BIoU mathematics and using `BIoU(buffer_ratio=...)` on other trackers, see [IoU variants](../learn/iou.md#biou). To tune hyperparameters with Optuna, see [Hyperparameter tuning](../learn/tune.md). + +## Reference + +Yang, F., Odashima, S., Masui, S., and Jiang, S. (2023). Hard To Track Objects with Irregular Motions and Similar Appearances? Make It Easier by Buffering the Matching Space. WACV 2023. [arXiv:2211.14317](https://arxiv.org/abs/2211.14317) diff --git a/docs/trackers/comparison.md b/docs/trackers/comparison.md index f515ddd0..53d0eea8 100644 --- a/docs/trackers/comparison.md +++ b/docs/trackers/comparison.md @@ -1,11 +1,11 @@ --- -title: SORT vs ByteTrack vs OC-SORT vs BoT-SORT — MOT Benchmark Comparison | Trackers -description: Side-by-side benchmark comparison of SORT, ByteTrack, OC-SORT, and BoT-SORT on MOT17, MOT20, DanceTrack, and SportsMOT — HOTA, IDF1, MOTA scores with default and tuned parameters. 
+title: SORT vs ByteTrack vs OC-SORT vs BoT-SORT vs C-BIoU — MOT Benchmark Comparison | Trackers +description: Side-by-side benchmark comparison of SORT, ByteTrack, OC-SORT, BoT-SORT, and C-BIoU on MOT17, DanceTrack, SportsMOT, and SoccerNet — HOTA, IDF1, MOTA with default and tuned parameters. --- # Tracker Comparison -This page shows head-to-head performance of SORT, ByteTrack, OC-SORT, and BoT-SORT on standard MOT benchmarks. Results are shown with default parameters and with parameter-tuned configurations found via grid search. +This page shows head-to-head performance of SORT, ByteTrack, OC-SORT, BoT-SORT, and C-BIoU on standard MOT benchmarks. Results are shown with default parameters and with parameter-tuned configurations found via grid search or, for C-BIoU, Optuna. !!! info "Benchmark version" @@ -38,7 +38,8 @@ Pedestrian tracking with crowded scenes and frequent occlusions. Strongly tests | SORT | 58.4 | 69.9 | 67.2 | | ByteTrack | 60.1 | 73.2 | 74.1 | | OC-SORT | 61.9 | 76.4 | 76.0 | - | BoT-SORT | **63.7** | **78.7** | **79.2** | + | BoT-SORT | **63.7** | 78.7 | **79.2** | + | C-BIoU | 63.0 | **79.1** | 77.4 | === "Tuned" @@ -49,7 +50,8 @@ Pedestrian tracking with crowded scenes and frequent occlusions. Strongly tests | SORT | 60.4 | 72.5 | 75.8 | | ByteTrack | 60.5 | 72.7 | 76.1 | | OC-SORT | 62.0 | 76.5 | 77.3 | - | BoT-SORT | **63.8** | **78.7** | **79.4** | + | BoT-SORT | **63.8** | 78.7 | **79.4** | + | C-BIoU | 63.0 | **79.1** | 77.4 | Tuned configuration for each tracker. @@ -85,6 +87,18 @@ Pedestrian tracking with crowded scenes and frequent occlusions. 
Strongly tests track_activation_threshold: 0.6 enable_cmc: true cmc_method: sparseOptFlow + + C-BIoU: + lost_track_buffer: 30 + minimum_consecutive_frames: 2 + minimum_iou_threshold_first_assoc: 0.2 + minimum_iou_threshold_second_assoc: 0.5 + minimum_iou_threshold_unconfirmed_assoc: 0.3 + high_conf_det_threshold: 0.6 + track_activation_threshold: 0.7 + buffer_ratio_first: 0.3 + buffer_ratio_second: 0.5 + enable_cmc: false ``` ## [SportsMOT](https://arxiv.org/abs/2304.05170) @@ -111,6 +125,7 @@ Sports broadcast tracking with fast motion, camera pans, and similar-looking tar | ByteTrack | 73.0 | 72.5 | 96.4 | | OC-SORT | 71.7 | 71.4 | 95.0 | | BoT-SORT | **73.8** | **73.4** | **96.9** | + | C-BIoU | 73.1 | 72.6 | 96.7 | === "Tuned" @@ -122,6 +137,7 @@ Sports broadcast tracking with fast motion, camera pans, and similar-looking tar | ByteTrack | 73.3 | 73.5 | 95.9 | | OC-SORT | 74.0 | **75.4** | 95.6 | | BoT-SORT | **74.1** | 74.0 | **96.9** | + | C-BIoU | 73.1 | 72.6 | 96.7 | Tuned configuration for each tracker. @@ -157,6 +173,18 @@ Sports broadcast tracking with fast motion, camera pans, and similar-looking tar track_activation_threshold: 0.8 enable_cmc: true cmc_method: sparseOptFlow + + C-BIoU: + lost_track_buffer: 30 + minimum_consecutive_frames: 2 + minimum_iou_threshold_first_assoc: 0.2 + minimum_iou_threshold_second_assoc: 0.5 + minimum_iou_threshold_unconfirmed_assoc: 0.3 + high_conf_det_threshold: 0.6 + track_activation_threshold: 0.7 + buffer_ratio_first: 0.3 + buffer_ratio_second: 0.5 + enable_cmc: false ``` ## [SoccerNet-tracking](https://arxiv.org/abs/2204.06918) @@ -184,6 +212,7 @@ Long sequences with dense interactions and partial occlusions. Tests long-term I | ByteTrack | 84.0 | 78.1 | **97.8** | | OC-SORT | 78.4 | 72.6 | 94.1 | | BoT-SORT | **84.5** | **79.3** | 96.6 | + | C-BIoU | 82.6 | 76.6 | 97.0 | === "Tuned" @@ -191,10 +220,11 @@ Long sequences with dense interactions and partial occlusions. 
Tests long-term I | Tracker | HOTA | IDF1 | MOTA | | :-------: | :------: | :------: | :------: | - | SORT | 84.2 | 78.2 | **98.2** | - | ByteTrack | 84.0 | 78.1 | **98.2** | + | SORT | 84.2 | 78.2 | 98.2 | + | ByteTrack | 84.0 | 78.1 | 98.2 | | OC-SORT | 82.9 | 77.9 | 96.8 | - | BoT-SORT | **85.0** | **79.7** | 97.2 | + | BoT-SORT | 85.0 | 79.7 | 97.2 | + | C-BIoU | **85.7** | **80.0** | **99.3** | Tuned configuration for each tracker. @@ -230,6 +260,17 @@ Long sequences with dense interactions and partial occlusions. Tests long-term I track_activation_threshold: 0.7 enable_cmc: true cmc_method: sparseOptFlow + + C-BIoU: + lost_track_buffer: 43 + minimum_consecutive_frames: 2 + minimum_iou_threshold_first_assoc: 0.05 + minimum_iou_threshold_second_assoc: 0.46 + minimum_iou_threshold_unconfirmed_assoc: 0.27 + high_conf_det_threshold: 0.40 + track_activation_threshold: 0.48 + buffer_ratio_first: 0.68 + buffer_ratio_second: 0.50 ``` ## [DanceTrack](https://arxiv.org/abs/2111.14690) @@ -261,8 +302,9 @@ Group dancing tracking with uniform appearance, diverse motions, and extreme art | :-------: | :------: | :------: | :------: | | SORT | 45.0 | 39.0 | 80.6 | | ByteTrack | 50.2 | 49.9 | 86.2 | - | OC-SORT | **51.8** | **50.9** | **87.3** | + | OC-SORT | 51.8 | 50.9 | 87.3 | | BoT-SORT | 50.5 | 49.2 | 85.1 | + | C-BIoU | **53.8** | **53.8** | **90.1** | === "Tuned" @@ -271,9 +313,10 @@ Group dancing tracking with uniform appearance, diverse motions, and extreme art | Tracker | HOTA | IDF1 | MOTA | | :-------: | :------: | :------: | :------: | | SORT | 50.6 | 49.6 | 84.3 | - | ByteTrack | **53.2** | **54.6** | 86.8 | - | OC-SORT | 52.0 | 51.8 | **87.2** | - | BoT-SORT | **53.5** | **54.0** | 86.5 | + | ByteTrack | 53.2 | 54.6 | 86.8 | + | OC-SORT | 52.0 | 51.8 | 87.2 | + | BoT-SORT | 53.5 | 54.0 | 86.5 | + | C-BIoU | **54.6** | **57.0** | **89.6** | Tuned configuration for each tracker. 
@@ -309,6 +352,17 @@ Group dancing tracking with uniform appearance, diverse motions, and extreme art track_activation_threshold: 0.7 enable_cmc: true cmc_method: sparseOptFlow + + C-BIoU: + lost_track_buffer: 53 + minimum_consecutive_frames: 2 + minimum_iou_threshold_first_assoc: 0.11 + minimum_iou_threshold_second_assoc: 0.57 + minimum_iou_threshold_unconfirmed_assoc: 0.30 + high_conf_det_threshold: 0.36 + track_activation_threshold: 0.83 + buffer_ratio_first: 0.23 + buffer_ratio_second: 0.33 ``` ## Methodology @@ -321,9 +375,10 @@ detector following the ByteTrack procedure). The source is noted per dataset abo ### Tuning -Best parameters per tracker and dataset were found via grid search, selecting the -configuration with the highest HOTA. Tuning and evaluation always use separate data -splits to reflect real-world usage: +Best parameters per tracker and dataset were found via grid search (SORT, ByteTrack, +OC-SORT, BoT-SORT) or, for C-BIoU, Optuna (`n_trials=100`, objective HOTA, trial 0 set to +the defaults), selecting the configuration with the highest HOTA on the tuning split. Tuning +and evaluation always use separate data splits to reflect real-world usage: - Train + validation + test: tune on validation, report on test. - Train + validation: tune on train, report on validation. @@ -354,6 +409,8 @@ association, which reduces ID switches on panning or handheld footage. Use BoT-S broadcasts, drone video, or any scene where the camera moves frequently. The CMC overhead is small relative to the detector, so the trade-off favors identity stability over raw speed. +**C-BIoU** targets irregular motion and similar appearances when you want buffered, cascaded geometric matching without camera motion compensation. In these benchmarks it leads on DanceTrack with both default and tuned parameters and on SoccerNet after tuning, and it reaches the highest IDF1 on MOT17 among the trackers listed here. Use C-BIoU when BoT-SORT-style association is a good fit but CMC is unavailable or harmful, or when plain IoU matching is too strict. 
See [C-BIoU](cbiou.md) for buffer scales **b1** and **b2**. + ## Metric Definitions **HOTA** (Higher Order Tracking Accuracy) — the primary benchmark metric. HOTA decomposes diff --git a/mkdocs.yml b/mkdocs.yml index 8aa8f68d..270d29f0 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -142,6 +142,7 @@ nav: - ByteTrack: trackers/bytetrack.md - OC-SORT: trackers/ocsort.md - BoT-SORT: trackers/botsort.md + - C-BIoU: trackers/cbiou.md - API Reference: - Trackers: api/trackers.md diff --git a/src/trackers/__init__.py b/src/trackers/__init__.py index a02c147f..c318a1fd 100644 --- a/src/trackers/__init__.py +++ b/src/trackers/__init__.py @@ -9,6 +9,7 @@ from trackers.annotators.trace import MotionAwareTraceAnnotator from trackers.core.botsort.tracker import BoTSORTTracker from trackers.core.bytetrack.tracker import ByteTrackTracker +from trackers.core.cbiou.tracker import CBIoUTracker from trackers.core.ocsort.tracker import OCSORTTracker from trackers.core.sort.tracker import SORTTracker from trackers.datasets.download import download_dataset @@ -31,6 +32,7 @@ "BaseIoU", "BoTSORTTracker", "ByteTrackTracker", + "CBIoUTracker", "CIoU", "CMCConfig", "CMCMethod", diff --git a/src/trackers/core/cbiou/__init__.py b/src/trackers/core/cbiou/__init__.py new file mode 100644 index 00000000..57226e88 --- /dev/null +++ b/src/trackers/core/cbiou/__init__.py @@ -0,0 +1,5 @@ +# ------------------------------------------------------------------------ +# Trackers +# Copyright (c) 2026 Roboflow. All Rights Reserved. +# Licensed under the Apache License, Version 2.0 [see LICENSE for details] +# ------------------------------------------------------------------------ diff --git a/src/trackers/core/cbiou/tracker.py b/src/trackers/core/cbiou/tracker.py new file mode 100644 index 00000000..879bf80a --- /dev/null +++ b/src/trackers/core/cbiou/tracker.py @@ -0,0 +1,282 @@ +# ------------------------------------------------------------------------ +# Trackers +# Copyright (c) 2026 Roboflow. 
All Rights Reserved. +# Licensed under the Apache License, Version 2.0 [see LICENSE for details] +# ------------------------------------------------------------------------ + +from typing import ClassVar, cast + +import numpy as np +import supervision as sv + +from trackers.core.botsort.tracker import BoTSORTTracker +from trackers.core.botsort.tracklet import BoTSORTTracklet +from trackers.core.botsort.utils import _fuse_score, get_alive_tracklets +from trackers.utils.detections import default_confidences +from trackers.utils.iou import BIoU +from trackers.utils.state_representations import BaseStateEstimator, XCYCWHStateEstimator + + +class CBIoUTracker(BoTSORTTracker): + """Cascaded-Buffered IoU (C-BIoU) tracker. + + Implements the matching strategy from Yang et al., *Hard To Track Objects with + Irregular Motions and Similar Appearances? Make It Easier by Buffering the + Matching Space*, WACV 2023 + ([paper](https://openaccess.thecvf.com/content/WACV2023/papers/Yang_Hard_To_Track_Objects_With_Irregular_Motions_and_Similar_Appearances_WACV_2023_paper.pdf)). + + The paper proposes **Buffered IoU (BIoU)** — expanding boxes by a proportional + margin before computing overlap — and **cascaded matching** with a small buffer + scale ``b1`` followed by a larger scale ``b2`` (typically ``b1 < b2``). + + Each association step uses its own ``buffer_ratio``: + + * ``buffer_ratio_first`` — first pass (high-confidence detections vs tracks; + paper: small ``b1``). + * ``buffer_ratio_second`` — second pass (low-confidence detections; + paper: large ``b2``). + + The ByteTrack-style unconfirmed-track step (leftover high-confidence + detections vs tentative tracks) reuses **b1** (``iou_first``). + + Camera motion compensation is not used (detection-only / MOT-file workflows). + + Args: + lost_track_buffer: Time buffer (in frames at 30 FPS) for keeping lost + tracks alive before deletion. Scaled by ``frame_rate``. + frame_rate: Video frame rate used to scale the lost track buffer. 
+ track_activation_threshold: Minimum detection confidence to spawn a + new track. + minimum_consecutive_frames: Number of successful updates required + before assigning a stable track ID. + minimum_iou_threshold_first_assoc: Minimum fused similarity for the + first association step. + minimum_iou_threshold_second_assoc: Minimum BIoU similarity (no score + fusion) for the second association step. + minimum_iou_threshold_unconfirmed_assoc: Minimum fused similarity for + the unconfirmed association step. + high_conf_det_threshold: Confidence threshold splitting high / low + detections. + instant_first_frame_activation: If ``True``, first-frame tracks receive + a real ID immediately. + state_estimator_class: Kalman state representation for tracklets. + buffer_ratio_first: Buffer scale ``b1`` for the first BIoU pass. Should + be **less than** ``buffer_ratio_second`` (``b1 < b2``) per the paper. + buffer_ratio_second: Buffer scale ``b2`` for the second BIoU pass. Should + be **greater than** ``buffer_ratio_first``. 
+ """ + + tracker_id = "cbiou" + search_space: ClassVar[dict[str, dict]] = { + "lost_track_buffer": {"type": "randint", "range": [10, 91]}, + "track_activation_threshold": {"type": "uniform", "range": [0.1, 0.9]}, + "minimum_iou_threshold_first_assoc": {"type": "uniform", "range": [0.05, 0.7]}, + "minimum_iou_threshold_second_assoc": {"type": "uniform", "range": [0.05, 0.7]}, + "minimum_iou_threshold_unconfirmed_assoc": { + "type": "uniform", + "range": [0.05, 0.7], + }, + "high_conf_det_threshold": {"type": "uniform", "range": [0.3, 0.8]}, + "minimum_consecutive_frames": {"type": "randint", "range": [1, 3]}, + "buffer_ratio_first": {"type": "uniform", "range": [0.0, 0.7]}, + "buffer_ratio_second": {"type": "uniform", "range": [0.0, 0.7]}, + } + + def __init__( + self, + lost_track_buffer: int = 30, + frame_rate: float = 30.0, + track_activation_threshold: float = 0.7, + minimum_consecutive_frames: int = 2, + minimum_iou_threshold_first_assoc: float = 0.2, + minimum_iou_threshold_second_assoc: float = 0.5, + minimum_iou_threshold_unconfirmed_assoc: float = 0.3, + high_conf_det_threshold: float = 0.6, + instant_first_frame_activation: bool = True, + state_estimator_class: type[BaseStateEstimator] = XCYCWHStateEstimator, + buffer_ratio_first: float = 0.3, + buffer_ratio_second: float = 0.5, + ) -> None: + super().__init__( + lost_track_buffer=lost_track_buffer, + frame_rate=frame_rate, + track_activation_threshold=track_activation_threshold, + minimum_consecutive_frames=minimum_consecutive_frames, + minimum_iou_threshold_first_assoc=minimum_iou_threshold_first_assoc, + minimum_iou_threshold_second_assoc=minimum_iou_threshold_second_assoc, + minimum_iou_threshold_unconfirmed_assoc=minimum_iou_threshold_unconfirmed_assoc, + high_conf_det_threshold=high_conf_det_threshold, + enable_cmc=False, + instant_first_frame_activation=instant_first_frame_activation, + state_estimator_class=state_estimator_class, + ) + self.iou_first = BIoU(buffer_ratio=buffer_ratio_first) + 
self.iou_second = BIoU(buffer_ratio=buffer_ratio_second) + self.buffer_ratio_first = buffer_ratio_first + self.buffer_ratio_second = buffer_ratio_second + + def _biou_matrix( + self, + tracklets: list[BoTSORTTracklet], + boxes: np.ndarray, + iou: BIoU, + ) -> np.ndarray: + if len(tracklets) == 0: + track_boxes = np.empty((0, 4)) + else: + track_boxes = np.array([t.get_state_bbox() for t in tracklets]) + return iou.compute(track_boxes, boxes) + + def update( + self, + detections: sv.Detections, + frame: np.ndarray | None = None, + ) -> sv.Detections: + """Update the C-BIoU tracker with detections from the current frame. + + Runs the association pipeline with a distinct BIoU instance per step + (cascaded buffers per Yang et al., WACV 2023). Does not use frames or CMC. + + Args: + detections: Supervision detections for the current frame. + frame: Unused. Emits a ``UserWarning`` if provided. + + Returns: + Detections with ``tracker_id`` assigned. + """ + self._warn_if_frame_unused(frame) + self.frame_id += 1 + + if len(self.tracks) == 0 and len(detections) == 0: + result = sv.Detections.empty() + result.tracker_id = np.array([], dtype=int) + return result + + out_det_indices: list[int] = [] + out_tracker_ids: list[int] = [] + + # Predict new locations for existing tracks + for tracker in self.tracks: + tracker.predict() + + detection_boxes = detections.xyxy + confidences = default_confidences(detections) + + # Split detections into high / low / discarded by confidence + high_mask = confidences >= self.high_conf_det_threshold + low_mask = (confidences > 0.1) & (~high_mask) + high_indices = np.where(high_mask)[0] + low_indices = np.where(low_mask)[0] + high_boxes = detection_boxes[high_indices] + low_boxes = detection_boxes[low_indices] + high_scores = confidences[high_indices] + + # Split tracks into confirmed, unconfirmed, and lost. + # After predict(), time_since_update == 1 means "tracked"; > 1 means "lost". 
+ confirmed_tracks: list[BoTSORTTracklet] = [] + unconfirmed_tracks: list[BoTSORTTracklet] = [] + lost_tracks: list[BoTSORTTracklet] = [] + for track in self.tracks: + if track.time_since_update > 1: + lost_tracks.append(track) + elif track.number_of_successful_updates >= self.minimum_consecutive_frames: + confirmed_tracks.append(track) + else: + unconfirmed_tracks.append(track) + + # Step 1: associate high-confidence detections to confirmed + lost tracks. + # Paper b1 (small buffer); BIoU fused with detection scores. + strack_pool = confirmed_tracks + lost_tracks + iou_matrix = self._biou_matrix(strack_pool, high_boxes, self.iou_first) + iou_matrix = _fuse_score(self.iou_first.normalize_for_fusion(iou_matrix), high_scores) + matched, unmatched_pool, unmatched_high = self._get_associated_indices( + iou_matrix, self.minimum_iou_threshold_first_assoc + ) + + for row, col in matched: + track = strack_pool[row] + track.update(high_boxes[col]) + if track.number_of_successful_updates >= self.minimum_consecutive_frames and track.tracker_id == -1: + track.tracker_id = BoTSORTTracklet.get_next_tracker_id() + out_det_indices.append(int(high_indices[col])) + out_tracker_ids.append(track.tracker_id) + + # Step 2: associate low-confidence detections to remaining *tracked* tracks + # only (excluding lost tracks). Paper b2 (large buffer); no score fusion. 
+ remaining_tracked = [strack_pool[i] for i in unmatched_pool if strack_pool[i].time_since_update == 1] + iou_matrix = self._biou_matrix(remaining_tracked, low_boxes, self.iou_second) + matched, _, unmatched_low = self._get_associated_indices(iou_matrix, self.minimum_iou_threshold_second_assoc) + + for row, col in matched: + track = remaining_tracked[row] + track.update(low_boxes[col]) + if track.number_of_successful_updates >= self.minimum_consecutive_frames and track.tracker_id == -1: + track.tracker_id = BoTSORTTracklet.get_next_tracker_id() + out_det_indices.append(int(low_indices[col])) + out_tracker_ids.append(track.tracker_id) + + # Unmatched low-confidence detections (output with tracker_id=-1) + for det_local_idx in sorted(unmatched_low): + out_det_indices.append(int(low_indices[det_local_idx])) + out_tracker_ids.append(-1) + + # Step 3: match unconfirmed tracks with remaining unmatched high-confidence + # detections (ByteTrack lifecycle; reuses b1 / iou_first). + # Unmatched unconfirmed tracks are removed (not kept as lost). 
+ unmatched_high_list = sorted(unmatched_high) + unmatched_uc_indices: list[int] = list(range(len(unconfirmed_tracks))) + + if len(unconfirmed_tracks) > 0 and len(unmatched_high_list) > 0: + uh_boxes = high_boxes[unmatched_high_list] + uh_scores = high_scores[unmatched_high_list] + iou_matrix = self._biou_matrix(unconfirmed_tracks, uh_boxes, self.iou_first) + iou_matrix = _fuse_score(self.iou_first.normalize_for_fusion(iou_matrix), uh_scores) + matched_uc, unmatched_uc_indices, remaining_uh = self._get_associated_indices( + iou_matrix, self.minimum_iou_threshold_unconfirmed_assoc + ) + + for row, col in matched_uc: + track = unconfirmed_tracks[row] + orig_high_idx = unmatched_high_list[col] + track.update(high_boxes[orig_high_idx]) + if track.number_of_successful_updates >= self.minimum_consecutive_frames and track.tracker_id == -1: + track.tracker_id = BoTSORTTracklet.get_next_tracker_id() + out_det_indices.append(int(high_indices[orig_high_idx])) + out_tracker_ids.append(track.tracker_id) + + # Only remaining unmatched high-conf dets proceed to spawning + unmatched_high = [unmatched_high_list[i] for i in remaining_uh] + + # Remove unmatched unconfirmed tracks (following original ByteTrack) + if len(unmatched_uc_indices) > 0: + remove_ids = {id(unconfirmed_tracks[i]) for i in unmatched_uc_indices} + self.tracks = [t for t in self.tracks if id(t) not in remove_ids] + + # Spawn new tracks from unmatched high-confidence detections + self._spawn_new_tracks( + detection_boxes, + confidences, + unmatched_high, + high_indices, + out_det_indices, + out_tracker_ids, + is_first_frame=(self.frame_id == 1), + ) + + # Kill lost tracks + self.tracks = get_alive_tracklets( + tracklets=self.tracks, + maximum_frames_without_update=self.maximum_frames_without_update, + minimum_consecutive_frames=self.minimum_consecutive_frames, + ) + + if not out_det_indices: + result = sv.Detections.empty() + result.tracker_id = np.array([], dtype=int) + return result + + # Build final detections 
+ idx = np.array(out_det_indices) + result = cast(sv.Detections, detections[idx]) + result.tracker_id = np.array(out_tracker_ids, dtype=int) + return result diff --git a/tests/core/shared_ids.py b/tests/core/shared_ids.py index 8e962c8d..911e2784 100644 --- a/tests/core/shared_ids.py +++ b/tests/core/shared_ids.py @@ -6,4 +6,8 @@ """Shared test constants for tracker IDs used across test/core files.""" -ALL_TRACKER_IDS = ["sort", "bytetrack", "ocsort", "botsort"] +ALL_TRACKER_IDS = ["sort", "bytetrack", "ocsort", "botsort", "cbiou"] + +# Trackers that accept a user-supplied ``iou=`` constructor argument. +# C-BIoU is intentionally excluded: it is opinionated and always uses BIoU. +IOU_TRACKER_IDS = [tid for tid in ALL_TRACKER_IDS if tid != "cbiou"] diff --git a/tests/core/test_cbiou_tracker.py b/tests/core/test_cbiou_tracker.py new file mode 100644 index 00000000..dd20b1ff --- /dev/null +++ b/tests/core/test_cbiou_tracker.py @@ -0,0 +1,206 @@ +# ------------------------------------------------------------------------ +# Trackers +# Copyright (c) 2026 Roboflow. All Rights Reserved. +# Licensed under the Apache License, Version 2.0 [see LICENSE for details] +# ------------------------------------------------------------------------ + +"""CBIoU-specific tracker tests. + +Generic lifecycle contracts are covered in test_trackers.py via ALL_TRACKER_IDS. 
+This file covers C-BIoU-specific invariants (Yang et al., WACV 2023): + - Cascaded BIoU with per-step buffer scales (b1, b2) + - CMC disabled; frame argument triggers UserWarning + - BIoU association more tolerant than standard IoU +""" + +from __future__ import annotations + +import warnings + +import numpy as np +import pytest +import supervision as sv + +from trackers.core.botsort.tracker import BoTSORTTracker +from trackers.core.cbiou.tracker import CBIoUTracker +from trackers.utils.iou import BIoU + + +def _detection(xyxy: tuple[float, float, float, float], conf: float = 0.9) -> sv.Detections: + return sv.Detections( + xyxy=np.array([xyxy], dtype=np.float32), + confidence=np.array([conf], dtype=np.float32), + ) + + +def _make_frame(h: int = 480, w: int = 640, seed: int = 42) -> np.ndarray: + rng = np.random.default_rng(seed) + return rng.integers(0, 255, (h, w, 3), dtype=np.uint8) + + +class TestCBIoUConstruction: + def test_default_construction(self) -> None: + tracker = CBIoUTracker() + assert tracker is not None + + def test_per_step_biou_instances(self) -> None: + tracker = CBIoUTracker( + buffer_ratio_first=0.1, + buffer_ratio_second=0.3, + ) + assert isinstance(tracker.iou_first, BIoU) + assert isinstance(tracker.iou_second, BIoU) + assert not hasattr(tracker, "iou_unconfirmed") + + def test_buffer_ratios_forwarded_to_biou(self) -> None: + tracker = CBIoUTracker( + buffer_ratio_first=0.1, + buffer_ratio_second=0.3, + ) + assert tracker.iou_first.buffer_ratio == pytest.approx(0.1) + assert tracker.iou_second.buffer_ratio == pytest.approx(0.3) + + def test_cmc_disabled(self) -> None: + tracker = CBIoUTracker() + assert tracker.enable_cmc is False + assert tracker.cmc is None + + def test_tracker_id(self) -> None: + assert CBIoUTracker.tracker_id == "cbiou" + + def test_invalid_buffer_ratio_raises(self) -> None: + with pytest.raises(ValueError, match="buffer_ratio"): + CBIoUTracker(buffer_ratio_first=-0.01) + + +class TestCBIoUFrameWarning: + def 
test_frame_triggers_warning(self) -> None: + tracker = CBIoUTracker() + with pytest.warns(UserWarning): + tracker.update(_detection((100.0, 100.0, 200.0, 200.0)), frame=_make_frame()) + + def test_no_warning_without_frame(self) -> None: + tracker = CBIoUTracker() + with warnings.catch_warnings(): + warnings.simplefilter("error") + tracker.update(_detection((100.0, 100.0, 200.0, 200.0))) + + +class TestCBIoUAssociationTolerance: + """BIoU should associate near-miss detections that plain IoU would miss.""" + + def test_near_miss_associated_with_buffer(self) -> None: + """ + A track is initialized at box A; a detection at box B just outside it + should be associated by C-BIoU (the buffer expands both boxes) but not by + BoT-SORT with plain IoU (the unbuffered boxes do not overlap). + + Box A: [0, 0, 100, 100] (100x100) + Box B: [110, 0, 210, 100] (gap of 10px = 10% of width) + With buffer_ratio=0.15, each side expands by 15px, so A becomes + [-15, -15, 115, 115] and B becomes [95, -15, 225, 115]; the buffered + boxes now overlap by 20px horizontally. + """ + # Same thresholds for both trackers; frame 1 (below) spawns a track at box A + cbiou = CBIoUTracker( + buffer_ratio_first=0.15, + minimum_consecutive_frames=1, + track_activation_threshold=0.5, + minimum_iou_threshold_first_assoc=0.05, + ) + botsort = BoTSORTTracker( + enable_cmc=False, + minimum_consecutive_frames=1, + track_activation_threshold=0.5, + minimum_iou_threshold_first_assoc=0.05, + ) + + box_a = (0.0, 0.0, 100.0, 100.0) + box_b = (110.0, 0.0, 210.0, 100.0) + + cbiou.update(_detection(box_a)) + botsort.update(_detection(box_a)) + botsort_frame1_track_id = next((t.tracker_id for t in botsort.tracks), None) + + # Frame 2: detection slightly outside A; the C-BIoU buffer closes the gap + cbiou_result = cbiou.update(_detection(box_b)) + botsort_result = botsort.update(_detection(box_b)) + + assert cbiou_result.tracker_id is not None and len(cbiou_result.tracker_id) == 1 + assert cbiou_result.tracker_id[0] >= 0 + cbiou_frame1_id = cbiou.tracks[0].tracker_id + assert cbiou_result.tracker_id[0] == 
cbiou_frame1_id + + botsort_ids = botsort_result.tracker_id + if botsort_ids is not None and len(botsort_ids) > 0 and botsort_frame1_track_id is not None: + assert botsort_ids[0] != botsort_frame1_track_id + + +class TestCBIoUZeroBufferEquivalence: + """With buffer_ratio=0, BIoU recovers IoU; C-BIoU should match BoT-SORT (no CMC).""" + + def test_zero_buffer_matches_botsort_without_cmc(self) -> None: + detections = [ + _detection((0.0, 0.0, 50.0, 50.0)), + _detection((5.0, 5.0, 55.0, 55.0)), + _detection((100.0, 100.0, 150.0, 150.0)), + _detection((105.0, 105.0, 155.0, 155.0)), + _detection((8.0, 8.0, 58.0, 58.0)), + ] + + def run_tracker(tracker: CBIoUTracker | BoTSORTTracker) -> list[sv.Detections]: + tracker.reset() + return [tracker.update(det) for det in detections] + + cbiou = CBIoUTracker( + buffer_ratio_first=0.0, + buffer_ratio_second=0.0, + minimum_consecutive_frames=1, + track_activation_threshold=0.5, + minimum_iou_threshold_first_assoc=0.3, + minimum_iou_threshold_second_assoc=0.3, + minimum_iou_threshold_unconfirmed_assoc=0.3, + high_conf_det_threshold=0.6, + ) + botsort = BoTSORTTracker( + enable_cmc=False, + minimum_consecutive_frames=1, + track_activation_threshold=0.5, + minimum_iou_threshold_first_assoc=0.3, + minimum_iou_threshold_second_assoc=0.3, + minimum_iou_threshold_unconfirmed_assoc=0.3, + high_conf_det_threshold=0.6, + ) + + cbiou_results = run_tracker(cbiou) + botsort_results = run_tracker(botsort) + + for frame_idx, (r_cbiou, r_botsort) in enumerate(zip(cbiou_results, botsort_results)): + assert len(r_cbiou) == len(r_botsort), ( + f"frame {frame_idx}: CBIoU(buffer=0) and BoTSORT(no CMC) returned different " + f"detection counts ({len(r_cbiou)} vs {len(r_botsort)})" + ) + np.testing.assert_array_equal( + r_cbiou.tracker_id, + r_botsort.tracker_id, + err_msg=f"frame {frame_idx}: different tracker IDs", + ) + if len(r_cbiou) > 0: + np.testing.assert_allclose( + r_cbiou.xyxy, + r_botsort.xyxy, + err_msg=f"frame {frame_idx}: different 
boxes", + ) + + +class TestCBIoUSearchSpace: + def test_cascade_buffer_params_in_search_space(self) -> None: + ss = CBIoUTracker.search_space + assert "buffer_ratio_first" in ss + assert "buffer_ratio_second" in ss + assert "buffer_ratio_unconfirmed" not in ss + + def test_no_cmc_in_search_space(self) -> None: + ss = CBIoUTracker.search_space + assert "enable_cmc" not in ss + assert "cmc_method" not in ss diff --git a/tests/core/test_registration.py b/tests/core/test_registration.py index 4299f704..775e3ada 100644 --- a/tests/core/test_registration.py +++ b/tests/core/test_registration.py @@ -236,6 +236,7 @@ def test_tracker_is_registered(self, tracker_id: str) -> None: from trackers import ( # noqa: F401 BoTSORTTracker, ByteTrackTracker, + CBIoUTracker, OCSORTTracker, SORTTracker, ) @@ -247,6 +248,7 @@ def test_lookup_tracker(self, tracker_id: str) -> None: from trackers import ( # noqa: F401 BoTSORTTracker, ByteTrackTracker, + CBIoUTracker, OCSORTTracker, SORTTracker, ) @@ -265,6 +267,7 @@ def test_registered_trackers_returns_sorted_list(self) -> None: from trackers import ( # noqa: F401 BoTSORTTracker, ByteTrackTracker, + CBIoUTracker, OCSORTTracker, SORTTracker, ) @@ -276,6 +279,8 @@ def test_registered_trackers_returns_sorted_list(self) -> None: @pytest.mark.parametrize("tracker_id", ALL_TRACKER_IDS) def test_tracker_params_have_descriptions(self, tracker_id: str) -> None: + from trackers import CBIoUTracker # noqa: F401 + info = BaseTracker._lookup_tracker(tracker_id) assert info is not None @@ -291,6 +296,7 @@ def test_search_space_keys_match_init_params(self) -> None: from trackers import ( BoTSORTTracker, ByteTrackTracker, + CBIoUTracker, OCSORTTracker, SORTTracker, ) @@ -300,6 +306,7 @@ def test_search_space_keys_match_init_params(self) -> None: SORTTracker, OCSORTTracker, BoTSORTTracker, + CBIoUTracker, ): init_params = set(inspect.signature(tracker_cls.__init__).parameters) - {"self"} for key in tracker_cls.search_space: diff --git 
a/tests/core/test_trackers.py b/tests/core/test_trackers.py index 68964395..92654628 100644 --- a/tests/core/test_trackers.py +++ b/tests/core/test_trackers.py @@ -30,7 +30,7 @@ from trackers.core.sort.tracker import SORTTracker from trackers.utils.iou import BaseIoU -from .shared_ids import ALL_TRACKER_IDS +from .shared_ids import ALL_TRACKER_IDS, IOU_TRACKER_IDS # --------------------------------------------------------------------------- # Helpers @@ -119,7 +119,7 @@ def test_tracker_update_empty_does_not_mutate_input(tracker_id: str) -> None: assert result is not dets, "update() must return a new sv.Detections instance" -@pytest.mark.parametrize("tracker_id", ALL_TRACKER_IDS) +@pytest.mark.parametrize("tracker_id", IOU_TRACKER_IDS) def test_tracker_uses_configured_iou_variant(tracker_id: str) -> None: """Trackers should use the configured IoU implementation for matching.""" tracking_iou = _TrackingIoU() @@ -143,6 +143,7 @@ def test_no_confidence_detections_can_spawn_confirmed_tracks(tracker_id: str) -> raise AssertionError(f"{tracker_id} did not confirm any track for confidence=None detections") +@pytest.mark.parametrize("tracker_id", ALL_TRACKER_IDS) @pytest.mark.parametrize( "xyxy_boxes", [ @@ -165,16 +166,15 @@ def test_no_confidence_detections_can_spawn_confirmed_tracks(tracker_id: str) -> ], ids=["single_box", "two_boxes", "three_boxes_non_overlapping"], ) -def test_bytetrack_no_confidence_matches_explicit_ones_confidence(xyxy_boxes: np.ndarray) -> None: - """ByteTrack treats confidence=None the same as all-ones across multi-box batches. +def test_no_confidence_matches_explicit_ones_confidence(tracker_id: str, xyxy_boxes: np.ndarray) -> None: + """Every tracker treats confidence=None the same as all-ones across multi-box batches. 
- The batched scenarios exercise the high/low split machinery in - `ByteTrackTracker.update()` that single-box equivalence cannot trigger; a - regression that mis-buckets `confidence=None` in a multi-detection batch - would still pass single-box equality but would diverge here. + Multi-detection batches exercise confidence bucketing in trackers that split + high/low detections; a regression that mis-buckets ``confidence=None`` would + still pass single-box equality but diverge here. """ - no_confidence_tracker = ByteTrackTracker(minimum_consecutive_frames=1) - explicit_confidence_tracker = ByteTrackTracker(minimum_consecutive_frames=1) + no_confidence_tracker = _instantiate(tracker_id, minimum_consecutive_frames=1) + explicit_confidence_tracker = _instantiate(tracker_id, minimum_consecutive_frames=1) class_ids = np.zeros(len(xyxy_boxes), dtype=int) detection_without_confidence = sv.Detections(xyxy=xyxy_boxes.copy(), class_id=class_ids.copy()) detection_with_ones_confidence = sv.Detections( diff --git a/tests/data/tracker_expected_dancetrack.json b/tests/data/tracker_expected_dancetrack.json index 6b9f8cdf..1b527958 100644 --- a/tests/data/tracker_expected_dancetrack.json +++ b/tests/data/tracker_expected_dancetrack.json @@ -22,5 +22,11 @@ "MOTA": 99.605, "IDF1": 76.841, "IDSW": 608 + }, + "cbiou": { + "HOTA": 80.156, + "MOTA": 99.623, + "IDF1": 77.198, + "IDSW": 614 } } diff --git a/tests/data/tracker_expected_sportsmot.json b/tests/data/tracker_expected_sportsmot.json index f2cbda21..42afbf2a 100644 --- a/tests/data/tracker_expected_sportsmot.json +++ b/tests/data/tracker_expected_sportsmot.json @@ -22,5 +22,11 @@ "MOTA": 98.884, "IDF1": 80.626, "IDSW": 983 + }, + "cbiou": { + "HOTA": 87.386, + "MOTA": 99.547, + "IDF1": 82.762, + "IDSW": 605 } }