Theme: Model-based and model-free 2D/6D object detection on BOP-Classic, BOP-H3 and new BOP-Industrial datasets (XYZ-IBD, ITODD-MV, IPD).
BOP Challenge 2025 is associated with the OpenCV Perception Challenge for Bin-picking, which offers $60,000 in prizes and is co-organized by OpenCV, Intrinsic, BOP, Orbbec, and the University of Hawaii at Manoa. Phase 1 of the OpenCV challenge is evaluated on the IPD dataset, which is also included in BOP Challenge 2025 – participants can submit the same IPD results to both challenges.
BOP 2017-2024: To measure progress in 6D object pose estimation and related tasks, we created the BOP benchmark in 2017 and have been organizing challenges on the benchmark datasets, together with the R6D workshops, ever since. The field has come a long way, with the accuracy in model-based 6D localization of seen objects improving by more than 50% (from 56.9 to 86.0 AR) on the seven BOP core datasets. Since 2023, we have been evaluating a more practical yet more challenging task of model-based 6D localization of unseen objects, where new objects need to be onboarded just from their CAD models within at most 5 minutes on a single GPU. In 2024, the best method for this task achieved an impressive 82.5 AR. In 2024, we further defined a new 6D object detection task, introduced model-free variants of all tasks, and added the BOP-H3 datasets focused on AR/VR scenarios.
New in BOP 2025: We are introducing the BOP-Industrial datasets and the multi-view problem setup. See below for details.
In 2025, we are introducing a new BOP-Industrial group of datasets specifically focused on robotic bin picking. BOP-Industrial includes three datasets – XYZ-IBD from XYZ Robotics, ITODD-MV from MVTec, and IPD from Intrinsic – showing cluttered scenes with industrial objects recorded with different industrial sensors (grayscale, multi-view, structured-light depth).
Industrial settings often rely on multiple top-down views. To address this, we provide multi-view images in all BOP-Industrial datasets, which enables a direct comparison of single- and multi-view approaches. By leveraging multi-view input images, participants can refine pose estimates and resolve pose ambiguities inherent in single-view approaches.
Deadlines for submitting results:
Challenge tracks on BOP-Classic-Core datasets:
Challenge tracks on BOP-H3 datasets:
Challenge tracks on BOP-Industrial datasets (single- and multi-view):
Awards for all tracks:
Extra awards for tracks on 6D detection and localization:
Extra awards for tracks on 6D detection on BOP-Industrial:
For track 1 on 6D localization, please follow instructions from BOP Challenge 2019.
For tracks 3, 5, 7, 9, 11 on 2D detection, please follow instructions from BOP Challenge 2022.
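For orientation only, here is a hedged sketch of a single 2D detection entry in the COCO-style JSON consumed by the 2022-style evaluation; the field names (scene_id, image_id, category_id, bbox, score, time) are carried over from the BOP 2022 format and should be verified against those instructions:

```python
import json

# Hedged sketch of a 2D detection result in the COCO-style format used since
# BOP Challenge 2022; check the exact field set against the 2022 instructions.
detections = [
    {
        "scene_id": 1,
        "image_id": 3,
        "category_id": 5,                    # BOP object id
        "bbox": [217.0, 130.5, 64.0, 48.0],  # [x, y, width, height] in pixels
        "score": 0.92,
        "time": 0.15,                        # per-image processing time in seconds
    }
]
with open("detections_coco.json", "w") as f:
    json.dump(detections, f)
```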
For tracks 2, 4, 6, 8, 10 on 6D detection, the only difference compared to 6D localization is that no information about the target objects in the test scenes is provided. That means participants can only use test_targets_bop24.json (providing a list of test images) instead of test_targets_bop19.json (providing a list of test images together with identities of the object instances to localize).
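For illustration, here is a minimal sketch of how the two target files differ, assuming the JSON layout used in previous editions (the field names scene_id, im_id, obj_id, inst_count are assumptions carried over from BOP 2019/2024):

```python
import json

# Hedged sketch, not an official loader: assumes the standard BOP target-file
# layout, i.e. a JSON list of records with field names from previous editions.
with open("test_targets_bop19.json") as f:   # 6D localization targets
    loc_targets = json.load(f)
# Each record specifies which object instances to localize in which image,
# e.g. {"scene_id": 1, "im_id": 3, "obj_id": 5, "inst_count": 2}.

with open("test_targets_bop24.json") as f:   # 2D/6D detection targets
    det_targets = json.load(f)
# Each record only identifies a test image, e.g. {"scene_id": 1, "im_id": 3};
# no object identities are revealed for the detection tracks.

images_to_process = sorted({(t["scene_id"], t["im_id"]) for t in det_targets})
```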
In the BOP Challenge 2025, we additionally evaluate multi-view methods on the BOP-Industrial datasets. Authors of multi-view methods can use the views specified in the list im_ids in the test_targets_multiview_bop25.json files. The pose estimates are expected to be defined with respect to the camera associated with the first image listed in im_ids. Single- and multi-view methods are distinguished when creating a method and will be labeled accordingly in the leaderboards.
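Below is a minimal sketch of this bookkeeping, assuming a record layout with scene_id and im_ids in test_targets_multiview_bop25.json (an assumption) and that relative camera poses between the views are available via the standard cam_R_w2c/cam_t_w2c fields of a per-scene scene_camera.json; it shows how a pose estimated in view i could be re-expressed in the frame of the first camera in im_ids:

```python
import json
import numpy as np

# Hedged sketch of multi-view bookkeeping; the layout of the multi-view target
# records is an assumption, while cam_R_w2c/cam_t_w2c follow the standard BOP
# scene_camera format.
with open("test_targets_multiview_bop25.json") as f:
    mv_targets = json.load(f)
with open("scene_camera.json") as f:  # per-scene camera parameters
    scene_camera = json.load(f)

def cam_pose_w2c(im_id):
    """World-to-camera rotation and translation of a given view."""
    cam = scene_camera[str(im_id)]
    R = np.array(cam["cam_R_w2c"]).reshape(3, 3)
    t = np.array(cam["cam_t_w2c"]).reshape(3, 1)
    return R, t

target = mv_targets[0]            # e.g. {"scene_id": ..., "im_ids": [...]}
im_ids = target["im_ids"]
R0, t0 = cam_pose_w2c(im_ids[0])  # reference camera = first image in im_ids

def to_reference_frame(R_obj_ci, t_obj_ci, im_id_i):
    """Re-express an object pose estimated in camera i in the reference camera 0."""
    Ri, ti = cam_pose_w2c(im_id_i)
    R_c0_ci = R0 @ Ri.T               # camera-i-to-camera-0 rotation
    t_c0_ci = t0 - R_c0_ci @ ti       # camera-i-to-camera-0 translation
    return R_c0_ci @ R_obj_ci, R_c0_ci @ t_obj_ci + t_c0_ci
```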
Results for all tracks should be submitted via this form. The online evaluation system uses the script eval_bop19_pose.py to evaluate 6D localization results, eval_bop24_pose.py to evaluate 6D detection results, and eval_bop22_coco.py to evaluate 2D detection/segmentation results.
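As an illustration of what the pose-evaluation scripts consume, here is a hedged sketch of writing estimates in the CSV format used in previous editions (header scene_id,im_id,obj_id,score,R,t,time, with R as 9 row-major values and t in millimeters); please verify the exact format against the submission instructions:

```python
import numpy as np

# Hedged sketch of writing 6D pose estimates in the BOP-style CSV format used
# in previous challenge editions; R is written row-major as 9 space-separated
# values and t as 3 space-separated values in millimeters.
def save_bop_csv(path, estimates):
    lines = ["scene_id,im_id,obj_id,score,R,t,time"]
    for e in estimates:
        R = " ".join(f"{v:.6f}" for v in np.asarray(e["R"]).reshape(-1))
        t = " ".join(f"{v:.6f}" for v in np.asarray(e["t"]).reshape(-1))
        lines.append(
            f'{e["scene_id"]},{e["im_id"]},{e["obj_id"]},'
            f'{e["score"]:.6f},{R},{t},{e["time"]:.3f}'
        )
    with open(path, "w") as f:
        f.write("\n".join(lines))
```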
A detailed description of an example baseline that relies on YOLOv11 for object detection and a simple network for pose regression is provided on GitHub.
IMPORTANT: When creating a method, please document your method in the "Description" field using this template:
*Submitted to:* BOP Challenge 2025
*Training data:* Type of images (e.g. real, provided PBR, custom synthetic, real + provided PBR), etc.
*Onboarding data:* Only for Tracks 6 and 7; type of onboarding (static, dynamic, model-based), etc.
*Notes:* Values of important method parameters, used 2D detection/segmentation method, details about the object representation built during onboarding, etc.
Martin Sundermeyer, Google
Junwen Huang, TU Munich
Médéric Fourmy, Czech Technical University in Prague
Van Nguyen Nguyen, ENPC ParisTech
Agastya Kalra, Intrinsic
Vahe Taamazyan, Intrinsic
Caryn Tran, Northwestern University
Stephen Tyree, NVIDIA
Anas Gouda, TU Dortmund
Jonathan Tremblay, NVIDIA
Eric Brachmann, Niantic
Bertram Drost, MVTec
Vincent Lepetit, ENPC ParisTech
Carsten Rother, Heidelberg University
Stan Birchfield, NVIDIA
Jiří Matas, Czech Technical University in Prague
Tomáš Hodaň, Reality Labs at Meta