The datasets are split into groups that may differ in the type of included data and in the tasks that can be evaluated on them:
Group | Datasets | Content | Supported tasks |
---|---|---|---|
BOP-Classic | BOP-Classic-Core: LM-O, T-LESS, ITODD, HB, YCB-V, IC-BIN, TUD-L (seven datasets used for BOP challenges since 2019)<br>BOP-Classic-Extra: LM, HOPEv1, RU-APC, IC-MI, TYO-L | | |
BOP-H3 | HOT3D, HOPEv2, HANDAL | | |
The datasets are provided on the BOP HuggingFace Hub and are stored in the BOP format. The BOP toolkit expects all datasets to be stored in the same folder, each dataset in a subfolder named with the base name of the dataset (e.g. "lm", "lmo", "tless"). The example below shows how to download and unpack one of the datasets (LM) from Bash (names of the archives with the other datasets can be seen in the download links below):
```bash
export SRC=https://huggingface.co/datasets/bop-benchmark/datasets/resolve/main

wget $SRC/lm/lm_base.zip       # Base archive with dataset info, camera parameters, etc.
wget $SRC/lm/lm_models.zip     # 3D object models.
wget $SRC/lm/lm_test_all.zip   # All test images ("_bop19" for a subset used in the BOP Challenge 2019/2020).
wget $SRC/lm/lm_train_pbr.zip  # PBR training images (rendered with BlenderProc4BOP).

unzip lm_base.zip              # Contains folder "lm".
unzip lm_models.zip -d lm      # Unpacks to "lm".
unzip lm_test_all.zip -d lm    # Unpacks to "lm".
unzip lm_train_pbr.zip -d lm   # Unpacks to "lm".
```
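Since the toolkit expects all datasets side by side in one folder, the same commands can be put in a loop. The sketch below is a minimal example for a few BOP-Classic-Core datasets and assumes their base and model archives follow the `<name>_base.zip` / `<name>_models.zip` naming shown above (the test and training archive names differ per dataset, see the download links below):

```bash
export SRC=https://huggingface.co/datasets/bop-benchmark/datasets/resolve/main

# Download and unpack the base and model archives for several datasets,
# so that each dataset ends up in a subfolder named with its base name.
for NAME in lmo tless ycbv; do
  wget $SRC/$NAME/${NAME}_base.zip    # Dataset info, camera parameters, etc.
  wget $SRC/$NAME/${NAME}_models.zip  # 3D object models.
  unzip ${NAME}_base.zip              # Contains folder "$NAME".
  unzip ${NAME}_models.zip -d $NAME   # Unpacks to "$NAME".
done
```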
The datasets can also be downloaded using the HuggingFace CLI and unpacked using extract_bop.sh (more options are available at bop-benchmark):
```bash
pip install -U "huggingface_hub[cli]"

export LOCAL_DIR=./datasets/
export NAME=lm

huggingface-cli download bop-benchmark/datasets --include "$NAME/*" --local-dir $LOCAL_DIR --repo-type=dataset
bash extract_bop.sh
```
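To fetch several datasets in one go, multiple patterns can be passed to --include. A minimal sketch (the patterns are assumed to match the subfolder names in the bop-benchmark/datasets repository):

```bash
export LOCAL_DIR=./datasets/

# Download several datasets at once; --include accepts multiple glob patterns.
huggingface-cli download bop-benchmark/datasets \
  --include "lmo/*" "tless/*" "ycbv/*" \
  --local-dir $LOCAL_DIR --repo-type=dataset

bash extract_bop.sh  # Unpack the downloaded archives.
```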
MegaPose training dataset for the tasks on unseen objects of the 2023 and 2024 challenges.
Banerjee et al.: Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking, paper, project website, license agreement.
HOT3D is a dataset for egocentric hand and object tracking in 3D. The dataset offers multi-view, RGB/monochrome, fisheye image streams showing 19 subjects interacting with 33 diverse rigid objects. The dataset also offers comprehensive ground-truth annotations including 3D poses of objects, hands, and cameras, and 3D models of hands and objects. In addition to simple pick-up/observe/put-down actions, HOT3D contains scenarios resembling typical actions in a kitchen, office, and living room environment. The dataset is recorded by two head-mounted devices from Meta: Project Aria, a research prototype of light-weight AR/AI glasses, and Quest 3, a production VR headset sold in millions of units. Ground-truth poses were obtained by a professional motion-capture system using small optical markers attached to hands and objects. Hand annotations are provided in the UmeTrack and MANO formats and objects are represented by 3D meshes with PBR materials obtained by an in-house scanner. Recordings from Aria also include the eye gaze signal and scene point clouds.
In BOP, we use HOT3D-Clips, which is a curated subset of HOT3D. Each clip has 150 frames (5 seconds) which are all annotated with ground-truth poses of all modeled objects and hands and which passed visual inspection. There are 4117 clips in total, 2969 clips extracted from the training split and 1148 from the test split of HOT3D. The HOT3D-Clips subset is also used in the Multiview Egocentric Hand Tracking Challenge.
Ground-truth object annotations are publicly available for training images, and also for the first frames of dynamic object onboarding sequences and all frames of static object onboarding sequences (as defined in BOP Challenge 2024).
See the HOT3D whitepaper for details and the HOT3D Toolkit for documentation of the data format and for Python utilities (loading, undistorting fisheye images, rendering using fisheye cameras, etc.).
HOT3D-Clips are distributed in a WebDataset-based format, which can be converted to the BOP-scenewise format with a script from Anas Goudas. Anas also prepared a script to convert HOT3D object models (GLB, in meters) to the PLY format expected by the BOP toolkit (PLY, in millimeters). We also provide visualizations of ground-truth annotations for clips and for onboarding sequences: static (Quest 3), dynamic (Quest 3).
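For reference, the unit conversion itself is just a scale by a factor of 1000. The snippet below is a minimal sketch of that step (not the referenced script) using the trimesh Python package; the file names are placeholders and PBR materials are not carried over:

```bash
pip install trimesh

# Minimal sketch: load a GLB model (in meters), scale to millimeters, export PLY.
python3 - <<'EOF'
import trimesh

mesh = trimesh.load("object.glb", force="mesh")  # Merge the GLB scene into one mesh.
mesh.apply_scale(1000.0)                         # Meters -> millimeters.
mesh.export("object.ply")                        # PLY as expected by the BOP toolkit.
EOF
```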
Tyree et al.: 6-DoF Pose Estimation of Household Objects for Robotic Manipulation: An Accessible Dataset and Benchmark, IROS 2022, project website, license: CC BY-SA 4.0.
28 toy grocery objects are captured in 50 scenes from 10 household/office environments. Up to 5 lighting variations are captured for each scene, including backlighting and angled direct lighting with cast shadows. Scenes are cluttered with varying levels of occlusion. The collection of toy objects is available from online retailers for about 50 USD (see "dataset_info.md" in the base archive for details).
We split the BlenderProc training images into three parts: hope_train_pbr.zip (Part 1), hope_train_pbr.z01 (Part 2), and hope_train_pbr.z02 (Part 3). Once downloaded, these files can be unzipped directly using "7z x hope_train_pbr.zip" or re-merged into a single archive using "zip -s0 hope_train_pbr.zip --out hope_train_pbr_all.zip".
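Spelled out as shell commands, the two options look roughly as follows (extracting into a folder named "hope", the dataset's base name, is an assumption of this sketch):

```bash
# Option 1: extract the split archive directly with 7-Zip
# (all three parts must be in the same folder; -ohope sets the output folder).
7z x hope_train_pbr.zip -ohope

# Option 2: re-merge the parts into a single archive, then unpack it with unzip.
zip -s0 hope_train_pbr.zip --out hope_train_pbr_all.zip
unzip hope_train_pbr_all.zip -d hope
```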
For the BOP Challenge 2024, we release an updated version of the dataset, called HOPEv2.