BOP: Benchmark for 6D Object Pose Estimation

Submission: CRT-6D/YCB-V


Submission name

Submission time (UTC)

Oct. 7, 2022, 8:30 a.m.

User

PCastro

Task

Model-based 6D localization of seen objects

Dataset

YCB-V

Training model type

None

Training image type

Synthetic + real

Description

Evaluation scores

AR:	0.752
AR_MSPD:	0.774
AR_MSSD:	0.776
AR_VSD:	0.706
average_time_per_image:	0.028 s
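
The overall AR score in the BOP benchmark is the mean of the three per-metric average recalls (AR_VSD, AR_MSSD, AR_MSPD). The short sketch below reproduces the reported AR from the three values listed above; the function and variable names are illustrative and not part of the BOP toolkit.

```python
# Per-metric average recalls copied from the evaluation scores above.
scores = {"AR_VSD": 0.706, "AR_MSSD": 0.776, "AR_MSPD": 0.774}


def average_recall(per_metric):
    """Return the mean of the per-metric average recalls."""
    return sum(per_metric.values()) / len(per_metric)


ar = average_recall(scores)
print(f"AR: {ar:.3f}")
```

Rounded to three decimals, the result matches the AR value reported for this submission.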

Method: CRT-6D

User	PCastro
Publication	CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers, WACV 2023
Implementation	https://github.com/PedroCastro/CRT-6D
Training image modalities	RGB
Test image modalities	RGB
Description	Preprint available on arXiv. We train a single model for all objects in a dataset, using a ResNet34 backbone and a symmetry-aware loss function. All datasets are trained for 250k iterations with the Ranger optimizer and a batch size of 32; cosine annealing starts at 85% of training iterations, and the refinement modules start being optimized at 20% of training iterations. Augmentations: color jittering, blur, noise, in-plane rotations and background/foreground replacement cropping. We use real training images where available, except for T-LESS. Standard detections from CosyPose are used. Time measurements refer to the average time for pose estimation and refinement of all objects in an image and do not include the detection stage. Differences in time between datasets are due to the different number of objects present in an image.
Computer specifications	GTX 1080Ti
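
The training schedule in the description (250k iterations, cosine annealing from 85% of training, refinement modules optimized from 20% of training) can be sketched as below. This is a minimal illustration and not the authors' code. The base learning rate, the flat learning rate before annealing, and the decay to zero are assumptions; the description does not state them. All names here are hypothetical.

```python
import math

TOTAL_ITERS = 250_000                    # from the description
ANNEAL_START = TOTAL_ITERS * 85 // 100   # annealing begins at 85% of training
REFINE_START = TOTAL_ITERS * 20 // 100   # refinement modules join at 20%
BASE_LR = 1e-4                           # assumption: not given in the description


def learning_rate(it, base_lr=BASE_LR):
    """Flat learning rate, then cosine annealing to zero (assumed shape)."""
    if it < ANNEAL_START:
        return base_lr
    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = min((it - ANNEAL_START) / (TOTAL_ITERS - ANNEAL_START), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def refinement_enabled(it):
    """True once the refinement modules start being optimized."""
    return it >= REFINE_START


for it in (0, REFINE_START, ANNEAL_START, TOTAL_ITERS):
    print(it, learning_rate(it), refinement_enabled(it))
```

With these settings the refinement modules are trained from iteration 50,000 onward and the learning rate begins to decay at iteration 212,500.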