BOP: Benchmark for 6D Object Pose Estimation

Method: CRT-6D

User	PCastro
Publication	CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers, WACV 2023
Implementation	https://github.com/PedroCastro/CRT-6D
Views	single
Test image modalities	RGB
Description	Preprint available on arXiv. We train a single model for all objects in a dataset, using a ResNet34 backbone and a symmetry-aware loss function. All datasets are trained for 250k iterations with the Ranger optimizer and a batch size of 32. Cosine annealing starts at 85% of training iterations, and the refinement modules start being optimized at 20% of training iterations. Augmentations: color jittering, blur, noise, in-plane rotations, and background/foreground replacement cropping. Real training images are used where available, except for T-LESS. Standard detections from CosyPose are used. Time measurements are the average time for pose estimation plus refinement of all objects in an image and do not include the detection stage. Differences between datasets are due to the different number of objects present in the image.
Computer specifications	GTX 1080Ti
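
The training schedule in the description can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and it assumes the learning rate is held constant until annealing starts and then follows a half-cosine down to zero, which the description does not state. Only the iteration count and the 85% and 20% fractions come from the description.

```python
import math

# Values taken from the description above.
TOTAL_ITERS = 250_000
ANNEAL_START_FRAC = 0.85   # cosine annealing starts here
REFINE_START_FRAC = 0.20   # refinement modules start being optimized here


def anneal_start_iter(total=TOTAL_ITERS):
    """Iteration at which cosine annealing begins."""
    return int(round(total * ANNEAL_START_FRAC))


def refine_start_iter(total=TOTAL_ITERS):
    """Iteration at which refinement modules begin to be optimized."""
    return int(round(total * REFINE_START_FRAC))


def lr_factor(it, total=TOTAL_ITERS):
    """Multiplier applied to the base learning rate at iteration `it`.

    Assumption: flat at 1.0 before annealing starts, then a half-cosine
    decay to 0.0 at the final iteration.
    """
    start = anneal_start_iter(total)
    if it < start:
        return 1.0
    progress = min(1.0, (it - start) / (total - start))
    return 0.5 * (1.0 + math.cos(math.pi * progress))


def refinement_active(it, total=TOTAL_ITERS):
    """True once the refinement modules are included in optimization."""
    return it >= refine_start_iter(total)


if __name__ == "__main__":
    print("annealing starts at iteration", anneal_start_iter())
    print("refinement starts at iteration", refine_start_iter())
    for it in (0, 50_000, 212_500, 231_250, 250_000):
        print(it, round(lr_factor(it), 4), refinement_active(it))
```

With 250k iterations, these fractions place the start of refinement training at iteration 50,000 and the start of annealing at iteration 212,500.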