Method: CRT-6D

User PCastro
Publication CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers, WACV 2023
Training image modalities RGB
Test image modalities RGB

Preprint available on arXiv.

  • We train a single model for all objects in a dataset with a ResNet34 backbone.
  • Training uses a symmetry-aware loss function.
  • The model for each dataset is trained for 250k iterations with the Ranger optimizer, with cosine annealing starting at 85% of training iterations. A batch size of 32 is used. Refinement modules start being optimized at 20% of training iterations.
  • Augmentations: color jittering, blur, noise, in-plane rotations and background/foreground replacement cropping.
  • We use real training images where available, except for T-LESS.
  • Standard detections from CosyPose are used.
  • Time measurements are the average time for pose estimation + refinement of all objects in an image and exclude the detection stage. Differences between datasets are due to the different number of objects present in each image.
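
The training schedule described above can be illustrated with a minimal sketch. It is not the authors' code: the base learning rate, the constant rate before annealing, and annealing down to zero are assumptions, and the function names lr_at and refinement_active are hypothetical. Only the 250k iterations, the 85% annealing start and the 20% refinement start come from the description.

```python
import math

TOTAL_ITERS = 250_000     # from the description
ANNEAL_START_PCT = 85     # cosine annealing begins at 85% of training
REFINE_START_PCT = 20     # refinement modules are optimized from 20% onward
BASE_LR = 1e-4            # assumption: the learning rate is not stated


def lr_at(it, base_lr=BASE_LR):
    """Learning rate at iteration `it`: constant, then cosine-annealed.

    Assumes the rate is flat before annealing and decays to zero at the
    final iteration; neither detail is given in the description.
    """
    start = TOTAL_ITERS * ANNEAL_START_PCT // 100
    if it < start:
        return base_lr
    progress = (it - start) / (TOTAL_ITERS - start)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def refinement_active(it):
    """True once the refinement modules take part in optimization."""
    return it >= TOTAL_ITERS * REFINE_START_PCT // 100


if __name__ == "__main__":
    for it in (0, 50_000, 212_500, 231_250, 250_000):
        print(it, lr_at(it), refinement_active(it))
```

Under these assumptions, annealing begins at iteration 212,500 and the refinement modules join at iteration 50,000.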
Computer specifications GTX 1080Ti

Public submissions

Date Submission name Dataset
2022-10-07 06:13 - T-LESS
2022-10-07 08:30 - YCB-V
2022-10-07 08:31 - TUD-L
2022-10-12 09:30 - LM-O
2022-10-13 02:57 - HB
2022-10-13 11:43 - IC-BIN
2022-10-13 14:46 - ITODD