Submission: CRT-6D/YCB-V

Submission name
Submission time (UTC) Oct. 7, 2022, 8:30 a.m.
User PCastro
Task 6D localization of seen objects
Dataset YCB-V
Training model type None
Training image type Synthetic + real
Description
Evaluation scores
AR: 0.752
AR_MSPD: 0.774
AR_MSSD: 0.776
AR_VSD: 0.706
average_time_per_image: 0.028 s
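
In the BOP evaluation, the overall AR is the mean of the three per-metric average recalls (AR_VSD, AR_MSSD, AR_MSPD). The short sketch below checks that the numbers reported above are consistent with that definition. The names `scores` and `overall_ar` are illustrative only and are not part of the BOP toolkit.

```python
# Per-metric average recalls as reported for this submission.
scores = {"AR_MSPD": 0.774, "AR_MSSD": 0.776, "AR_VSD": 0.706}


def overall_ar(per_metric):
    """Return the mean of the per-metric average recalls."""
    return sum(per_metric.values()) / len(per_metric)


ar = overall_ar(scores)
print(f"AR = {ar:.3f}")  # AR = 0.752
```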

Method: CRT-6D

User PCastro
Publication CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers, WACV2023
Implementation https://github.com/PedroCastro/CRT-6D
Training image modalities RGB
Test image modalities RGB
Description

A preprint is available on arXiv.

  • We train a single model for all objects in a dataset with a ResNet34 backbone.
  • Training uses a symmetry-aware loss function.
  • For every dataset, the model is trained for 250k iterations with the Ranger optimizer and a batch size of 32. Cosine annealing of the learning rate starts at 85% of the training iterations, and the refinement modules start being optimized at 20% of the training iterations.
  • Augmentations: color jittering, blur, noise, in-plane rotations and background/foreground replacement cropping.
  • We use real training images where available, except for T-LESS.
  • Standard detections from CosyPose are used.
  • Time measurements refer to the average time for pose estimation + refinement of all objects in an image. Timing differences between datasets are due to the different number of objects present per image. Time measurements do not include the detection stage.
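
The training schedule described in the list above can be sketched as follows. This is a minimal illustration, not the authors' code: the base learning rate `BASE_LR`, the flat learning rate before annealing, and annealing down to zero are all assumptions, since the description only states the iteration count and the two percentages. The function names are hypothetical.

```python
import math

TOTAL_ITERS = 250_000                     # stated in the description
COSINE_START = TOTAL_ITERS * 85 // 100    # annealing begins at 85% -> 212,500
REFINE_START = TOTAL_ITERS * 20 // 100    # refinement begins at 20% -> 50,000
BASE_LR = 1e-4                            # hypothetical value, not from the source


def learning_rate(iteration):
    """Flat learning rate, then cosine annealing to zero (assumed shape)."""
    if iteration < COSINE_START:
        return BASE_LR
    progress = (iteration - COSINE_START) / (TOTAL_ITERS - COSINE_START)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))


def refinement_active(iteration):
    """True once the refinement modules are being optimized."""
    return iteration >= REFINE_START


for it in (0, 50_000, 212_500, 231_250, 250_000):
    print(it, refinement_active(it), f"{learning_rate(it):.2e}")
```

Under these assumptions, the refinement modules train for the last 200k iterations and the learning rate decays over the final 37.5k iterations.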
Computer specifications GTX 1080Ti