|Submission time (UTC)||Aug. 19, 2020, 7:25 p.m.|
|Task||6D localization of seen objects|
|Training model type||Default|
|Training image type||Synthetic + real|
|Publication||Li et al.: CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation, ICCV 2019|
|Training image modalities||RGB|
|Test image modalities||RGB-D|
In this setting, the models are trained in the same manner as in the RGB track of the BOP19 challenge and tested with depth/ICP refinement. Concretely, for the LMO, HB, ICBIN, and ITODD datasets, we use only the provided synthetic training data (PBR), while for YCBV, TUDL, and TLESS we use both the provided real data and the synthetic data (PBR). For each dataset, we train one CDPN model per object.
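The per-dataset split of training sources described above can be sketched as a small lookup table (an illustrative sketch, not the authors' code; the names `TRAIN_DATA` and `training_sources` are ours):

```python
# Illustrative mapping (not the authors' code): which provided training
# images each per-dataset CDPN model is trained on, per the text above.
TRAIN_DATA = {
    # synthetic (PBR) only
    "lmo":   ["pbr"],
    "hb":    ["pbr"],
    "icbin": ["pbr"],
    "itodd": ["pbr"],
    # synthetic (PBR) + provided real images
    "ycbv":  ["pbr", "real"],
    "tudl":  ["pbr", "real"],
    "tless": ["pbr", "real"],
}

def training_sources(dataset: str):
    """Return the training-image sources used for a given BOP dataset."""
    return TRAIN_DATA[dataset.lower()]
```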
For detection, unlike CDPN in BOP19, we use FCOS with a VoVNet-V2-57-FPN backbone and train one detector per dataset. Each detector is trained for 8 epochs with a batch size of 4 on a single GPU, using 4 data-loading workers and a learning rate of 1e-3. During training, we apply color augmentation similar to AAE.
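The detector settings stated above can be collected into a single config object (a minimal sketch; the class and field names are ours, not from the authors' codebase):

```python
from dataclasses import dataclass

# Hypothetical config (field names are ours) capturing the detector
# setup described above: FCOS with a VoVNet-V2-57-FPN backbone, one
# detector per dataset, trained for 8 epochs with batch size 4 on a
# single GPU, 4 workers, learning rate 1e-3, AAE-style color augmentation.
@dataclass(frozen=True)
class DetectorConfig:
    architecture: str = "FCOS"
    backbone: str = "VoVNet-V2-57-FPN"
    epochs: int = 8
    batch_size: int = 4
    num_gpus: int = 1
    num_workers: int = 4
    learning_rate: float = 1e-3
    color_augmentation: str = "AAE-style"

cfg = DetectorConfig()
```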
For pose estimation, the main differences between our CDPNv2 and the BOP19 version of CDPN include:
|Computer specifications||Intel i7-7700 CPU; GTX 1070 GPU; 16 GB RAM|