|Publication||CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation|
|Training image modalities||RGB|
|Test image modalities||RGB|
In the PBR-only setting, all models were trained using only the provided PBR synthetic data. For each dataset, we trained one CDPN model per object.
For detection, unlike the BOP19 version of CDPN, we used FCOS with a VoVNet-V2-57-FPN backbone. We trained one detector per dataset. Each detector was trained for 8 epochs with a batch size of 4 on a single GPU, using 4 workers and a learning rate of 1e-3. We used color augmentation similar to AAE during training.
For pose estimation, the main differences between our CDPNv2 and the BOP19-version CDPN are:
In addition to color augmentation similar to AAE, we also used truncation domain randomization to improve the system's robustness to occlusion.
Because the organizers provide high-quality PBR synthetic training data in BOP20, we adopted a deeper 34-layer ResNet as the backbone instead of the 18-layer ResNet used in the BOP19-version CDPN. The concatenation structures used in the BOP19-version CDPN were also removed. The input and output resolutions are 256×256 and 64×64, respectively.
During training, the initial learning rate was 1e-4 and the batch size was 6. We used RMSProp with alpha 0.99 and epsilon 1e-8 to optimize the network. The model was trained for 160 epochs in total, and the learning rate was divided by 10 every 50 epochs.
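The training schedule described above can be illustrated with a small, dependency-free sketch. It is not the authors' training code. The function names `step_lr` and `rmsprop_step` are hypothetical, epochs are assumed to be 0-indexed (so the rate drops at epochs 50, 100, and 150), and the update rule is the common RMSProp formulation without momentum, which may differ in detail from the implementation actually used.

```python
import math


def step_lr(epoch, base_lr=1e-4, step=50, gamma=0.1):
    """Learning rate for a 0-indexed epoch under step decay.

    Assumption: the rate is divided by 10 after every `step` completed epochs.
    """
    return base_lr * gamma ** (epoch // step)


def rmsprop_step(param, grad, sq_avg, lr, alpha=0.99, eps=1e-8):
    """One RMSProp update on a scalar parameter (no momentum, not centered).

    Returns the updated parameter and the updated running average
    of squared gradients.
    """
    sq_avg = alpha * sq_avg + (1.0 - alpha) * grad * grad
    param = param - lr * grad / (math.sqrt(sq_avg) + eps)
    return param, sq_avg


if __name__ == "__main__":
    # Print the learning rate at the boundaries of the 160-epoch schedule.
    for epoch in (0, 49, 50, 100, 150, 159):
        print(epoch, step_lr(epoch))
```

Under these assumptions, the last 10 of the 160 epochs run at a learning rate three orders of magnitude below the initial value.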
|Computer specifications||CPU: Intel i7-7700; GPU: GTX 1070; Memory: 16 GB|
|2020-08-19 04:45||Zhigang-CDPNv2 (MODE 2, FCOS)||T-LESS|
|2020-08-19 06:09||Zhigang-CDPNv2 (MODE 2, FCOS)||TUD-L|
|2020-08-19 06:45||Zhigang-CDPNv2 (MODE 2, FCOS)||YCB-V|
|2020-08-19 12:15||Zhigang-CDPNv2 (MODE 2, FCOS)||LM-O|
|2020-08-19 12:18||Zhigang-CDPNv2 (MODE 2, FCOS)||HB|
|2020-08-19 12:19||Zhigang-CDPNv2 (MODE 2, FCOS)||IC-BIN|
|2020-08-19 12:22||Zhigang-CDPNv2 (MODE 2, FCOS)||ITODD|