| Submission name | icpv4 |
| --- | --- |
| Submission time (UTC) | Aug. 19, 2020, 11:21 p.m. |
| User | wangg16 |
| Task | Model-based 6D localization of seen objects |
| Dataset | IC-BIN |
| Training model type | Default |
| Training image type | Synthetic (only PBR images provided for BOP Challenge 2020 were used) |
| Description | |
| Evaluation scores | |
| User | wangg16 |
| --- | --- |
| Publication | Li et al.: CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation, ICCV 2019 |
| Implementation | https://github.com/LZGMatrix/BOP19_CDPN_2019ICCV/tree/bop2020 |
| Training image modalities | RGB |
| Test image modalities | RGB-D |
Description: In this setting, the models are trained in the same manner as for the RGB track of the BOP 2019 challenge and tested with depth-based ICP refinement (a minimal ICP sketch is included at the end of this page). Concretely, for the LM-O, HB, IC-BIN, and ITODD datasets, we use only the provided synthetic (PBR) training data, while for YCB-V, TUD-L, and T-LESS we use both the provided real data and the synthetic (PBR) data. For each dataset, we trained one CDPN model per object.

For detection, unlike the CDPN entry in BOP19, we used FCOS with a VoVNet-V2-57-FPN backbone [1] and trained one detector per dataset. Each detector was trained for 8 epochs with a batch size of 4 on a single GPU, 4 data-loading workers, and a learning rate of 1e-3 (a configuration sketch follows this description). During training, we used color augmentation similar to AAE [2] (see the sketch after the references). For pose estimation, our CDPNv2 differs from the BOP19-version CDPN in several respects.
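As a rough illustration of the detector setup described above, here is a minimal sketch assuming AdelaiDet's detectron2-style config API; the config-file path, dataset name, and image count are placeholders, not values taken from the submission.

```python
# Sketch only, not the authors' training script: configuring an AdelaiDet
# FCOS detector with the hyperparameters stated above.
from adet.config import get_cfg  # AdelaiDet's extension of the detectron2 config

cfg = get_cfg()
# Placeholder path: one of the VoVNet FCOS configs from reference [1].
cfg.merge_from_file("configs/FCOS-Detection/vovnet/fcos_V_57_FPN_1x.yaml")
cfg.DATASETS.TRAIN = ("icbin_pbr_train",)  # hypothetical dataset name
cfg.SOLVER.IMS_PER_BATCH = 4               # batch size of 4 on a single GPU
cfg.SOLVER.BASE_LR = 1e-3                  # learning rate of 1e-3
cfg.DATALOADER.NUM_WORKERS = 4             # 4 data-loading workers

# 8 epochs expressed as iterations (detectron2 counts iterations, not epochs).
num_train_images = 50000  # placeholder for the size of the training set
cfg.SOLVER.MAX_ITER = num_train_images * 8 // cfg.SOLVER.IMS_PER_BATCH
```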
[1] https://github.com/aim-uofa/AdelaiDet/tree/master/configs/FCOS-Detection/vovnet
[2] https://github.com/DLR-RM/AugmentedAutoencoder
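Reference [2] points to the AAE code, which builds its color augmentation with imgaug. The following is a minimal sketch of comparable augmentation; the operations and parameter ranges are illustrative assumptions, not the exact values used by the submission.

```python
# Sketch only: AAE-style color augmentation with imgaug; ranges are
# illustrative, not the submission's actual settings.
import imgaug.augmenters as iaa

color_aug = iaa.Sequential([
    iaa.Sometimes(0.5, iaa.GaussianBlur(sigma=(0.0, 1.2))),  # occasional blur
    iaa.Add((-25, 25)),              # random additive brightness shift
    iaa.Multiply((0.6, 1.4)),        # random multiplicative brightness change
    iaa.LinearContrast((0.5, 2.0)),  # random contrast change
])

# Usage: augmented = color_aug(image=rgb)  # rgb is an HxWx3 uint8 array
```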
Computer specifications: Intel i7-7700 CPU; GTX 1070 GPU; 16 GB RAM
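Consistent with the submission name (icpv4) and the RGB-D test modality, the estimated poses are refined against the depth image with ICP. The following is a minimal sketch of such a refinement step, assuming Open3D and metric depth; the function, its thresholds, and the use of point-to-plane ICP are illustrative assumptions, not the submission's actual implementation.

```python
# Sketch only: depth-based ICP pose refinement via Open3D's point-to-plane
# ICP; the submission uses its own refinement code.
import numpy as np
import open3d as o3d

def refine_pose_icp(model_pcd, scene_pcd, T_init, max_corr_dist=0.01):
    """Refine a 4x4 model-to-camera pose T_init against the point cloud
    back-projected from the test depth image (distances in meters)."""
    # Point-to-plane ICP needs normals on the target (scene) cloud.
    scene_pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.02, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        model_pcd, scene_pcd, max_corr_dist, T_init,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return np.asarray(result.transformation)
```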