Submission: RCVPose 3D_SingleModel_VIVO_PBR/IC-BIN

Submission name
Submission time (UTC) Oct. 16, 2022, 9:35 a.m.
User aaronwool
Task Pose estimation (BOP 2019-2022)
Dataset IC-BIN
Training model type Default
Training image type Synthetic (only PBR images provided for BOP Challenge 2020 were used)

Method: RCVPose 3D_SingleModel_VIVO_PBR

User aaronwool
Publication Yangzheng Wu, Alireza Javaheri, Mohsen Zand and Michael Greenspan: Keypoint Cascade Voting for Point Cloud Based 6DoF Pose Estimation, 3DV 2022.
Training image modalities RGB-D
Test image modalities RGB-D

A single model is trained per dataset for both semantic segmentation and pose estimation. Only the provided PBR images are used for training. One model estimates the poses of all objects (MIMO) in the scenes of a dataset. The hyperparameters are the same across all core datasets: a batch size of 8, an initial learning rate of 1e-4, and an SGD optimizer. The implementation largely follows the paper, except that the networks are extended to MIMO, i.e. a single model estimates the poses of all objects in the scene simultaneously. Three keypoints are used for each object.
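The settings above can be written down as a small configuration sketch. This is not the authors' code: the names `TrainConfig`, `total_keypoints`, and `configs_for` are hypothetical, and only the values (batch size 8, initial learning rate 1e-4, SGD, three keypoints per object, PBR training images, RGB-D input) come from the description. The object count passed to `total_keypoints` is a placeholder, not a property of any particular dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the settings stated in the description."""
    batch_size: int = 8            # stated: batch size of 8
    initial_lr: float = 1e-4       # stated: initial lr = 1e-4
    optimizer: str = "SGD"         # stated: SGD optimizer
    keypoints_per_object: int = 3  # stated: three keypoints per object
    train_images: str = "PBR"      # stated: only provided PBR images
    modalities: str = "RGB-D"      # stated: RGB-D for training and testing


def total_keypoints(num_objects: int, cfg: TrainConfig = TrainConfig()) -> int:
    """Number of keypoints one MIMO model covers for a dataset.

    Assumes each object contributes the same fixed number of keypoints.
    """
    if num_objects < 1:
        raise ValueError("num_objects must be at least 1")
    return num_objects * cfg.keypoints_per_object


def configs_for(datasets):
    """One model per dataset, all sharing the same hyperparameters."""
    shared = TrainConfig()
    return {name: shared for name in datasets}


if __name__ == "__main__":
    configs = configs_for(["IC-BIN"])
    print(configs["IC-BIN"])
    # Placeholder object count, for illustration only.
    print(total_keypoints(num_objects=4))
```

Freezing the dataclass reflects the statement that the hyperparameters are identical across datasets: one shared, immutable instance is reused rather than tuned per dataset.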

Computer specifications Validation - CPU: Intel i7-11700F, GPU: RTX 3090; Training - CPU: Intel(R) Xeon(R) Gold 5218, GPU: 8 x RTX 6000