|Submission time (UTC)||Aug. 17, 2020, 8:04 p.m.|
|Task||6D localization of seen objects|
|Training model type||Default|
|Training image type||Synthetic (only PBR images provided for BOP Challenge 2020 were used)|
|Publication||Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation, ICCV 2019|
|Training image modalities||RGB|
|Test image modalities||RGB-D|
Poses are first estimated by Pix2Pose from RGB images only, and each initial prediction is then refined with ICP. The results were obtained with the following modifications to the original implementation from the paper.
1) Replaced the encoder with the first three blocks of ResNet-50, initialized with weights pre-trained on ImageNet.
2) Increased the threshold for inlier pixels in the PnP-RANSAC step (3 -> 5).
3) Detection results from Mask R-CNN are reused when the pose predictions for the detections are not successful. In this case, Pix2Pose is run again for the other objects that do not yet have good results.
4) Parameters for the ICP refinement are optimized.
5) Adjusted inlier and outlier thresholds for Pix2Pose (inlier: 0.15 -> 0.2, outlier: [0.15,0.25,0.35] -> [0.2,0.3,0.35]).
6) Fixed a minor bug that caused poor detection results on the T-LESS dataset (different image resolutions were used during training and inference).
7) Increased the number of RPN proposals and the NMS threshold in Mask R-CNN (1000/0.7 -> 2000/0.9), which produces more detection proposals.
8) The score of each hypothesis is now computed with a new formula, max(0, 0.2 - [depth_difference per pixel]) / 0.2, instead of counting the number of pixels whose depth difference is less than 0.2.
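To illustrate the difference described in item 8, here is a minimal sketch in Python. It is not the authors' implementation. The function names, the use of the absolute depth difference, and the summation of per-pixel scores over all pixels are assumptions made for this example. The description above only gives the per-pixel formula.

```python
def hard_inlier_count(rendered, observed, tau=0.2):
    """Previous scoring (as described): count pixels whose depth
    difference is below tau. Assumes absolute differences."""
    return sum(1 for d_r, d_o in zip(rendered, observed)
               if abs(d_r - d_o) < tau)


def soft_depth_score(rendered, observed, tau=0.2):
    """New scoring (as described): per-pixel max(0, tau - diff) / tau.
    Summing over pixels is an assumption of this sketch."""
    total = 0.0
    for d_r, d_o in zip(rendered, observed):
        diff = abs(d_r - d_o)
        total += max(0.0, tau - diff) / tau
    return total


# Hypothetical depth values for four pixels (rendered vs. observed).
rendered = [1.0, 1.0, 1.0, 1.0]
observed = [1.0, 1.1, 1.19, 1.5]

print(hard_inlier_count(rendered, observed))
print(soft_depth_score(rendered, observed))
```

In this toy example the hard count treats the first three pixels identically, while the soft score gives a pixel with a difference close to 0.2 almost no weight, so hypotheses that align more tightly with the observed depth rank higher.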
All updates will be shared in our public repository (check out the bop2020 branch after the deadline).
|Computer specifications||CPU: i7-9700K, GPU: Titan V, RAM: 32GB|