Method: Pix2Pose-BOP-ICCV19

User kirumang
Publication Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation, ICCV 2019
Training image modalities RGB
Test image modalities RGB

Results were obtained by training the entire pipeline with the same fixed parameters for all datasets. Because some settings differ from the paper, such as fewer training iterations and the choice of real or synthetic training images, the results may differ from those reported in the paper.

As described in the paper, a detection method provides 2D bounding boxes for all objects in a dataset. Unlike in the paper, we used Mask R-CNN for 2D detection, which also provides a segmentation mask for each detected object. This mask is compared with the valid mask predicted by the Pix2Pose network to calculate the score. As in the paper, a separate Pix2Pose network is trained for each object in the dataset (e.g., for T-LESS, 1 Mask R-CNN for 2D detection + 30 Pix2Pose networks for 6D pose estimation). No further refinement is applied.
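The mask comparison above can be illustrated with a minimal sketch. The exact scoring formula is not stated here, so intersection-over-union (IoU) is used purely as an assumption, and the function name mask_agreement_score and its arguments are hypothetical, not taken from the Pix2Pose code.

```python
import numpy as np


def mask_agreement_score(det_mask, valid_mask):
    """Return the IoU between two binary masks of the same shape.

    det_mask:   segmentation mask from the 2D detector (hypothetical input).
    valid_mask: valid-pixel mask predicted by the pose network (hypothetical input).
    IoU is an assumed stand-in for the actual score, which is not specified.
    """
    det = np.asarray(det_mask, dtype=bool)
    val = np.asarray(valid_mask, dtype=bool)
    if det.shape != val.shape:
        raise ValueError("masks must have the same shape")

    union = np.logical_or(det, val).sum()
    if union == 0:
        # Both masks are empty, so there is nothing to compare.
        return 0.0
    intersection = np.logical_and(det, val).sum()
    return float(intersection) / float(union)
```

Under this assumption, a detection whose mask closely overlaps the predicted valid mask scores near 1, while a mismatch between the two scores near 0.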

For training, we used real training images wherever they are provided. For datasets without real training images, we rendered images from the uniformly sampled viewpoints defined in the T-LESS dataset (we used the pose data in "train_render_reconst" of the T-LESS dataset).

Computer specifications i7-9700K / GTX 1070-Ti / RAM 32G

Public submissions

Date Submission name Dataset
2019-10-14 21:43 Basic HB
2019-10-14 21:43 Basic ITODD
2019-10-14 21:44 Basic T-LESS
2019-10-14 21:44 Basic YCB-V
2019-10-14 21:44 Basic IC-BIN
2019-10-14 21:43 Basic LM-O
2019-10-14 21:43 Basic TUD-L
2019-10-21 08:04 Basic RU-APC