BOP: Benchmark for 6D Object Pose Estimation

Submission: PointVoteNet2/IC-BIN/ICBIN-PVN2-E6

Submission name

ICBIN-PVN2-E6

Submission time (UTC)

Aug. 19, 2020, 10:58 a.m.

User

frederikhagelskjaer

Task

Model-based 6D localization of seen objects

Dataset

IC-BIN

Training model type

Default

Training image type

Synthetic (only PBR images provided for BOP Challenge 2020 were used)

Description

Because multiple objects are present, non-maximum suppression is applied to the pose estimates, which are scored with a combined depth and contour-gradient score. The suppression radius is half the object radius.
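The suppression step described above can be sketched as a greedy non-maximum suppression over pose candidates. This is a minimal illustration, not the submission's actual code: the function name `pose_nms`, the use of the distance between pose translations as the suppression criterion, and the convention that a higher score is better are all assumptions. The combined depth and contour-gradient score is taken as a precomputed input.

```python
import numpy as np

def pose_nms(translations, scores, object_radius):
    """Greedy NMS over pose candidates (illustrative sketch).

    translations: (N, 3) array of candidate pose translations.
    scores: (N,) array of pose scores; higher is assumed to be better.
    object_radius: radius of the object, in the same unit as translations.
    Returns the indices of the kept candidates, best score first.
    """
    translations = np.asarray(translations, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # The description states the suppression radius is half the object radius.
    suppression_radius = 0.5 * object_radius

    kept = []
    # Visit candidates from best to worst score.
    for idx in np.argsort(-scores):
        # Assumption: two poses conflict when their translations are
        # closer than the suppression radius.
        is_far_from_kept = all(
            np.linalg.norm(translations[idx] - translations[k]) >= suppression_radius
            for k in kept
        )
        if is_far_from_kept:
            kept.append(int(idx))
    return kept

candidates = [[0.00, 0.0, 0.0], [0.01, 0.0, 0.0], [0.10, 0.0, 0.0]]
candidate_scores = [0.9, 0.8, 0.7]
kept_indices = pose_nms(candidates, candidate_scores, object_radius=0.05)
```

In this toy example the second candidate lies 0.01 from the best one, inside the 0.025 suppression radius, so it is discarded, while the third candidate is far enough away to be kept as a separate object.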

Evaluation scores

AR:	0.264
AR_MSPD:	0.194
AR_MSSD:	0.217
AR_VSD:	0.381
average_time_per_image:	-1.000
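In the BOP Challenge 2020 methodology, the overall AR score for a dataset is the mean of the three per-metric average recalls (AR_VSD, AR_MSSD, AR_MSPD). The short check below reproduces the reported AR from the values listed above; the variable names are illustrative only.

```python
# Per-metric average recalls as reported for this submission.
ar_vsd = 0.381
ar_mssd = 0.217
ar_mspd = 0.194

# Overall AR is the mean of the three per-metric recalls.
ar = (ar_vsd + ar_mssd + ar_mspd) / 3
ar_rounded = round(ar, 3)
```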

Method: PointVoteNet2

User	frederikhagelskjaer
Publication	PointVoteNet: Accurate Object Detection and 6 DoF Pose Estimation in Point Clouds
Implementation
Training image modalities	RGB-D
Test image modalities	RGB-D
Description	The method is a modification of the method described in the paper. A new addition is that background segmentation and feature segmentation are split into two separate predictions. Two DGCNN kNN layers are also used instead of the original PointNet network. As in the original paper, for each point not predicted as background, the highest-scoring feature prediction is used; in addition, all feature predictions scoring within 95% of the highest score are allowed, to accommodate symmetric objects. As in the publication, detection and pose estimation are performed in 3D, followed by a final pose verification using 2D images. For training, only the provided synthetic BlenderProc4BOP images were used, and approximately one-fourth of the scenes were used. The data is converted to point clouds, and spheres are extracted as in the publication. No augmentation is performed other than random noise added during training. Training takes two hours per object. Because of time constraints, not all datasets were trained adequately, which resulted in subpar performance.
Computer specifications	The networks are trained on a Tesla P100-PCIE (Google Colab) and testing is performed on a GeForce RTX 2080 with an Intel(R) Core(TM) i9-9820X CPU @ 3.30GHz.
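The feature selection rule in the method description (keep the highest-scoring feature prediction for each non-background point, plus any prediction close to that score) can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function name `select_feature_candidates`, the background threshold of 0.5, the non-negative (for example softmax) scores, and the reading of "within 95%" as "at least 0.95 times the highest score" are all hypothetical.

```python
import numpy as np

def select_feature_candidates(background_prob, feature_scores,
                              bg_threshold=0.5, ratio=0.95):
    """Return a boolean (N, F) mask of allowed feature predictions per point.

    background_prob: (N,) predicted probability that a point is background.
    feature_scores: (N, F) non-negative per-point scores for F feature classes.
    bg_threshold: hypothetical cutoff above which a point counts as background.
    ratio: fraction of the best score a prediction must reach to be kept.
    """
    background_prob = np.asarray(background_prob, dtype=float)
    feature_scores = np.asarray(feature_scores, dtype=float)

    # Points predicted as background contribute no feature candidates.
    foreground = background_prob < bg_threshold

    # Keep every feature whose score is at least `ratio` times the best score,
    # so near-ties (as produced by symmetric objects) are all retained.
    best = feature_scores.max(axis=1, keepdims=True)
    allowed = feature_scores >= ratio * best

    return allowed & foreground[:, None]

mask = select_feature_candidates(
    background_prob=[0.1, 0.9],
    feature_scores=[[0.50, 0.48, 0.02], [0.60, 0.30, 0.10]],
)
```

For the first point, the two near-tied features (0.50 and 0.48) are both kept, which is the behavior that lets a symmetric object vote for more than one equivalent feature; the second point is treated as background and yields no candidates.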