BOP: Benchmark for 6D Object Pose Estimation

Submission: RCVPose/LM/RCVPose(SISO)

Submission name

RCVPose(SISO)

Submission time (UTC)

March 3, 2022, 4:21 p.m.

User

aaronwool

Task

Model-based 6D localization of seen objects

Dataset

LM

Training model type

Default

Training image type

Synthetic (provided)

Description

The result is based on a SISO model (not VIVO). Paper accepted to ECCV 2022.

Evaluation scores

AR:	0.799
AR_MSPD:	0.832
AR_MSSD:	0.826
AR_VSD:	0.740
average_time_per_image:	-1.000
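
In the BOP evaluation methodology, the overall AR score is the mean of the three per-error-function average recalls (AR_VSD, AR_MSSD, AR_MSPD). The short check below applies that definition to the scores reported above; the variable names are illustrative only.

```python
# Scores copied from the "Evaluation scores" block of this submission.
ar_vsd, ar_mssd, ar_mspd = 0.740, 0.826, 0.832

# BOP defines the overall Average Recall as the mean of the three.
ar = (ar_vsd + ar_mssd + ar_mspd) / 3
print(f"AR = {ar:.3f}")  # AR = 0.799
```

The result matches the reported AR of 0.799.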

Method: RCVPose

User	aaronwool
Publication	ECCV 2022 Oral
Implementation	https://github.com/aaronWool/rcvpose.git
Training image modalities	RGB-D
Test image modalities	RGB-D
Description	We propose a novel keypoint voting scheme based on intersecting spheres that is more accurate than existing schemes and allows for fewer, more dispersed keypoints. The scheme is based on the distance between points, which, as a 1D quantity, can be regressed more accurately than the 2D and 3D vector and offset quantities regressed in previous work, yielding more accurate keypoint localization. The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data, which is particularly effective at handling occlusions. A CNN is trained to estimate the distance between the 3D point corresponding to the depth mode of each RGB pixel and each of a set of 3 dispersed keypoints defined in the object frame. At inference, a sphere is generated centered at each 3D point, with radius equal to this estimated distance. The surfaces of these spheres vote to increment a 3D accumulator space, the peaks of which indicate keypoint locations. The proposed radial voting scheme is more accurate than previous vector or offset schemes, and is robust to dispersed keypoints. Experiments demonstrate RCVPose to be highly accurate and competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and YCB-Video (97.2%) datasets, notably scoring +4.9% higher (71.1%) than previous methods on the challenging Occlusion LINEMOD dataset, and on average outperforming all other published results from the BOP benchmark for these 3 datasets.
Computer specifications	CPU: Intel i7-11700F, GPU: RTX3090
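
The radial voting step in the description above (one sphere per 3D point, sphere surfaces incrementing a 3D accumulator, peak taken as the keypoint) can be illustrated with a minimal NumPy sketch. This is not the code from the RCVPose repository: the function name `radial_vote`, the dense voxel grid, the 2 cm voxel size, and the use of exact distances in place of CNN predictions are all assumptions made for illustration.

```python
import numpy as np

def radial_vote(points, radii, grid_min, grid_max, voxel=0.02):
    """Vote sphere surfaces into a voxel accumulator; return (peak, accumulator).

    points: (N, 3) scene points; radii: (N,) estimated point-to-keypoint distances.
    grid_min, grid_max, voxel: hypothetical accumulator bounds and resolution.
    """
    # Voxel-centre coordinates along each axis.
    axes = [np.arange(lo, hi + voxel / 2, voxel) for lo, hi in zip(grid_min, grid_max)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    centres = np.stack([gx, gy, gz], axis=-1)          # (X, Y, Z, 3)
    acc = np.zeros(gx.shape, dtype=np.int32)

    for p, r in zip(points, radii):
        # Distance from this point to every voxel centre.
        d = np.linalg.norm(centres - p, axis=-1)
        # A voxel gets one vote if the sphere surface passes through it.
        acc += np.abs(d - r) <= voxel / 2

    peak_index = np.unravel_index(np.argmax(acc), acc.shape)
    return centres[peak_index], acc

# Synthetic demo: exact distances stand in for the network's radial estimates.
rng = np.random.default_rng(0)
keypoint = np.array([0.04, -0.06, 0.10])
points = rng.uniform(-0.3, 0.3, size=(200, 3))
radii = np.linalg.norm(points - keypoint, axis=1)

peak, acc = radial_vote(points, radii, (-0.2, -0.2, -0.2), (0.2, 0.2, 0.2))
print("estimated keypoint:", peak)
```

Because every sphere passes through the true keypoint, its voxel collects a vote from every point, while other voxels are crossed by only a subset of the spheres. A dense grid with a Python loop is chosen here for readability, not speed.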