Method: FreeZeV2 (SAM6D)

User andreacaraffa
Publication Caraffa et al.: FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models, ECCV 2024
Implementation
Views single
Test image modalities RGB-D
Description

Submitted to: BOP Challenge 2024

Training data: We do not train models on 6D pose estimation data. We use two frozen models pre-trained on web-scale 2D images and 3D point clouds, respectively.

Onboarding data: We render 162 templates for each object. We compute visual features from the rendered images, back-project them into 3D, and aggregate them. We compute geometric features directly from the 3D models and estimate geometric symmetries using the Chamfer distance.
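The two onboarding operations named above can be illustrated with a minimal sketch. It is not the authors' code. It assumes a pinhole camera with intrinsics K, a dense per-pixel feature map, and a brute-force Chamfer distance. The function names, the tolerance, and the toy data are all hypothetical, and the aggregation of features across the templates is left out.

```python
import numpy as np

def back_project(depth, K, feats):
    """Lift pixels with valid depth to 3D camera coordinates, keeping their features."""
    v, u = np.nonzero(depth > 0)          # rows (v) and columns (u) of valid pixels
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]       # pinhole model: x = (u - cx) * z / fx
    y = (v - K[1, 2]) * z / K[1, 1]       # pinhole model: y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1), feats[v, u]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

def rotation_about_z(angle_rad):
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def is_symmetry(points, rotation, tol=1e-6):
    """Assumed criterion: the rotated model overlaps itself up to a Chamfer tolerance."""
    centered = points - points.mean(axis=0)
    return chamfer_distance(centered, centered @ rotation.T) < tol

# Toy back-projection: a 2x2 depth map with two valid pixels and 2-D features.
depth = np.array([[0.0, 4.0], [2.0, 0.0]])
K = np.array([[2.0, 0.0, 0.5], [0.0, 2.0, 0.5], [0.0, 0.0, 1.0]])
feats = np.arange(8, dtype=float).reshape(2, 2, 2)
points, point_feats = back_project(depth, K, feats)

# Toy symmetry test: the corners of a square-based box are 4-fold symmetric about z.
box = np.array([[x, y, z] for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-0.5, 0.5)])
quarter_turn_ok = is_symmetry(box, rotation_about_z(np.pi / 2))
eighth_turn_ok = is_symmetry(box, rotation_about_z(np.pi / 4))
```

In this toy case the quarter turn maps the box onto itself and the eighth turn does not. A real model would need a sampled surface, a set of candidate rotations, and a tolerance scaled to the object size, none of which are specified in the description.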

Used 3D models: CAD models for T-LESS, default models for the other datasets.

Notes: We do not use task-specific training. We leverage two pre-trained geometric and vision foundation models, i.e., GeDi [A] and DINOv2 [B], to generate discriminative point-level 3D descriptors. We estimate each object's 6D pose via RANSAC-based 3D registration followed by ICP refinement.
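As an illustration of the registration stage, the following is a minimal sketch of RANSAC over putative correspondences followed by point-to-point ICP. It is not the submission's implementation. It assumes that correspondences have already been obtained by matching descriptors, fits each hypothesis with the Kabsch algorithm on three sampled matches, and uses brute-force nearest neighbours in ICP. All function names, thresholds, and the synthetic data are hypothetical.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)               # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T           # guard against reflections
    return R, dst_c - R @ src_c

def ransac_registration(src, dst, matches, iters=200, inlier_thresh=0.05, seed=0):
    """Keep the 3-point hypothesis that explains the most correspondences."""
    rng = np.random.default_rng(seed)
    best_R, best_t, best_count = np.eye(3), np.zeros(3), -1
    for _ in range(iters):
        sample = matches[rng.choice(len(matches), size=3, replace=False)]
        R, t = kabsch(src[sample[:, 0]], dst[sample[:, 1]])
        residuals = np.linalg.norm(src[matches[:, 0]] @ R.T + t - dst[matches[:, 1]], axis=1)
        count = int((residuals < inlier_thresh).sum())
        if count > best_count:
            best_R, best_t, best_count = R, t, count
    return best_R, best_t

def icp(src, dst, R, t, iters=20):
    """Point-to-point ICP: alternate nearest-neighbour matching and Kabsch fitting."""
    for _ in range(iters):
        moved = src @ R.T + t
        d = np.linalg.norm(moved[:, None, :] - dst[None, :, :], axis=-1)
        R, t = kabsch(src, dst[d.argmin(axis=1)])
    return R, t

# Synthetic check: a random "model" cloud and a rigidly transformed "scene" copy.
rng = np.random.default_rng(42)
model = rng.uniform(-1.0, 1.0, size=(60, 3))
c, s = np.cos(0.7), np.sin(0.7)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.3, -0.2, 1.0])
scene = model @ R_true.T + t_true

# Putative matches (model index, scene index); the first 15 are corrupted on purpose.
matches = np.stack([np.arange(60), np.arange(60)], axis=1)
matches[:15, 1] = rng.integers(0, 60, size=15)

R_est, t_est = ransac_registration(model, scene, matches)
R_est, t_est = icp(model, scene, R_est, t_est)
```

The coarse-to-fine split reflects the usual division of labour: RANSAC tolerates wrong matches but yields only a rough pose, while ICP needs a good initial pose but then tightens the alignment using all points. Production pipelines typically replace the brute-force nearest-neighbour search with a k-d tree.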

We use segmentation masks provided by SAM6D [C].

[A] Poiesi et al.: Learning general and distinctive 3D local deep descriptors for point cloud registration, IEEE TPAMI 2023
[B] Oquab et al.: DINOv2: Learning robust visual features without supervision, arXiv 2023
[C] Lin et al.: SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation, CVPR 2024

Authors: Andrea Caraffa, Davide Boscaini, Amir Hamza and Fabio Poiesi

Computer specifications GPU A40; CPU Intel(R) Xeon(R) Silver 4316 @ 2.30GHz

Public submissions

Date              Submission name     Dataset
2024-09-17 15:51  -                   LM-O
2024-09-17 16:01  -                   TUD-L
2024-09-17 20:16  -                   IC-BIN
2024-09-17 20:17  -                   HB
2024-09-17 20:17  -                   T-LESS
2024-09-17 22:16  -                   YCB-V
2024-09-18 07:29  -                   ITODD
2024-09-19 14:41  SAR threshold = -1  LM-O
2024-09-19 14:42  SAR threshold = -1  TUD-L
2024-09-19 14:43  SAR threshold = -1  IC-BIN
2024-09-20 09:48  SAR threshold = -1  T-LESS
2024-09-20 09:48  SAR threshold = -1  ITODD
2024-09-20 09:48  SAR threshold = -1  YCB-V
2024-09-20 09:50  SAR threshold = -1  HB
2024-09-20 10:27  SAR threshold = -1  T-LESS