Submission: AlignPose (FoundPose+FeatRef+Megapose) (3PT)/IPD

Submission name
Submission time (UTC) Jan. 29, 2026, 4:18 p.m.
User alignpose
Task Model-based 6D detection of unseen objects
Dataset IPD
Description The upstream 3PT-Detection detector produces a large number of 2D detections per scene. Because our method refines each detection individually, its runtime scales with the detection count. To reduce computation, we used the provided target object IDs to filter out detections of non-target objects. This filtering does not affect the AP scores, but it lowers the runtime, which would make a runtime comparison with other methods unfair. We therefore do not report the time, though it exceeds 200 s due to the large number of detections.
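The filtering step described above can be illustrated with a minimal sketch. It is not the submission's actual code: the detection record layout (a dict with an `obj_id` key) and the function name `filter_detections` are assumptions made for illustration only.

```python
from typing import Iterable


def filter_detections(detections: list[dict], target_obj_ids: Iterable[int]) -> list[dict]:
    """Keep only detections whose object ID is among the scene's target objects.

    `detections` is assumed to be a list of dicts with at least an "obj_id" key
    (hypothetical layout; the real detection format may differ).
    """
    targets = set(target_obj_ids)  # set lookup keeps the filter linear in the detection count
    return [det for det in detections if det["obj_id"] in targets]


# Example: four detections, of which only objects 1 and 3 are targets.
example_detections = [
    {"obj_id": 1, "score": 0.9},
    {"obj_id": 2, "score": 0.8},
    {"obj_id": 3, "score": 0.7},
    {"obj_id": 1, "score": 0.4},
]
filtered = filter_detections(example_detections, target_obj_ids=[1, 3])
```

Since every remaining detection is still refined individually, the refinement cost in this sketch shrinks in proportion to the number of detections removed.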
Evaluation scores
AP:0.798
AP_25:0.743
AP_25_mm:0.582
AP_MSPD:0.852
AP_MSSD:0.743
AP_MSSD_mm:0.582
average_time_per_image:-1.000

Method: AlignPose (FoundPose+FeatRef+Megapose) (3PT)

User alignpose
Publication
Implementation
Training image modalities None
Test image modalities RGB
Description

Detections: 3PT
(run for all available views)

Single-view: FoundPose + FeatRef + Megapose
(run for all available views)

Multi-view: AlignPose

The presented results were obtained with the AlignPose [1] multi-view pipeline. Each view is first processed independently using 2D detections from 3PT-Detection and SAM2 [2] segmentations. For each view, initial pose estimates are obtained with the single-view method FoundPose [3] and refined with FoundPose feature-metric refinement and MegaPose [4] refinement. Multi-view consistent poses are then produced by the AlignPose pipeline, which aggregates all single-view candidates with non-maximum suppression and refines them with multi-view feature-metric refinement.
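As a rough illustration of the aggregation step, the sketch below applies greedy non-maximum suppression to pose candidates collected from several views. The description above does not state the suppression criterion used by AlignPose, so everything here is an assumption: candidates are taken to be already expressed in a shared world frame, duplicates are detected by translation distance alone, and the names `PoseCandidate`, `pose_nms`, and `dist_thresh_mm` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PoseCandidate:
    obj_id: int
    score: float
    # Object position in a shared world frame, in millimetres (assumed convention).
    translation: tuple[float, float, float]


def pose_nms(candidates: list[PoseCandidate], dist_thresh_mm: float = 20.0) -> list[PoseCandidate]:
    """Greedy NMS: visit candidates by descending score and drop any candidate
    lying within `dist_thresh_mm` of an already kept candidate of the same object."""
    kept: list[PoseCandidate] = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        is_duplicate = any(
            k.obj_id == cand.obj_id
            and math.dist(k.translation, cand.translation) < dist_thresh_mm
            for k in kept
        )
        if not is_duplicate:
            kept.append(cand)
    return kept


# Example: two views see the same instance of object 1 about 5 mm apart.
candidates = [
    PoseCandidate(obj_id=1, score=0.9, translation=(0.0, 0.0, 0.0)),
    PoseCandidate(obj_id=1, score=0.8, translation=(5.0, 0.0, 0.0)),
    PoseCandidate(obj_id=1, score=0.7, translation=(100.0, 0.0, 0.0)),
    PoseCandidate(obj_id=2, score=0.5, translation=(1.0, 0.0, 0.0)),
]
kept = pose_nms(candidates)
```

In this example the lower-scoring near-duplicate of object 1 is suppressed, while the distant instance of object 1 and the candidate for object 2 survive. A fuller criterion would likely also compare rotations and handle object symmetries, which this sketch deliberately omits.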

[1] Anonymous. "AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment."
[2] Ravi, Nikhila, et al. "SAM 2: Segment Anything in Images and Videos." The Thirteenth International Conference on Learning Representations, 2025.
[3] Örnek, Evin Pınar, et al. "FoundPose: Unseen Object Pose Estimation with Foundation Features." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024.
[4] Labbé, Yann, et al. "MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare." Conference on Robot Learning, 2022.

Computer specifications