Method: WAPR.v2(MUSE)

User SEU_WYL
Publication
Implementation
Views Single
Test image modalities RGB-D
Description

WAPR.v2 uses the same zero-shot 2D detector setting as FRTPose-WAPR.v2, but removes the FRTPose component, resulting in a fully zero-shot pipeline. This change improves inference speed while causing a slight reduction in accuracy. For each detected 2D bounding box, WAPR.v2 initializes 16 uniformly sampled pose hypotheses and refines each hypothesis for five iterations using the WAPR module. The refinement is wide-angle and tolerates initialization errors of up to ±90°. The refined hypotheses are then evaluated by the FoundationPose pose scoring network, and the highest-scoring pose is returned as the final prediction.
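The hypothesize, refine, score, and select loop described above can be sketched as follows. This is a minimal illustration, not the WAPR.v2 implementation: `select_best_pose`, `refine_step`, and `score` are hypothetical names, and the toy usage replaces 6D poses with a single angle so that the block runs without any model.

```python
from typing import Callable, Iterable, TypeVar

Pose = TypeVar("Pose")


def select_best_pose(
    hypotheses: Iterable[Pose],
    refine_step: Callable[[Pose], Pose],  # stand-in for one WAPR refinement iteration
    score: Callable[[Pose], float],       # stand-in for the pose scoring network
    num_iterations: int = 5,              # five refinement iterations, per the description
) -> Pose:
    """Refine every hypothesis, then return the one with the highest score."""
    refined = []
    for pose in hypotheses:
        for _ in range(num_iterations):
            pose = refine_step(pose)
        refined.append(pose)
    return max(refined, key=score)


# Toy usage (assumption: a pose is one angle in degrees, not a full 6D pose).
TARGET = 40.0
initial = [i * 360.0 / 16 for i in range(16)]  # 16 evenly spaced hypotheses
best = select_best_pose(
    initial,
    refine_step=lambda a: a + 0.5 * (TARGET - a),  # toy update: halve the error
    score=lambda a: -abs(a - TARGET),              # toy score: closer is better
)
```

The sequential loop is only for readability; as the description notes, the hypotheses are processed as a parallel workload.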

The computational cost can be approximated by the total number of pose operations. For a task with 30 objects and an image containing 100 detected 2D boxes, the workload before any filtering is 30 (objects) × 100 (detections) × 16 (hypotheses) × (5 refinements + 1 scoring) = 288,000 pose operations. Because these operations are executed in parallel, this figure indicates the overall scale of the computation rather than a count of sequential iterations. The average runtime per image is therefore the time required to process this parallel workload.
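The workload estimate above is a plain product, which the following sketch reproduces. The function name `pose_operations` and its parameter names are illustrative assumptions; the default values are the ones stated in the description.

```python
def pose_operations(
    num_objects: int,
    num_detections: int,
    num_hypotheses: int = 16,  # pose hypotheses per detected box
    refine_iters: int = 5,     # refinement iterations per hypothesis
    scoring_passes: int = 1,   # one scoring pass per refined hypothesis
) -> int:
    """Approximate the number of pose operations for one image."""
    per_hypothesis = refine_iters + scoring_passes
    return num_objects * num_detections * num_hypotheses * per_hypothesis


# The example from the text: 30 objects and 100 detected 2D boxes.
total = pose_operations(num_objects=30, num_detections=100)
print(total)  # 288000
```

Since the count is linear in the number of detections, removing half of the boxes halves the estimate.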

To further reduce runtime, two filtering stages are applied. First, low-confidence 2D detections are removed before pose refinement, which directly reduces the number of candidate boxes; in the 6D detection setting, a fixed confidence threshold of 0.35 is used for all datasets and all objects. Second, coarse pruning is performed during the early refinement iterations to discard unlikely pose hypotheses, which reduces the number of refinements executed in later iterations. All hyperparameters, including the number of pose hypotheses, the number of refinement iterations, and the filtering thresholds, are kept identical across datasets and objects, as required by the BOP challenge setting. Under this standardized configuration, WAPR.v2 satisfies the runtime criterion of the 6D detection task, with an average runtime below one second per image.
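The two filtering stages can be sketched as below. Only the 0.35 threshold comes from the description. The detection dictionary layout, the inclusive `>=` comparison, and the top-`keep` pruning rule with its `coarse_score` callable are assumptions made for illustration, since the actual pruning criterion is not specified.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

H = TypeVar("H")

CONFIDENCE_THRESHOLD = 0.35  # fixed for all datasets and objects, per the description


def filter_detections(
    detections: Sequence[Dict], threshold: float = CONFIDENCE_THRESHOLD
) -> List[Dict]:
    """Stage 1: drop low-confidence boxes (the inclusive >= is an assumption)."""
    return [d for d in detections if d["confidence"] >= threshold]


def prune_hypotheses(
    hypotheses: Sequence[H], coarse_score: Callable[[H], float], keep: int
) -> List[H]:
    """Stage 2: keep the `keep` best hypotheses (hypothetical pruning rule)."""
    ranked = sorted(hypotheses, key=coarse_score, reverse=True)
    return ranked[:keep]


# Toy usage with made-up detections and scalar stand-ins for hypotheses.
detections = [
    {"label": "a", "confidence": 0.90},
    {"label": "b", "confidence": 0.20},
    {"label": "c", "confidence": 0.35},
]
kept = filter_detections(detections)
survivors = prune_hypotheses([0.1, 0.8, 0.5, 0.3], coarse_score=lambda h: h, keep=2)
```

Stage 1 shrinks the detection factor of the workload before any refinement runs, while stage 2 shrinks the hypothesis factor only for the later iterations.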

Computer specifications

Public submissions

Date Submission name Dataset
2025-11-02 10:53 - T-LESS
2025-11-02 10:54 - TUD-L
2025-11-02 10:54 - LM-O
2025-11-02 10:54 - IC-BIN
2025-11-02 10:55 - ITODD
2025-11-02 10:55 - HB
2025-11-02 10:55 - YCB-V
2026-02-24 03:54 - T-LESS
2026-02-24 03:54 - TUD-L
2026-02-24 03:54 - LM-O
2026-02-24 03:55 - IC-BIN
2026-02-24 03:55 - ITODD
2026-02-24 03:56 - HB
2026-02-24 03:56 - YCB-V