Submission: WAPR.v2(MUSE)/TUD-L

Submission name
Submission time (UTC) Nov. 2, 2025, 12:16 p.m.
User SEU_WYL
Task Model-based 6D detection of unseen objects
Dataset TUD-L
Description
Evaluation scores
AP: 0.965
AP_25: 0.968
AP_25_mm: 0.492
AP_MSPD: 0.961
AP_MSSD: 0.968
AP_MSSD_mm: 0.492
average_time_per_image: 0.408

Method: WAPR.v2(MUSE)

User SEU_WYL
Publication
Implementation
Training image modalities RGB-D
Test image modalities RGB-D
Description

WAPR.v2 uses the same zero-shot 2D detector setting as FRTPose-WAPR.v2 but removes the FRTPose component, which makes it a fully zero-shot method. This change speeds up inference at the cost of a slight reduction in accuracy. As in FRTPose-WAPR.v2, WAPR.v2 initializes 12/16 uniformly sampled candidate poses for each detected object and applies the WAPR module to refine each pose five times. The WAPR module supports wide-angle pose refinement and tolerates initialization errors of up to ±90°. Finally, the FoundationPose pose scoring network scores the refined poses, and the highest-scoring pose is selected as the final result.
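The refine-then-score selection described above can be sketched as follows. This is only an illustration of the control flow, not the submission's implementation: `select_best_pose`, `refine`, and `score` are hypothetical names, with `refine` standing in for one pass of the WAPR module and `score` standing in for the FoundationPose scoring network. The real method runs these steps as batched network inference rather than a Python loop.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Pose = TypeVar("Pose")


def select_best_pose(
    candidates: Sequence[Pose],
    refine: Callable[[Pose], Pose],   # hypothetical stand-in for one WAPR refinement pass
    score: Callable[[Pose], float],   # hypothetical stand-in for the pose scoring network
    num_refinements: int = 5,         # the description states five refinements per pose
) -> Tuple[Pose, float]:
    """Refine every candidate pose, score the results, return the best one."""
    if not candidates:
        raise ValueError("at least one candidate pose is required")

    refined = []
    for pose in candidates:
        # Apply the refiner repeatedly to the same candidate.
        for _ in range(num_refinements):
            pose = refine(pose)
        refined.append(pose)

    # Score each refined pose once and keep the highest-scoring one.
    scores = [score(pose) for pose in refined]
    best = max(range(len(refined)), key=scores.__getitem__)
    return refined[best], scores[best]
```

With a toy one-dimensional "pose" (an angular error in degrees), a refiner that halves the error, and a score equal to the negative absolute error, the function returns the candidate that ends up closest to zero after five refinements.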

For a task with 30 different objects and an image containing 100 detected 2D bounding boxes, the total computational workload is 30 (objects) × 100 (detections) × 24 (candidate poses) × (5 (refinements) + 1 (scoring)) = 432,000 operations. These 432,000 operations run in parallel; the figure describes the overall scale of computation, not a count of sequential iterations. The average computation time per image reported for this configuration is the time needed to process this parallel workload. If the 2D detection confidence scores and classifications are considered reliable, low-confidence bounding boxes can be coarsely filtered out first. This significantly reduces the number of candidate detections and brings the average computation time per image below one second.
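The workload arithmetic above can be checked with a few lines of Python. The function names, the confidence values, and the 0.5 threshold below are assumptions made for illustration; the description does not state which threshold, if any, the submission uses.

```python
from typing import List


def workload(
    num_objects: int,
    num_detections: int,
    num_candidates: int,
    num_refinements: int = 5,
    num_scorings: int = 1,
) -> int:
    """Number of parallel operations: one per refinement or scoring pass."""
    return num_objects * num_detections * num_candidates * (num_refinements + num_scorings)


def filter_detections(confidences: List[float], threshold: float) -> List[float]:
    """Coarse filter: keep only detections at or above the confidence threshold."""
    return [c for c in confidences if c >= threshold]


# Figures from the description: 30 objects, 100 detections, 24 candidate poses.
print(workload(30, 100, 24))  # 432000

# Hypothetical confidences: 10 confident boxes and 90 weak ones.
confidences = [0.9] * 10 + [0.1] * 90
kept = filter_detections(confidences, threshold=0.5)  # threshold is an assumption
print(workload(30, len(kept), 24))  # 43200
```

Because the workload is linear in the number of detections, discarding 90% of the boxes in this made-up example reduces the operation count by the same factor.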

Computer specifications