Submission: gfreedet2-2d_fastsam/HOPEv2

Submission name
Submission time (UTC) Sept. 1, 2025, 9:35 a.m.
User gfreedet
Task Model-free 2D detection of unseen objects
Dataset HOPEv2
Description
Evaluation scores
AP:0.369
AP50:0.561
AP75:0.392
AP_large:0.426
AP_medium:0.034
AP_small:0.089
AR1:0.409
AR10:0.470
AR100:0.470
AR_large:0.538
AR_medium:0.090
AR_small:0.089
average_time_per_image:0.287

Method: gfreedet2-2d_fastsam

User gfreedet
Publication Not yet
Implementation
Training image modalities None
Test image modalities RGB
Description

Training data: None

Onboarding data:
Model-free: we reconstruct a 3D Gaussian Splatting (3DGS) model of each object from the static onboarding sequences and render templates from it for 2D detection. The template set consists of 162 rendered images plus 64 images sampled from the static onboarding sequences (226 in total). Onboarding one object, i.e. reconstructing its GS model and generating its templates/descriptors, takes about 105 s on average.
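
The submission does not say how the 162 rendering viewpoints are chosen. As an illustration only: 162 is the vertex count of an icosahedron subdivided twice (12 -> 42 -> 162), which is a common way to sample roughly uniform viewing directions around an object. The sketch below assumes that scheme; the function name `icosphere_viewpoints` is hypothetical and not taken from the submission.

```python
import numpy as np

# Base icosahedron (12 vertices, 20 triangular faces).
_PHI = (1 + 5 ** 0.5) / 2
_BASE_VERTS = [(-1, _PHI, 0), (1, _PHI, 0), (-1, -_PHI, 0), (1, -_PHI, 0),
               (0, -1, _PHI), (0, 1, _PHI), (0, -1, -_PHI), (0, 1, -_PHI),
               (_PHI, 0, -1), (_PHI, 0, 1), (-_PHI, 0, -1), (-_PHI, 0, 1)]
_BASE_FACES = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
               (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
               (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
               (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]


def icosphere_viewpoints(subdivisions=2):
    """Return unit view directions, shape (N, 3), from a subdivided icosahedron.

    ASSUMPTION: the submission's actual viewpoint sampling is not documented;
    this only shows one scheme that yields 162 directions (subdivisions=2).
    """
    verts = [np.asarray(v, dtype=float) / np.linalg.norm(v) for v in _BASE_VERTS]
    faces = list(_BASE_FACES)
    for _ in range(subdivisions):
        cache = {}  # edge (i, j) -> index of its midpoint vertex

        def midpoint(i, j):
            key = (min(i, j), max(i, j))
            if key not in cache:
                m = verts[i] + verts[j]
                verts.append(m / np.linalg.norm(m))  # project onto unit sphere
                cache[key] = len(verts) - 1
            return cache[key]

        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # Each triangle is split into four smaller triangles.
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces
    return np.stack(verts)


viewpoints = icosphere_viewpoints(2)
print(viewpoints.shape)
```

Scaling each direction by a camera distance and pointing the camera at the object center would give one render pose per direction.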

Notes: To reconstruct 3DGS models in a unified way from both pinhole and fisheye images, we preprocess the static onboarding images with an adaptive perspective cropping strategy. The object Gaussians are then rapidly trained on these cropped pinhole images for 10k iterations. From the resulting GS models, we prepare the templates described above. For 2D detection, we use a modified CNOS augmented with appearance scores, with DINOv2 as the descriptor model and FastSAM as the segmentor.
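
The matching step of a CNOS-style detector can be sketched as follows, as a rough illustration rather than the submission's implementation. It assumes each segmentor proposal and each template has already been encoded into a single descriptor vector, scores a proposal against an object by the mean of its top-k cosine similarities to that object's templates, and optionally averages in an appearance score. The function name `match_proposals`, the `top_k=5` default, and the equal weighting of the two scores are assumptions made for this sketch; how the submission computes and combines its appearance scores is not stated.

```python
import numpy as np


def _l2_normalize(x, eps=1e-8):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def match_proposals(proposal_desc, template_desc, appearance=None, top_k=5):
    """Assign each proposal to its best-matching object.

    proposal_desc: (P, D) descriptors of segmentor proposals.
    template_desc: (O, T, D) descriptors of T templates for each of O objects.
    appearance:    optional (P, O) appearance scores in [0, 1]
                   (HYPOTHETICAL input; its computation is not specified).
    Returns (object_ids, scores), both of shape (P,).
    """
    p = _l2_normalize(np.asarray(proposal_desc, dtype=float))
    t = _l2_normalize(np.asarray(template_desc, dtype=float))

    # Cosine similarity of every proposal to every template: (P, O, T).
    sim = np.einsum("pd,otd->pot", p, t)

    # Semantic score: mean of the k most similar templates per object.
    k = min(top_k, sim.shape[-1])
    semantic = np.sort(sim, axis=-1)[..., -k:].mean(axis=-1)  # (P, O)

    # ASSUMPTION: equal-weight average with the appearance score, if given.
    score = semantic if appearance is None else 0.5 * (semantic + appearance)

    object_ids = score.argmax(axis=1)
    return object_ids, score[np.arange(len(object_ids)), object_ids]


# Toy usage: 2 objects, 3 templates each, 4-D descriptors.
templates = np.zeros((2, 3, 4))
templates[0, :, 0] = 1.0  # object 0 templates point along axis 0
templates[1, :, 1] = 1.0  # object 1 templates point along axis 1
proposals = np.array([[0.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0]])
ids, scores = match_proposals(proposals, templates)
print(ids, scores)
```

In practice the returned scores would be thresholded to discard background proposals before boxes are reported.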

Authors: Temporary Anonymity

Computer specifications NVIDIA L20