BOP: Benchmark for 6D Object Pose Estimation

Method: gfreedet2-6d-default2d

User	gfreedet
Publication	Not yet
Implementation
Views	Single
Test image modalities	RGB
Description	Training data: None.
Onboarding data: Model-free. Static onboarding sequences are used to reconstruct 3DGS models, which are then rendered into templates for 2D detection (162 rendered images + 64 sampled static onboarding images) and for coarse pose estimation (~800 rendered images). Reconstructing one GS object and generating its templates/descriptors for coarse 6D detection takes about 270 s on average (we use the provided default 2D detections).
Notes: To reconstruct 3DGS models from pinhole and fisheye images in a unified way, we preprocess the static onboarding images with an adaptive perspective cropping strategy. The object Gaussians are then trained on these cropped pinhole images for 10k iterations. With the resulting GS models, we prepare the templates described above. For 2D detection, we use the provided default model-free CNOS (FastSAM) - Static onboarding. For coarse 6D pose detection, we extend FoundPose to the model-free setting by using templates rendered from 3DGS, and further extend it to apply correct, unified perspective cropping to pinhole and fisheye query images. For fine 6D pose detection, we extend GoTrack (which estimates pose via render-to-observation flow and PnP/RANSAC) to the model-free setting by using the gsplat renderer.
Authors: Temporary Anonymity
Computer specifications	NVIDIA L20
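
The description mentions perspective cropping of pinhole/fisheye images but does not give details. The following is only a minimal sketch of one common way to set up such a crop, not the authors' implementation: build a virtual pinhole camera whose optical axis points at the object center, with a focal length chosen so that the object fills the crop. All function names and parameters (look_at_rotation, crop_intrinsics, project_to_crop, half_angle_rad, crop_size) are hypothetical, and the OpenCV-style camera frame (x right, y down, z forward) is an assumption. Resampling actual fisheye pixels would also require unprojecting through the fisheye camera model, which is not shown here.

```python
import math

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return [c / n for c in v]

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def look_at_rotation(ray, down=(0.0, 1.0, 0.0)):
    """Rotation mapping source-camera coordinates to a virtual camera whose
    z-axis points along `ray` (the viewing ray toward the object center).
    Rows are the virtual camera's x, y, z axes in the source frame.
    Assumes x right, y down, z forward; fails if `ray` is parallel to `down`."""
    z = _normalize(ray)
    x = _normalize(_cross(down, z))
    y = _cross(z, x)
    return [x, y, z]

def crop_intrinsics(half_angle_rad, crop_size):
    """Pinhole intrinsics (f, cx, cy) of a square crop such that a cone of
    half-angle `half_angle_rad` around the optical axis touches the crop border."""
    f = 0.5 * crop_size / math.tan(half_angle_rad)
    c = 0.5 * crop_size
    return f, c, c

def project_to_crop(point, R, f, cx, cy):
    """Project a 3D point (source-camera coordinates) into the virtual crop."""
    xv, yv, zv = (sum(row[i] * point[i] for i in range(3)) for row in R)
    if zv <= 0.0:
        raise ValueError("point is behind the virtual camera")
    return f * xv / zv + cx, f * yv / zv + cy

# Hypothetical usage: object center seen 45 degrees off-axis, 224 px crop.
R = look_at_rotation([1.0, 0.0, 1.0])
f, cx, cy = crop_intrinsics(math.radians(30.0), 224)
u, v = project_to_crop([2.0, 0.0, 2.0], R, f, cx, cy)
```

In this sketch the object center projects to the center of the crop regardless of where it appeared in the source image, which is the property that lets pinhole and fisheye inputs be handled by the same downstream template matching.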