| User | IPT |
|---|---|
| Publication | Anonymous |
| Implementation | PyTorch |
| Views | Multi |
| Test image modalities | RGB |
| Description | IPT-Pose-Industrial: A Two-Stage Transformer for Pose Estimation. We are submitting IPT-Pose-Industrial to the BOP Challenge 2025. This is a two-stage foundation model composed of an object detector (IPT-Detection) and a pose refinement network (IPT-Pose). IPT is a one-shot, image- and CAD-prompted object detection network. It employs a vision transformer backbone to simultaneously regress 2D bounding boxes, coarse object orientations, and object classifications. Initial poses are then estimated from IPT's outputs and passed to the pose refinement network, which uses point-to-point correspondences across multiple views to refine the pose. Dataset and Training Strategy: Our model is trained exclusively on large-scale synthetic data, generated by rendering scenes in Blender with a diverse collection of over 100,000 unique CAD models gathered from public CAD model collections and other sources. The network is trained on over 500,000 synthetically rendered images to ensure robustness and generalization across a wide range of object instances and environmental conditions. Onboarding Procedure (less than 5 minutes per object): For each new CAD model, a set of reference templates is generated, showing the CAD model in various canonical orientations. These templates serve as a reference for the model during IPT's inference process. Specifics: We use 4 views, except for IPD, where we use only 3 views. We use only RGB from each view, so each sensor is treated as RGB-only. Pose accuracy comes from pixel-level accurate multi-view pose refinement. Authors: Temporary Anonymity |
| Computer specifications | V100 |
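
The description says that initial poses are estimated from the detector's 2D bounding boxes and coarse orientations, but it does not say how. The sketch below shows one common way such an estimate could be formed under a pinhole camera model; it is an illustration under stated assumptions, not the submission's actual method. The function name `initial_pose`, the use of the CAD model diameter as a scale cue, and all parameter names are hypothetical.

```python
def initial_pose(bbox, rotation, fx, fy, cx, cy, object_diameter):
    """Rough 4x4 object-to-camera pose from one detection (illustrative only).

    bbox:            (x_min, y_min, x_max, y_max) in pixels.
    rotation:        3x3 coarse rotation (nested sequences), e.g. from the detector.
    fx, fy, cx, cy:  pinhole intrinsics in pixels.
    object_diameter: CAD model diameter in metres (assumed known from onboarding).
    """
    x_min, y_min, x_max, y_max = bbox
    size_px = max(x_max - x_min, y_max - y_min)
    if size_px <= 0:
        raise ValueError("degenerate bounding box")

    # Similar triangles: projected size is roughly f * diameter / depth.
    z = 0.5 * (fx + fy) * object_diameter / size_px

    # Back-project the box centre to the estimated depth.
    u = 0.5 * (x_min + x_max)
    v = 0.5 * (y_min + y_max)
    t = ((u - cx) * z / fx, (v - cy) * z / fy, z)

    # Assemble [R | t] with the homogeneous bottom row.
    pose = [list(rotation[i]) + [t[i]] for i in range(3)]
    pose.append([0.0, 0.0, 0.0, 1.0])
    return pose


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for row in initial_pose((270, 190, 370, 290), identity,
                            500.0, 500.0, 320.0, 240.0, 0.1):
        print(row)
```

An estimate of this kind is only a starting point: the depth is sensitive to box tightness and object aspect ratio, which is why a refinement stage such as the multi-view one described above would be needed to reach the final pose.
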

| Date | Submission name | Dataset |
|---|---|---|
| 2025-10-01 20:56 | - | IPD |
| 2025-10-01 20:58 | - | XYZ-IBD |
| 2025-10-01 21:00 | - | ITODD-MV |
| 2025-10-01 21:05 | - | IPD |
| 2025-10-01 21:05 | - | XYZ-IBD |
| 2025-10-01 21:05 | - | ITODD-MV |
| 2025-10-01 23:49 | - | XYZ-IBD |