We first use an object detector, CenterNet, to detect objects in the image. Given each detected object bounding box, we crop the image and use PVNet to detect the object's 2D keypoints, which are used to compute the 6D pose with the PnP algorithm.
We train one network per object for both CenterNet and PVNet, using only the PBR synthetic data provided by the BOP Challenge 2020.
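The crop step implies a small piece of coordinate bookkeeping: keypoints predicted inside the crop have to be expressed in full-image pixels before PnP can be run with the original camera intrinsics. The following is a minimal sketch of that step, not the submission's actual code. The helper names `expand_box` and `crop_to_image`, the 10% box margin, and the square 256-pixel crop are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in image pixels
Point = Tuple[float, float]


def expand_box(box: Box, img_w: int, img_h: int, margin: float = 0.1) -> Box:
    """Grow the detected box by a relative margin and clip it to the image."""
    x0, y0, x1, y1 = box
    dx, dy = margin * (x1 - x0), margin * (y1 - y0)
    return (max(0.0, x0 - dx), max(0.0, y0 - dy),
            min(float(img_w), x1 + dx), min(float(img_h), y1 + dy))


def crop_to_image(points: List[Point], box: Box, crop_size: int) -> List[Point]:
    """Map keypoints from the resized square crop back to full-image pixels."""
    x0, y0, x1, y1 = box
    sx, sy = (x1 - x0) / crop_size, (y1 - y0) / crop_size
    return [(x0 + u * sx, y0 + v * sy) for u, v in points]


# Hypothetical detection in a 640x480 image, and keypoints in a 256x256 crop.
box = expand_box((100.0, 50.0, 300.0, 250.0), img_w=640, img_h=480, margin=0.1)
keypoints_crop = [(0.0, 0.0), (128.0, 128.0), (256.0, 256.0)]
keypoints_image = crop_to_image(keypoints_crop, box, crop_size=256)
```

The mapped keypoints, together with the corresponding 3D keypoints on the object model and the camera intrinsics, could then be passed to a PnP solver such as OpenCV's `cv2.solvePnP`. An alternative design would keep the keypoints in crop coordinates and adjust the intrinsics for the crop instead.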
Differences between the evaluated method and the linked publication:
- We use a 2D detector to help PVNet handle multiple instances and to reduce the domain gap between synthetic and real data.
- We use an offset field instead of a vector field.
- We use only the PBR synthetic data provided by the BOP Challenge 2020 to train our model.
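The description does not define "offset field", so the sketch below illustrates one common reading and is an assumption. The original PVNet regresses, for every object pixel, a unit vector pointing toward each keypoint and recovers the keypoint by RANSAC voting. An offset field would instead regress the full, unnormalized displacement from the pixel to the keypoint, so that each pixel votes for a location directly. The function names and the median aggregation are hypothetical; the submission may aggregate votes differently.

```python
import math
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def unit_vector_target(pixel: Point, keypoint: Point) -> Point:
    """PVNet-style target: unit direction from the pixel to the keypoint."""
    dx, dy = keypoint[0] - pixel[0], keypoint[1] - pixel[1]
    norm = math.hypot(dx, dy) or 1.0  # avoid division by zero at the keypoint
    return (dx / norm, dy / norm)


def offset_target(pixel: Point, keypoint: Point) -> Point:
    """Offset target (assumed reading): full displacement to the keypoint."""
    return (keypoint[0] - pixel[0], keypoint[1] - pixel[1])


def keypoint_from_offsets(pixels: List[Point], offsets: List[Point]) -> Point:
    """Each pixel votes pixel + offset; the median is a simple robust aggregate."""
    xs = [p[0] + o[0] for p, o in zip(pixels, offsets)]
    ys = [p[1] + o[1] for p, o in zip(pixels, offsets)]
    return (median(xs), median(ys))


# Toy example with three object pixels and one ground-truth keypoint.
pixels = [(10.0, 10.0), (20.0, 15.0), (30.0, 40.0)]
keypoint = (25.0, 30.0)
direction = unit_vector_target(pixels[0], keypoint)
offsets = [offset_target(p, keypoint) for p in pixels]
estimate = keypoint_from_offsets(pixels, offsets)
```

With perfect offsets every vote lands on the keypoint itself. With unit vectors a single pixel only constrains the keypoint to a ray, which is why PVNet needs pairs of pixels and a voting scheme to localize it.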