https://github.com/EricLee0224/PAD
- an open-source benchmark library: provides 10 state-of-the-art methods from 8 distinct paradigms
- MAD (MAD-Sim and MAD-Real), using 20 complex-shaped LEGO toys, including 4K views with various poses
- 11,000 images from different viewpoints covering a wide range of poses.
- #Normal: 5,231, #Abnormal: 4,902
- three types of anomalies, carefully selected to reflect both simulated and real-world environments
- MAD-Real: only 40-50 views per LEGO toy ⇒ poor NeRF & GS reconstruction

- three components:
- build NeRF
- two-stage pose estimation process, from coarse to fine
- image retrieval by descriptor matching: query image Iq ⇒ 64-dim feature vector via an EfficientNet-B4 model [43]
- concatenation of features from layer1 to layer5 ⇒ 64-dim? (unclear how the concatenated multi-layer features reduce to only 64 dims)
- pose refined using iNeRF [55] ⇒ reference image Î and its pose
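The coarse retrieval step above can be sketched as nearest-neighbor matching over the 64-dim descriptors; a minimal sketch (function name, descriptor set, and dimensions are illustrative, not the paper's exact implementation):

```python
import numpy as np

def retrieve_reference(query_desc: np.ndarray, ref_descs: np.ndarray) -> int:
    """Return the index of the reference view whose 64-dim descriptor
    is closest (L2 distance) to the query descriptor.
    query_desc: (64,), ref_descs: (N, 64)."""
    dists = np.linalg.norm(ref_descs - query_desc[None, :], axis=1)
    return int(np.argmin(dists))

# toy usage: 100 reference views with random 64-dim descriptors;
# the query is a slightly perturbed copy of view 42
rng = np.random.default_rng(0)
refs = rng.normal(size=(100, 64))
q = refs[42] + 0.01 * rng.normal(size=64)
print(retrieve_reference(q, refs))  # → 42
```

The retrieved view's pose then serves as the coarse initialization that iNeRF refines.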
- Anomaly detection and localization: bottom right box
- features from layer1 to layer5 are resized to the same size as the query image, then concatenated into feature maps fq and f̂
- ⇒ feature difference map d(u) = fq(u) - f̂(u)
- ⇒ score map s(u) = ||d(u)||_2, i.e. the L2 norm of the feature difference
- frozen ("locked") CNN: EfficientNet-B4
- u is the spatial position index (height and width combined, for simplicity)
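The score-map computation above is a per-location L2 norm over the concatenated feature channels; a minimal sketch (shapes and names are illustrative; the upsampling of each layer's features to image size is assumed done beforehand):

```python
import numpy as np

def score_map(fq: np.ndarray, fref: np.ndarray) -> np.ndarray:
    """fq, fref: (C, H, W) concatenated feature maps of the query and
    the retrieved/rendered reference image.
    Returns the (H, W) anomaly score map s(u) = ||fq(u) - fref(u)||_2."""
    d = fq - fref                          # feature difference map d(u)
    return np.sqrt((d ** 2).sum(axis=0))   # L2 norm over channels at each u

# toy usage: 8 channels, 4x4 spatial grid, one anomalous position
fq = np.zeros((8, 4, 4))
fref = np.zeros((8, 4, 4))
fref[:, 2, 3] = 1.0                        # features differ only at (2, 3)
s = score_map(fq, fref)
print(s[2, 3])                             # → 2.8284... (= sqrt(8))
```

Image-level anomaly scores can then be taken as, e.g., the maximum of s over all positions u.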
Experiments
Table 2: the compared methods are all outdated.
Table 3: Dense-view to sparse-view object anomaly detection results on MAD-Real
- PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection, NeurIPS 2023
- [55] iNeRF: Inverting Neural Radiance Fields for Pose Estimation, IROS 2021