https://github.com/EricLee0224/PAD
- an open-source benchmark library: provides 10 state-of-the-art methods from 8 distinct paradigms
- MAD (MAD-Sim and MAD-Real), using 20 complex-shaped LEGO toys, including 4K views with various poses
- 11,000 images from different viewpoints covering a wide range of poses.
- #Normal: 5,231, #Abnormal: 4,902
- three types of anomalies, carefully selected to reflect both simulated and real-world environments
- MAD-Real: only 40-50 views per LEGO toy ⇒ poor NeRF & GS reconstruction

- three components:
- build NeRF
- two-stage pose estimation process, from coarse to fine
- image retrieval by descriptor matching: query image Iq ⇒ 64-dim feature vector via an EfficientNet-B4 model [43]
- concatenation of features from layer1 to layer5 ⇒ 64-dim? (unclear how the concatenated multi-layer features reduce to only 64 dims)
- pose refined using iNeRF [55] ⇒ reference image Î and its pose
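The coarse retrieval step above can be sketched as nearest-neighbor matching over the 64-dim descriptors; a minimal sketch (function name, descriptor set, and dimensions are illustrative, not the paper's exact implementation):

```python
import numpy as np

def retrieve_reference(query_desc: np.ndarray, ref_descs: np.ndarray) -> int:
    """Return the index of the reference view whose 64-dim descriptor
    is closest (L2 distance) to the query descriptor.
    query_desc: (64,), ref_descs: (N, 64)."""
    dists = np.linalg.norm(ref_descs - query_desc[None, :], axis=1)
    return int(np.argmin(dists))

# toy usage: 100 reference views with random 64-dim descriptors;
# the query is a slightly perturbed copy of view 42
rng = np.random.default_rng(0)
refs = rng.normal(size=(100, 64))
q = refs[42] + 0.01 * rng.normal(size=64)
print(retrieve_reference(q, refs))  # → 42
```

The retrieved view's pose then serves as the coarse initialization that iNeRF refines.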
- Anomaly detection and localization: bottom right box
- features from layer1 to layer5 are resized to the same size as the query image, then concatenated into feature maps fq and f̂
- ⇒ feature difference map d(u) = fq(u) - f̂(u)
- ⇒ score map s(u) = ||d(u)||_2, i.e. the L2 norm of the feature difference
- frozen ("locked") CNN: EfficientNet-B4
- u is the spatial position index (height and width combined, for simplicity)
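The score-map computation above is a per-location L2 norm over the concatenated feature channels; a minimal sketch (shapes and names are illustrative; the upsampling of each layer's features to image size is assumed done beforehand):

```python
import numpy as np

def score_map(fq: np.ndarray, fref: np.ndarray) -> np.ndarray:
    """fq, fref: (C, H, W) concatenated feature maps of the query and
    the retrieved/rendered reference image.
    Returns the (H, W) anomaly score map s(u) = ||fq(u) - fref(u)||_2."""
    d = fq - fref                          # feature difference map d(u)
    return np.sqrt((d ** 2).sum(axis=0))   # L2 norm over channels at each u

# toy usage: 8 channels, 4x4 spatial grid, one anomalous position
fq = np.zeros((8, 4, 4))
fref = np.zeros((8, 4, 4))
fref[:, 2, 3] = 1.0                        # features differ only at (2, 3)
s = score_map(fq, fref)
print(s[2, 3])                             # → 2.8284... (= sqrt(8))
```

Image-level anomaly scores can then be taken as, e.g., the maximum of s over all positions u.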
Experiments
Table 2: the compared methods are all outdated.
Table 3: Dense-view to sparse-view object anomaly detection results on MAD-Real
- PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection, NeurIPS 2023
- [55] iNeRF: Inverting Neural Radiance Fields for Pose Estimation, IROS 2021