https://github.com/EricLee0224/PAD

  1. an open-source benchmark library: provide 10 state-of-the-art methods from 8 distinct paradigms
  2. MAD (MAD-Sim and MAD-Real): built from 20 complex-shaped LEGO toys, with ~4K views covering various poses
    1. 11,000 images from different viewpoints covering a wide range of poses.
      1. #Normal: 5,231, #Abnormal: 4,902
    2. three types of anomalies, carefully selected to reflect both simulated and real-world environments.
    3. MAD-Real: only 40-50 views per LEGO toy ⇒ poor NeRF & GS (Gaussian Splatting) reconstruction

Method

  1. three components:
    1. build NeRF
    2. two-stage pose estimation process, from coarse to fine
      1. image retrieval by descriptor matching: query image I_q ⇒ 64-dim feature vector via an EfficientNet-B4 model [43]
        1. concatenation of features from layer1 to layer5 ⇒ 64-dim??? (how this reduces to 64 is unclear)
      2. pose refined using iNeRF [55] ⇒ reference image Î and its pose.
  2. Anomaly detection and localization: bottom right box
    1. features from layer1 to layer5 are resized to the same size as the query image, then concatenated into feature maps f_q and f̂.
    2. ⇒ feature difference map d(u) = f_q(u) − f̂(u)
    3. ⇒ score map s(u) = ||d(u)||_2, i.e. the L2 norm of the feature difference
    4. frozen ("locked") CNN: EfficientNet-B4
    5. u is the spatial position index (height and width flattened together for simplicity)
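The coarse retrieval step (2.1 above) amounts to nearest-neighbor search over the 64-dim descriptors. A minimal NumPy sketch, assuming cosine similarity as the matching metric (the function name and metric choice are my assumptions, not necessarily the paper's exact rule):

```python
import numpy as np

def retrieve_reference(query_desc: np.ndarray, ref_descs: np.ndarray) -> int:
    """Return the index of the reference view whose descriptor is closest
    to the query descriptor under cosine similarity.

    query_desc: (D,) descriptor of the query image (D = 64 in the notes).
    ref_descs:  (N, D) descriptors of the N pre-rendered reference views.
    """
    # Normalize so the dot product equals cosine similarity.
    q = query_desc / np.linalg.norm(query_desc)
    r = ref_descs / np.linalg.norm(ref_descs, axis=1, keepdims=True)
    return int(np.argmax(r @ q))
```

The retrieved index gives the coarse pose; iNeRF then refines it from this initialization.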
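The score-map computation (2.2–2.3 above) is just a channel-wise L2 norm of the feature difference. A minimal NumPy sketch, assuming the resized-and-concatenated features are stored as (H, W, C) arrays:

```python
import numpy as np

def score_map(f_q: np.ndarray, f_ref: np.ndarray) -> np.ndarray:
    """Per-pixel anomaly score s(u) = ||f_q(u) - f_ref(u)||_2.

    f_q, f_ref: (H, W, C) feature maps of the query and reference images,
    already resized to the query resolution. Returns an (H, W) score map.
    """
    d = f_q - f_ref                      # feature difference map d(u)
    return np.linalg.norm(d, axis=-1)    # L2 norm over channels
```

High values of the score map localize anomalies; a global image-level score can then be taken as, e.g., the maximum over u.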

Experiments

The methods compared in Table 2 are all outdated.

Table 3: Dense-view to sparse-view object anomaly detection results on MAD-Real

  1. PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection, NeurIPS 2023
  2. [55] iNeRF: Inverting Neural Radiance Fields for Pose Estimation, IROS 2021