1 Why Mem + AE

  1. Trained only on normal data, an AE is expected to produce a higher reconstruction error for abnormal inputs than for normal ones ⇒ anomaly detection.

  2. This assumption does not always hold in practice: an AE can generalize so well that it also reconstructs abnormal inputs with low error.


  3. ⇒ Memory-augmented AE
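The basic detection rule from point 1 can be sketched as follows (a minimal illustration, not the paper's code; the identity reconstruction is a stand-in for `decoder(encoder(x))`):

```python
import numpy as np

def anomaly_score(x, x_hat):
    """Per-sample anomaly score: squared reconstruction error ||x - x_hat||^2.
    Higher score => more likely abnormal (assumes the AE was trained on normal data)."""
    return float(np.sum((x - x_hat) ** 2))

# Toy illustration: suppose the AE reproduces the normal pattern.
x_normal = np.ones(4)
x_abnormal = np.array([1.0, 1.0, 5.0, 1.0])
x_hat = np.ones(4)  # stand-in for decoder(encoder(x))
# anomaly_score(x_normal, x_hat) < anomaly_score(x_abnormal, x_hat)
```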

2 How


3 Training

training only with normal samples


Memory sizes N for MNIST and CIFAR-10 are set to 100 and 500, respectively.

Video dataset UCSD-Ped2: N=500 is not enough; N=1000 to 3000 give the same accuracy.

3.1 Attention for Memory Addressing


Each weight w_i is defined by the similarity between the latent code z and the memory item m_i.
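The addressing step can be sketched as below (my reading of the paper: a softmax over cosine similarities d(z, m_i); the function names and toy shapes are illustrative):

```python
import numpy as np

def memory_addressing(z, M):
    """Soft addressing weights w over memory items.

    z: (C,) query latent vector from the encoder.
    M: (N, C) memory matrix, one item m_i per row.
    w_i = exp(d(z, m_i)) / sum_j exp(d(z, m_j)), with d = cosine similarity.
    """
    # cosine similarity between z and each memory item m_i
    sim = M @ z / (np.linalg.norm(M, axis=1) * np.linalg.norm(z) + 1e-12)
    # softmax turns similarities into addressing weights that sum to 1
    e = np.exp(sim - sim.max())
    return e / e.sum()

# The memory-based latent fed to the decoder is a convex combination of items.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 4))   # toy memory: N=5 items of dimension C=4
z = rng.standard_normal(4)
w = memory_addressing(z, M)
z_hat = w @ M                     # z_hat = w M
```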

3.2 Hard Shrinkage for Sparse Addressing

A complex combination of memory items via a dense w may still yield a well-reconstructed anomaly, so a hard shrinkage operation is applied to promote the sparsity of w.


λ is set to a value in the interval [1/N, 3/N].
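A sketch of the shrinkage, assuming the continuous ReLU-based relaxation of the thresholding used in the paper, followed by l1 re-normalization (the toy weight vector is mine):

```python
import numpy as np

def hard_shrink(w, lam, eps=1e-12):
    """Zero out addressing weights below lam, keep the rest, then re-normalize.

    h(w_i) = max(w_i - lam, 0) * w_i / (|w_i - lam| + eps)
    equals w_i when w_i > lam and 0 otherwise; dividing by the l1 norm
    makes the surviving sparse weights sum to 1 again.
    """
    h = np.maximum(w - lam, 0.0) * w / (np.abs(w - lam) + eps)
    return h / (np.abs(h).sum() + eps)

N = 5
w = np.array([0.50, 0.30, 0.15, 0.04, 0.01])  # toy dense addressing weights
w_hat = hard_shrink(w, lam=1.0 / N)           # lambda chosen in [1/N, 3/N]
# entries below lambda are zeroed; the survivors are re-normalized
```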

3.3 How to learn M