They can be seen as distribution-based methods: instead of fitting the features $y$ with a Gaussian distribution directly, they learn a normalizing flow $z=f_{NF}(y)$ whose output $z$ follows a Gaussian distribution.
DifferNet, WACV21: image-level 1D normalizing flow, inaccurate localization.
CFLOW-AD, WACV22: patch-level 1D normalizing flow, better localization.
CL-Flow, 23: uses multi-scale feature maps and improves on DifferNet.
FastFlow, 21: image-level 2D NF, end-to-end, faster, more economical.
AltUB: Alternating Training Method to Update Base Distribution of Normalizing Flow for Anomaly Detection, 22
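
As a rough illustration of how the mapping $z=f_{NF}(y)$ yields an anomaly score, the sketch below uses the simplest possible invertible map, an element-wise affine transform $z=(y-\mu)/\sigma$, and scores a feature by its negative log-likelihood from the change-of-variables formula. This is a stand-in, not the architecture of any method listed above (those stack learned coupling layers); the names `fit_affine_flow`, `log_likelihood`, and `anomaly_score` and the synthetic features are assumptions made for the example.

```python
import numpy as np

def fit_affine_flow(normal_feats):
    """Closed-form maximum-likelihood fit of z = (y - mu) / sigma on normal features."""
    mu = normal_feats.mean(axis=0)
    sigma = normal_feats.std(axis=0) + 1e-6  # small constant avoids division by zero
    return mu, sigma

def log_likelihood(y, mu, sigma):
    """log p(y) = log N(z; 0, I) + log|det dz/dy| for the affine map."""
    z = (y - mu) / sigma
    d = y.shape[-1]
    log_pz = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * d * np.log(2 * np.pi)
    log_det = -np.log(sigma).sum()  # Jacobian of the map is diag(1 / sigma)
    return log_pz + log_det

def anomaly_score(y, mu, sigma):
    """Higher score means less likely under the distribution of normal features."""
    return -log_likelihood(y, mu, sigma)

# Synthetic stand-in for backbone features (assumption: 8-dimensional vectors).
rng = np.random.default_rng(0)
normal_feats = rng.normal(loc=2.0, scale=0.5, size=(1000, 8))
mu, sigma = fit_affine_flow(normal_feats)

normal_test = rng.normal(loc=2.0, scale=0.5, size=(100, 8))
anomalous_test = rng.normal(loc=5.0, scale=0.5, size=(100, 8))
normal_scores = anomaly_score(normal_test, mu, sigma)
anomalous_scores = anomaly_score(anomalous_test, mu, sigma)
```

Applying the same score to a whole-image feature vector gives an image-level decision, while applying it per patch or per spatial position gives a score map that can be used for localization.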
Is it enough to just learn from normal samples?

Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection (CVPR2023)

Multi-scale features for better localization.
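
One common way to use several scales for localization is to compute an anomaly score map per feature scale, upsample each map to a shared resolution, and combine them. The sketch below is a minimal version of that idea, assuming square output maps, nearest-neighbour upsampling, and plain averaging; real implementations often use bilinear interpolation and may combine scales differently. The helper names `upsample_nearest` and `aggregate_multiscale` are hypothetical.

```python
import numpy as np

def upsample_nearest(score_map, out_size):
    """Nearest-neighbour upsampling of a 2D score map to (out_size, out_size)."""
    h, w = score_map.shape
    rows = np.arange(out_size) * h // out_size  # source row for each output row
    cols = np.arange(out_size) * w // out_size  # source column for each output column
    return score_map[np.ix_(rows, cols)]

def aggregate_multiscale(score_maps, out_size):
    """Average per-scale score maps after bringing them to a common resolution."""
    upsampled = [upsample_nearest(m, out_size) for m in score_maps]
    return np.mean(upsampled, axis=0)

# Toy maps: the coarse scale flags a large region, the fine scale a small one inside it.
coarse = np.zeros((2, 2))
coarse[0, 0] = 1.0
fine = np.zeros((4, 4))
fine[1, 1] = 1.0

combined = aggregate_multiscale([coarse, fine], out_size=8)
```

In this toy case the combined map is highest where both scales agree, so the fine scale sharpens the region that the coarse scale only roughly indicates.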