Not very interesting.


Towards segmenting any anomaly without training, we first construct a vanilla baseline (SAA) by cascading an anomaly region generator (e.g., a prompt-guided object detection foundation model [23]) and an anomaly region refiner (e.g., a segmentation foundation model [19]), both driven by a naive class-agnostic language prompt (e.g., “Anomaly”). However, SAA suffers from a severe false-alarm problem: it flags every “wick” rather than the ground-truth anomaly region (the “overlong wick”). We therefore strengthen the regularization with hybrid prompts in the revamped model (SAA+), which helps identify the anomaly regions correctly.

2 Related work

Language or Vision Foundation Models

Prompt Engineering

adapting foundation models for downstream tasks

  1. prompting with text and visual inputs
    1. cannot be employed in zero-shot anomaly segmentation (ZSAS), because they require training data.
    2. Idea: worth trying a paper based on synthetic anomalies.
  2. heuristic prompts [55] that do not require any training

3 SAA: Baseline Method

vanilla Foundation Model Assembly for ZSAS

  1. R, S := SAA(I, T)
    1. T is a naive class-agnostic language prompt, e.g., “anomaly”, utilized in SAA
    2. R^B, S := Generator(I, T)
      1. Anomaly Region Generator, Grounding DINO [23]
      2. the bounding-box-level region set R^B, and their corresponding confidence score set S
    3. R := Refiner(I, R^B)
      1. Anomaly Region Refiner, Segment Anything [19]
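
The two-stage cascade above can be sketched as plain function composition. This is a minimal illustration, not the paper's implementation: the `Generator` and `Refiner` interfaces, the toy stand-ins, and the list-based image and mask types are all assumptions made so the sketch runs without model weights. In practice the generator would wrap Grounding DINO [23] and the refiner would wrap Segment Anything [19].

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in pixels, x1/y1 exclusive
Mask = List[List[bool]]           # H x W binary mask
Image = List[List[float]]         # H x W grayscale stand-in for a real image

# Hypothetical interfaces: any detector / segmenter with these shapes fits.
Generator = Callable[[Image, str], Tuple[List[Box], List[float]]]
Refiner = Callable[[Image, List[Box]], List[Mask]]


def saa(image: Image, prompt: str, generator: Generator,
        refiner: Refiner) -> Tuple[List[Mask], List[float]]:
    """R, S := SAA(I, T): box proposals first, then pixel-level refinement."""
    boxes, scores = generator(image, prompt)   # R^B, S := Generator(I, T)
    masks = refiner(image, boxes)              # R := Refiner(I, R^B)
    return masks, scores


def toy_generator(image: Image, prompt: str) -> Tuple[List[Box], List[float]]:
    # Assumption: pretend the detector returned one box with score 0.9.
    return [(0, 0, 2, 2)], [0.9]


def toy_refiner(image: Image, boxes: List[Box]) -> List[Mask]:
    # Assumption: the "refined" mask is simply the filled box.
    h, w = len(image), len(image[0])
    return [[[x0 <= x < x1 and y0 <= y < y1 for x in range(w)]
             for y in range(h)]
            for (x0, y0, x1, y1) in boxes]


image = [[0.0] * 4 for _ in range(3)]          # 3 x 4 dummy image
masks, scores = saa(image, "anomaly", toy_generator, toy_refiner)
```

Keeping the generator and refiner as injected callables means the same `saa` wrapper could be reused when the naive prompt is later replaced by richer prompts.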

4 SAA+


Foundation Model Adaption via Hybrid Prompt Regularization

hybrid prompts (P^L, P^P, P^S, and P^C)

  1. Prompts Generated from Domain Expert Knowledge
    1. P^L = {T_a, T_s}: language prompts
      1. Class-agnostic prompts (T_a), e.g., “anomaly” and “defect”.
      2. Class-specific prompts (T_s), e.g., “black hole” and “white bubble”.
    2. P^P = {θ_area, θ_IoU}: property prompts
  2. Prompts Derived from Target Image Context
    1. P^S: Anomaly Saliency as Prompt
    2. P^C: Anomaly Confidence as Prompt
      1. select the K candidates with the highest confidence scores for the given image and use the average of their scores for the final anomaly region detection
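
The property and confidence prompts above can be illustrated with a small pure-Python sketch. It rests on assumptions that these notes do not confirm: θ_area is treated as an upper bound on a candidate's area relative to the inspected object, θ_IoU as an upper bound on the candidate's IoU with the object mask, and the top-K averaging is done per pixel over the candidates that cover it. All function names are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[bool]]           # H x W binary mask
Candidate = Tuple[Mask, float]    # (region mask, confidence score)


def mask_area(mask: Mask) -> int:
    return sum(cell for row in mask for cell in row)


def iou(a: Mask, b: Mask) -> float:
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    inter = sum(x and y for x, y in pairs)
    union = sum(x or y for x, y in pairs)
    return inter / union if union else 0.0


def filter_by_property(cands: List[Candidate], object_mask: Mask,
                       theta_area: float, theta_iou: float) -> List[Candidate]:
    """Assumed use of P^P: drop candidates that are too large or object-sized."""
    obj_area = mask_area(object_mask)
    kept = []
    for mask, score in cands:
        rel_area = mask_area(mask) / obj_area if obj_area else 0.0
        if rel_area <= theta_area and iou(mask, object_mask) <= theta_iou:
            kept.append((mask, score))
    return kept


def top_k_average(cands: List[Candidate], k: int,
                  height: int, width: int) -> List[List[float]]:
    """Assumed use of P^C: per-pixel mean score over the top-k candidates."""
    top = sorted(cands, key=lambda c: c[1], reverse=True)[:k]
    score_map = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            covering = [s for m, s in top if m[y][x]]
            if covering:
                score_map[y][x] = sum(covering) / len(covering)
    return score_map


def region(pixels, height=2, width=4) -> Mask:
    # Helper for the toy example: mask with the given (y, x) pixels set.
    return [[(y, x) in pixels for x in range(width)] for y in range(height)]


object_mask = region({(y, x) for y in range(2) for x in range(4)})
candidates = [
    (object_mask, 0.9),               # whole object, e.g. every "wick"
    (region({(0, 0), (0, 1)}), 0.8),
    (region({(0, 1)}), 0.4),
    (region({(1, 3)}), 0.1),
]
kept = filter_by_property(candidates, object_mask, theta_area=0.5, theta_iou=0.5)
score_map = top_k_average(kept, k=2, height=2, width=4)
```

In this toy run the object-sized candidate is rejected by the property thresholds despite its high score, and the lowest-scoring survivor is dropped by the top-K step, which mirrors the false-alarm suppression that SAA+ aims for.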