Week 6, Monday

1 Intro

1.1 Features of CNNs


1.2 Features of Transformers

  1. CNN:
    1. feature = spatial filter (can be visualized)
    2. feature = "what pattern am I looking for" (pattern-centric)
  2. Transformer:
    1. feature = each token's embedding (a vector)
    2. feature = "how do I aggregate information" (relation-centric)
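A minimal NumPy sketch of the contrast above, under stated assumptions: the edge filter is hand-written rather than learned, the tokens and projection matrices are random stand-ins, and the names `conv2d_valid` and `self_attention` are made up for these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# CNN view: a feature is a spatial filter slid over the image.
def conv2d_valid(image, kernel):
    """2D cross-correlation without padding (what deep-learning 'conv' computes)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                               # dark left half, bright right half
edge_filter = np.array([[-1.0, 0.0, 1.0]] * 3)   # hand-written vertical-edge detector
response = conv2d_valid(image, edge_filter)      # nonzero only where the edge pattern is

# Transformer view: a feature is a token embedding, updated by aggregating other tokens.
def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (num_tokens, dim) inputs."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

tokens = rng.normal(size=(4, 8))                 # 4 tokens, 8-dim embeddings (random)
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
mixed, weights = self_attention(tokens, w_q, w_k, w_v)

print(response[0])   # filter response along one row
print(weights[0])    # how token 0 distributes its attention over all tokens
```

The filter itself is the thing to look at in the CNN case (it is an image-shaped pattern), while in the attention case the interesting object is `weights`, which says which tokens each token pulls information from.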

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR 2021
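The "16x16 words" in the title refers to cutting the image into 16x16-pixel patches and treating each as a token. A rough sketch of that step, assuming a 224x224 RGB input and a small, arbitrary embedding width (`embed_dim = 64` is a placeholder for illustration); the class token and position embeddings are left out, and `patchify` is a name invented here.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)   # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))                # random stand-in for a real image
patches = patchify(image)                        # (196, 768): 14x14 patches of 16*16*3 values

embed_dim = 64                                   # assumption: illustrative width only
w_embed = rng.normal(size=(patches.shape[1], embed_dim)) * 0.02
tokens = patches @ w_embed                       # linear patch embedding -> (196, 64)

print(patches.shape, tokens.shape)
```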


Do Vision Transformers See Like Convolutional Neural Networks?, NeurIPS 2021

1.3 Dense Features

See Figs. 3 and 4 of DINOv3: https://arxiv.org/pdf/2508.10104
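Dense features here means one vector per patch rather than a single global embedding. One common way to inspect them is a cosine-similarity map between a reference patch and every other patch. The sketch below assumes a 14x14 patch grid and uses random vectors in place of real backbone outputs; it does not reproduce the paper's figures, and `cosine_similarity_map` is a helper named for these notes.

```python
import numpy as np

def cosine_similarity_map(patch_features, grid_hw, ref_yx):
    """Cosine similarity between one reference patch and all patches, as a 2D map."""
    gh, gw = grid_hw
    norms = np.linalg.norm(patch_features, axis=-1, keepdims=True)
    feats = patch_features / norms               # unit-length feature per patch
    ref = feats[ref_yx[0] * gw + ref_yx[1]]      # row-major index into the token list
    return (feats @ ref).reshape(gh, gw)

rng = np.random.default_rng(0)
gh, gw, dim = 14, 14, 32                         # assumption: 14x14 grid, 32-dim features
patch_features = rng.normal(size=(gh * gw, dim)) # random stand-in for backbone outputs
sim = cosine_similarity_map(patch_features, (gh, gw), ref_yx=(7, 7))

print(sim.shape)     # one similarity value per patch
print(sim[7, 7])     # the reference patch compared with itself
```

With real features, a clean map (high similarity on the same object, low elsewhere) is what "good dense features" usually looks like; with the random stand-ins above the map is just noise around zero.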

1.4 Feature Upsampling

https://paperswithcode.com/sota/feature-upsampling-on-imagenet?p=featup-a-model-agnostic-framework-for

Watch the FeatUp video (ICLR 2024)

FeatUp: A Model-Agnostic Framework for Features at Any Resolution, ICLR 2024
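For reference, the naive baseline that learned upsamplers are compared against is plain bilinear interpolation of the low-resolution feature map. The sketch below is that baseline only, not FeatUp's method; it assumes a 14x14 patch grid upsampled by 16 back to 224x224, with random features, and `bilinear_upsample` is a helper written for these notes.

```python
import numpy as np

def bilinear_upsample(feats, scale):
    """Upsample an (H, W, C) feature map by an integer factor, bilinearly."""
    h, w, _ = feats.shape
    out_h, out_w = h * scale, w * scale
    # Map output pixel centers back to input coordinates, clamped at the borders.
    ys = np.clip((np.arange(out_h) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) / scale - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]                # vertical blend weights
    wx = (xs - x0)[None, :, None]                # horizontal blend weights
    top = feats[y0][:, x0] * (1 - wx) + feats[y0][:, x1] * wx
    bottom = feats[y1][:, x0] * (1 - wx) + feats[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

rng = np.random.default_rng(0)
low_res = rng.normal(size=(14, 14, 8))           # random stand-in for patch features
high_res = bilinear_upsample(low_res, scale=16)  # back to input resolution

print(low_res.shape, "->", high_res.shape)
```

Bilinear interpolation only blends neighboring patch vectors, so it cannot recover detail finer than one patch; that blurriness is the gap feature-upsampling methods aim to close.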