Week 6
1500-3000 basic-level nouns, ~10 types per basic-level category

Alternative explanation (Perona): ~1000 names per domain (broad scene category), 20-30 domains
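
Multiplying out the two estimates above gives totals of the same order of magnitude. A quick check, treating the quoted figures as rough lower and upper bounds (the helper name `category_range` is made up for this illustration):

```python
def category_range(low_groups, high_groups, low_per_group, high_per_group):
    """Return (low, high) totals for a 'groups x items per group' estimate."""
    return low_groups * low_per_group, high_groups * high_per_group

# 1500-3000 basic-level nouns, ~10 types per basic-level category
basic_level = category_range(1500, 3000, 10, 10)

# 20-30 domains, ~1000 names per domain
by_domain = category_range(20, 30, 1000, 1000)

print(basic_level)  # (15000, 30000)
print(by_domain)    # (20000, 30000)
```

Both counts land in the tens of thousands of categories.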

Scene categorization or classification
Image annotation / tagging / attributes
street, people, building, mountain, tourism, cloudy, brick, …

Scene understanding?
Story: caption -> short description
Image parsing / semantic segmentation
Object detection


Category:
–Find all the people
–Find all the buildings
–Often within a single image
–Often approached with a ‘sliding window’ search
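
A minimal sketch of the sliding-window idea for category-level detection: score every fixed-size window in the image and keep the ones above a threshold. The window size, stride, threshold, and the use of mean brightness as the scorer are all assumptions for illustration; in practice `score_fn` would be a trained classifier, and the search is usually repeated over several scales.

```python
import numpy as np

def sliding_windows(image, window=(64, 64), stride=32):
    """Yield (row, col, patch) for every window that fits fully inside the image."""
    win_h, win_w = window
    img_h, img_w = image.shape[:2]
    for r in range(0, img_h - win_h + 1, stride):
        for c in range(0, img_w - win_w + 1, stride):
            yield r, c, image[r:r + win_h, c:c + win_w]

def detect(image, score_fn, threshold, window=(64, 64), stride=32):
    """Return (row, col, height, width, score) for windows scoring above threshold."""
    hits = []
    for r, c, patch in sliding_windows(image, window, stride):
        score = float(score_fn(patch))
        if score > threshold:
            hits.append((r, c, window[0], window[1], score))
    return hits

# Toy image: a bright 64x64 square on a dark background.
image = np.zeros((128, 128))
image[32:96, 32:96] = 1.0

# Placeholder scorer (mean brightness) standing in for a real classifier.
hits = detect(image, score_fn=np.mean, threshold=0.9)
print(hits)
```

Only the window that exactly covers the bright square passes the threshold here; the neighbouring windows overlap it only partially and score lower.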

Instance:
–Is this face James?
–Find this specific famous building
–Often within a database of images

Variability:

High-dimensional space

Alignment: fitting a model to a transformation between pairs of features (matches) in two images
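
As an illustration of alignment, the sketch below fits a 2D affine transformation to matched point pairs by least squares. The choice of an affine model and the function names `fit_affine` and `apply_affine` are assumptions made for this example; the notes do not specify a transformation family. Real matches usually contain outliers, so a plain least-squares fit like this one is typically wrapped in a robust procedure such as RANSAC.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst points.

    src, dst: (N, 2) arrays of matched coordinates, N >= 3 and not all collinear.
    Returns a 2x3 matrix M with dst approximately equal to src @ M[:, :2].T + M[:, 2].
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: each row is [x, y, 1].
    A = np.hstack([src, np.ones((len(src), 1))])
    # Solve A @ X = dst in the least-squares sense; X has shape (3, 2).
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return X.T

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Synthetic matches generated from a known transform (illustrative values).
true_M = np.array([[0.9, -0.2, 5.0],
                   [0.2,  0.9, -3.0]])
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [7.0, 4.0]])
dst = apply_affine(true_M, src)

M = fit_affine(src, dst)
print(np.round(M, 3))
```

With noise-free matches the recovered matrix equals the one used to generate the data, up to floating-point error.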

Representing and recognizing object categories is harder...
