1 Introduction

Week 6.

1.1 How many visual object categories are there?

Roughly 1,500-3,000 basic-level nouns, with ~10 types per basic-level category

Alternative estimate (Perona): ~1,000 names per domain (a domain being a broad scene category), across 20-30 domains
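The two estimates can be compared with a quick calculation. This is only a sketch: it assumes the figures above are meant to be multiplied (nouns times types, names times domains), and the variable names are made up for illustration.

```python
# Estimate 1: basic-level nouns times types per basic-level category.
basic_level_nouns = (1500, 3000)   # low and high end of the range
types_per_category = 10
estimate_1 = tuple(n * types_per_category for n in basic_level_nouns)

# Estimate 2 (Perona): names per domain times number of domains.
names_per_domain = 1000
domains = (20, 30)                 # low and high end of the range
estimate_2 = tuple(names_per_domain * d for d in domains)

print(estimate_1)  # (15000, 30000)
print(estimate_2)  # (20000, 30000)
```

Under that assumption both counts land in the tens of thousands of categories.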

1.2 Specific recognition tasks

  1. Scene categorization or classification

    1. outdoor/indoor
    2. city/forest/factory/etc.
  2. Image annotation / tagging / attributes

    1. street, people, building, mountain, tourism, cloudy, brick, …

  3. Scene understanding?

    Story: caption -> short description

  4. Image parsing / semantic segmentation

  5. Object detection

1.3 Category vs. instance recognition

Category:

– Find all the people
– Find all the buildings
– Often within a single image
– Often uses a ‘sliding window’ search
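A minimal sketch of the sliding-window idea mentioned for category recognition: score every fixed-size window in the image and keep the ones above a threshold. The function names, the toy image, and the `mean_brightness` scorer are all illustrative assumptions. A real detector would plug a trained classifier in as `score_fn` and usually scan several scales as well.

```python
def sliding_windows(height, width, win_h, win_w, stride):
    """Yield the (top, left) corner of every window that fits in the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left


def detect(image, win_h, win_w, stride, score_fn, threshold):
    """Return (top, left, score) for every window scoring at least threshold."""
    height, width = len(image), len(image[0])
    detections = []
    for top, left in sliding_windows(height, width, win_h, win_w, stride):
        # Crop the window out of the image (a list of pixel rows).
        patch = [row[left:left + win_w] for row in image[top:top + win_h]]
        score = score_fn(patch)
        if score >= threshold:
            detections.append((top, left, score))
    return detections


def mean_brightness(patch):
    """Hypothetical stand-in for a classifier: average pixel value."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0]))


# Toy 4x4 image with a bright 2x2 "object" in the bottom-right corner.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
hits = detect(image, win_h=2, win_w=2, stride=1,
              score_fn=mean_brightness, threshold=0.9)
print(hits)  # [(2, 2, 1.0)]
```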

Instance:

– Is this James's face?
– Find this specific famous building
– Often within a database of images
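One way to picture instance recognition against a database is nearest-neighbour lookup over feature descriptors. The sketch below assumes each image has already been reduced to a short numeric descriptor. The descriptors, labels, and threshold are invented for illustration, and the notes do not say which matching method is intended.

```python
import math


def nearest_instance(query, database):
    """Return (label, distance) of the database descriptor closest to query."""
    if not database:
        raise ValueError("database is empty")
    label = min(database, key=lambda name: math.dist(query, database[name]))
    return label, math.dist(query, database[label])


def is_instance(query, reference, threshold):
    """Verification: accept if the descriptors are closer than threshold."""
    return math.dist(query, reference) < threshold


# Hypothetical descriptors; real ones would come from a feature extractor.
database = {
    "james": [0.9, 0.1, 0.3],
    "maria": [0.1, 0.8, 0.5],
}
query = [0.8, 0.2, 0.3]

label, distance = nearest_instance(query, database)
print(label)                                                 # james
print(is_instance(query, database["james"], threshold=0.2))  # True
```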

1.4 Recognition is all about modeling variability

Variability:

  1. Camera position
  2. Illumination
  3. Pose/shape parameters
  4. Within-class variations?

[Figure: high-dimensional space]

2 History of ideas in recognition

2.1 1960s – early 1990s: the geometric era

2.1.1 Shape is known

  1. Recognition as an alignment problem:
    1. Alignment: fitting a transformation model to pairs of matched features (matches) in two images

    2. Representing and recognizing object categories is harder...

      1. ACRONYM (Brooks & Binford, 1981), Binford (1971), Nevatia & Binford (1972), Marr & Nishihara (1978)
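To make the alignment idea concrete, here is a minimal sketch that fits a transformation to matched feature pairs. It assumes a 2D similarity transform (scale, rotation, translation) and a plain least-squares fit. The notes do not say which transformation family or solver is meant, and the function names are illustrative. Points are treated as complex numbers so that the model is q ≈ a·p + b, with a encoding scale and rotation and b encoding translation.

```python
def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst.

    src and dst are equal-length lists of matched (x, y) points.
    Returns complex (a, b) such that dst ≈ a * src + b.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    # Closed-form least-squares solution on the centred points.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("source points are all identical")
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point through the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Matches generated by: rotate 90 degrees, scale by 2, translate by (1, 1).
src = [(0, 0), (1, 0), (0, 1)]
dst = [(1, 1), (1, 3), (-1, 1)]
a, b = fit_similarity(src, dst)
print(abs(a))                          # recovered scale, close to 2
print(apply_similarity(a, b, (1, 1)))  # close to (-1, 3)
```

With noisy or partly wrong matches, a fit like this would typically be wrapped in a robust scheme such as RANSAC.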

2.1.2 General shape primitives?