Visual Recognition and Search

EECS 6890 Topics in Information Processing (3 credits)

Spring 2013, Columbia University | Thursdays 7:00-8:50pm, 327 Seeley W. Mudd

Instructors: Rogerio Feris and Liangliang Cao {rsferis, liangliang.cao}@us.ibm.com

Course Overview


Visual data is exploding! 500 billion consumer photos are taken each year worldwide, and 633 million per year in NYC alone. Every minute, 120 hours of new video are uploaded to YouTube. Billions of camera-equipped cell phones exist today, and video is forecast to account for 66 percent of global mobile data traffic by 2014. The United Kingdom has installed more than 4 million security cameras over the past decade. In the era of ‘big data’, major opportunities arise for systems capable of automatically analyzing, searching, and classifying content from images and videos.

The goals of this course are to understand state-of-the-art computer vision approaches to visual recognition and search, to actively analyze their strengths and weaknesses, and to identify interesting open questions and possible directions for future research. The course will consist of lectures given by the instructors and paper presentations by students. Students will also implement cutting-edge techniques for visual recognition and search as part of their term projects.

We will cover the following topics:

  • Low-level feature descriptors, feature coding, and pooling
  • Part-based and attribute-based representations
  • Hierarchical models (e.g., convolutional neural networks)
  • Large-scale image classification and retrieval
  • Efficient object detection
  • Extensions to 3D and video
  • Applications

See the course schedule for details.


Prerequisites


Background in Computer Vision or Digital Image Processing is required. Programming skills in Matlab or C/C++ are also required. This is an advanced computer vision course for graduate students only.


Grading


  • Class Participation (10%)
  • Paper Presentations (30%)
  • Term Project (60%)
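
As an illustration of how the weights above combine, here is a minimal sketch of a weighted final grade. It assumes each component is scored on a 0-100 scale; the names `WEIGHTS` and `final_grade` are hypothetical and not part of the course materials.

```python
# Component weights taken from the grading breakdown above.
WEIGHTS = {
    "participation": 0.10,
    "presentations": 0.30,
    "project": 0.60,
}


def final_grade(scores: dict) -> float:
    """Combine per-component scores (assumed 0-100) into a weighted total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


example = {"participation": 90.0, "presentations": 80.0, "project": 85.0}
print(round(final_grade(example), 2))
```

Because the term project carries 60% of the weight, a 10-point change in the project score moves the total by 6 points, while the same change in participation moves it by only 1 point.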


Projects


Projects may be done in teams of two or three, depending on the total number of students enrolled in the course. The instructors will provide a list of project ideas with well-defined milestones and programming assignments. Students may also propose their own project ideas, as long as a clear plan is provided and approved by the instructors. Each team must write a paper (4–8 pages) as a final technical report. The contribution of each team member must be clearly specified in the project presentation. The instructors will provide links to benchmark datasets and to publicly available state-of-the-art implementations (source code), which can serve as a basis for projects. Check the project page for more information.


Late Policy


Late assignments will be penalized according to the following schedule:

Time late        Score multiplier
0-24 hours       0.9
24-72 hours      0.7
72-168 hours     0.5
>1 week          0.0
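
The schedule above can be read as a simple lookup. The sketch below makes three assumptions that the page does not state: the listed values are multipliers applied to the earned score, on-time work keeps its full score, and each range includes its upper bound (so exactly 24 hours late still receives 0.9). The function names are hypothetical.

```python
def late_multiplier(hours_late: float) -> float:
    """Return the score multiplier for a submission that is hours_late late."""
    if hours_late <= 0:
        return 1.0  # assumption: on-time work is not penalized
    # (upper bound in hours, multiplier), copied from the schedule above
    schedule = [(24, 0.9), (72, 0.7), (168, 0.5)]
    for upper_bound, multiplier in schedule:
        if hours_late <= upper_bound:  # assumption: upper bounds are inclusive
            return multiplier
    return 0.0  # more than one week (168 hours) late


def penalized_score(raw_score: float, hours_late: float) -> float:
    """Apply the late multiplier to a raw score."""
    return raw_score * late_multiplier(hours_late)


for hours in (0, 12, 48, 100, 200):
    print(hours, late_multiplier(hours))
```

Under these assumptions, an assignment that earned 80 points and arrived two days late would be recorded as 80 × 0.7 = 56 points.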