cv2026

Focus on the real problems

embodied intelligence: https://www.vincentsitzmann.com/blog/bitter_lesson_of_cv/
1. In the past, learning perception-action loops directly was intractable.
2. Computer vision: maps images to intermediate representations for perception.
3. Robot learning and control: ingest these specific representations (point clouds, bounding boxes, and masks) and map them to actions.
4. This factorization was a necessary compromise for the time.
5. The historical boundaries between computer vision, robot learning, and control will dissolve. Frontier research will no longer draw a boundary between “seeing” and “learning to act.”
  1. Tesla;
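The factorization in items 2 and 3, and the end-to-end alternative in item 5, can be sketched as a toy contrast. This is only an illustrative sketch, not code from the linked post: the names `perceive`, `control`, `factorized_policy`, and `end_to_end_policy`, the threshold, and the linear "learned" map are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Image = List[List[float]]      # toy grayscale image: rows of pixel intensities
Action = Tuple[float, float]   # toy action: (steer in [-1, 1], throttle)


@dataclass
class BoundingBox:
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def perceive(image: Image, threshold: float = 0.5) -> Optional[BoundingBox]:
    """Toy 'computer vision' stage: box around all pixels brighter than threshold."""
    coords = [(r, c) for r, row in enumerate(image)
              for c, v in enumerate(row) if v > threshold]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return BoundingBox(min(cols), min(rows), max(cols), max(rows))


def control(box: Optional[BoundingBox], image_width: int) -> Action:
    """Toy control stage: steers toward the box centre; it never sees pixels."""
    half = (image_width - 1) / 2
    if box is None or half == 0:
        return (0.0, 0.0)
    centre = (box.x_min + box.x_max) / 2
    return ((centre - half) / half, 1.0)


def factorized_policy(image: Image) -> Action:
    """Items 2 and 3: image -> intermediate representation -> action."""
    return control(perceive(image), len(image[0]))


def end_to_end_policy(image: Image, weights: List[float]) -> Action:
    """Item 5: pixels -> action directly. A linear map stands in for a learned model."""
    flat = [v for row in image for v in row]
    steer = sum(w * v for w, v in zip(weights, flat))
    return (max(-1.0, min(1.0, steer)), 1.0)


image = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0]]
weights = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # hypothetical "trained" weights
print(factorized_policy(image))
print(end_to_end_policy(image, weights))
```

In the factorized version, anything the bounding box discards is invisible to the controller; the end-to-end version has no such hand-chosen bottleneck, which is the point of item 5.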
human-machine interface
1. What about engineering tasks such as architecture, CAD, or manufacturing? Surely, we need explicit 3D representations to build a house or 3D-print an engine part.
2. Perception-only tasks: digital cultural relics, digital watermarking, …
3. In the very long run, these should also be done end-to-end.

https://hanlab.mit.edu/courses/2023-fall-65940 A good course.

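This study has four authors, one of whom is Pieter Abbeel, a leading deep reinforcement learning researcher and professor at UC Berkeley. In his spare time, Abbeel has also produced many courses: his Intro to AI course attracted more than 100,000 students on edX, and his teaching materials on deep reinforcement learning and deep unsupervised learning are classic study resources for AI researchers, including CS294-158 (Deep Unsupervised Learning), CS188 (Introduction to Artificial Intelligence), and CS287 (Advanced Robotics).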

Research report topics - 26

01_basics

02_features and matching

03_machine_Learning

04_Recognition_Detection

05_motion & tracking

06_Sensors

07 Segmentation

08 Object Detection

09 Anomaly detection & FOD