Call for papers

Previous workshops: 2018, 2019, 2020, 2021, 2022, 2023

We're inviting submissions! If you're interested in presenting a poster or giving a talk, please submit a short paper (or extended abstract) to CMT by April 11 at 11:59 PM PST, using this LaTeX template. We will notify authors of paper decisions by April 30. We encourage submissions of work that has already been accepted at other venues, as well as new, work-in-progress submissions. Papers must be at most 4 pages, including references (a 1- or 2-page extended abstract is also fine), and submissions should be anonymous. Accepted papers will appear on this site. Since papers in this workshop are at most 4 pages long, they do not count as prior publication and can also be submitted to next year's CVPR.

We are looking for work that involves both vision and sound. For example, the following topics would be in scope:
  • Audio-visual self-supervised learning
  • Embodied audio-visual learning
  • Intuitive physics with sound
  • Audio-visual scene understanding
  • Sound-from-vision and vision-from-sound
  • Semi-supervised learning
  • Audio-visual navigation
  • Video-to-music alignment
  • Video editing and movie trailer generation
  • Material recognition
  • Vision-inspired audio convolutional networks
  • Sound localization
  • Audio-visual speech processing
  • Multimodal architectures

Organizers


Andrew Owens
University of Michigan

Jiajun Wu
Stanford

Arsha Nagrani
Google

Triantafyllos Afouras
Oxford

Ruohan Gao
Stanford

Hang Zhao
Tsinghua

Ziyang Chen
University of Michigan


William Freeman
MIT/Google

Andrew Zisserman
Oxford

Kristen Grauman
UT Austin / Meta

Antonio Torralba
MIT

Jean-Charles Bazin
Meta