Monet: Reasoning in Latent Visual Space Beyond Images and Language

Published in CVPR 2026, 2025

Code: GitHub

Recommended citation: Qixun Wang, Yang Shi, Yifei Wang, Yuanxing Zhang, Pengfei Wan, Kun Gai, Xianghua Ying, Yisen Wang. (2025). "Monet: Reasoning in Latent Visual Space Beyond Images and Language." CVPR 2026.