Qixun Wang 王启迅

I am a third-year Ph.D. student in the School of Intelligence Science and Technology at Peking University.

My research interests include:

Multimodal Large Language Models (MLLMs), including latent visual reasoning, agentic visual reasoning, and evaluation of MLLMs.
Out-of-distribution (OOD) generalization, including theoretical analysis and algorithm design across computer vision, graph data, and the generalization behavior of LLMs.

News

July, 2026	Our paper Beacon, which improves reasoning-mode adaptiveness and achieves genuine tool-induced performance gains in agentic visual reasoning, was released on arXiv.
May, 2026	Our paper Artifact-Bench, a benchmark that evaluates MLLMs on detecting and assessing artifacts in AI-generated videos, was released on arXiv.
May, 2026	Our paper Semantic-Enriched Latent Visual Reasoning was accepted to ICML 2026.
March, 2026	I joined the Kling Team at Kuaishou Technology as a research intern.
February, 2026	Two papers were accepted to CVPR 2026, covering reasoning in latent visual space and a benchmark for unified multimodal models.

Selected papers (see full publication list)

arXiv
Beacon: Knowing When and Why to Perform Agentic Visual Reasoning
Qixun Wang*, Yang Shi*, Letian Cheng, Zhuoran Zhang, Yan He, Yuqi Tang, Qi Zhang, Xinlei Yu, Ruizhe Chen, Tianrun Xu, Yuanxing Zhang, Pengfei Wan, Haotian Wang, and Xianghua Ying
arXiv preprint, 2026
• Conduct a comprehensive analysis of reasoning-mode adaptiveness and tool-induced performance changes in existing agentic visual reasoning models.
• Propose a novel training recipe that achieves state-of-the-art or competitive performance across 13 visual reasoning benchmarks, while improving reasoning-mode adaptiveness and delivering genuine tool-induced performance gains.
PDF Code
CVPR
Monet: Reasoning in Latent Visual Space Beyond Images and Language
Qixun Wang, Yang Shi, Yifei Wang, Yuanxing Zhang, Pengfei Wan, Kun Gai, Xianghua Ying, and Yisen Wang
CVPR, 2026
• Propose a new framework for multimodal latent reasoning, covering dataset construction, SFT, and RL algorithms, that achieves significant improvements on both in-domain and OOD visual reasoning benchmarks.
• 200+ GitHub stars
PDF Code
ICLR
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Qixun Wang, Yifei Wang, Xianghua Ying, and Yisen Wang
ICLR, 2025
• Reveal the capability boundary and algorithm selection mechanism of ICL on OOD tasks through carefully designed experiments and theoretical analysis.
PDF Code
NeurIPS
Dissecting the Failure of Invariant Learning on Graphs
Qixun Wang, Yifei Wang, Yisen Wang, and Xianghua Ying
NeurIPS, 2024
• Theoretically and empirically demonstrate the failure modes of classic invariant learning approaches on graph data, and propose a new training objective with significant performance gains and theoretical guarantees.
PDF Code
NeurIPS Spotlight
Improving Out-of-distribution Generalization by Adversarial Training with Structured Priors
Qixun Wang*, Yifei Wang*, Hong Zhu, and Yisen Wang
NeurIPS, 2022
• Propose a simple yet effective low-rank adversarial training strategy that improves the OOD generalization of visual recognition models.
PDF Code
ICML
Semantic-Enriched Latent Visual Reasoning
Tianrun Xu, Yue Sun, Qixun Wang, Jingyi Lu, Yuan Wang, Tianren Zhang, Longteng Guo, Fengyun Rao, Jing Lyu, Feng Chen, and Jing Liu
ICML, 2026
• Reveal and address the lack of semantic richness in latent embeddings learned by prior latent visual reasoning training paradigms.
PDF Code
arXiv
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
Yuqi Tang*, Yang Shi*, Zhuoran Zhang*, Qixun Wang*, and a group of outstanding researchers
arXiv preprint, 2026
• Propose a comprehensive benchmark for evaluating MLLMs’ ability to detect and analyze artifacts in AI-generated videos.
PDF Code

Experience

Kling Team, Kuaishou Technology (快手科技) — Research Intern (March 2026–present)
Working on multimodal agents.

Education

Ph.D. Candidate in Machine Learning and Computer Vision, School of Intelligence Science and Technology, Peking University (2023–present)
B.S. in Intelligence Science and Technology, EECS, Peking University (2019–2023)

Awards

Third-Class Scholarship of Peking University (2025)
Merit Student at Peking University (2025)
Outstanding Graduate of Peking University (2023)
Yanchuang Capital Scholarship, Top 6% (2022)
Merit Student at Peking University, Top 6% (2022)
Academic Innovation Award at Peking University, Top 1% (2022)
Award for Academic Excellence (2021)