Hi there! Welcome to my homepage!

Hi! I am a junior (3rd-year) undergraduate student at Xidian University.

My research interests include MLLM, RL, Efficient AI and other emerging areas in AI. I enjoy exploring diverse research directions and collaborating with researchers across different fields.

Feel free to reach out if you are interested in collaboration or potential opportunities.

News

  • 2026.05 🎉🎉 I started my internship at Alibaba Group.
  • 2026.04 🎉🎉 Two papers accepted to ACL 2026 (not yet publicly available on arXiv due to visa-related clearance).
  • 2026.02 🎉🎉 One paper accepted to CVPR 2026 (not yet publicly available on arXiv due to visa-related clearance).
  • 2026.01 🎉🎉 I started my internship at TeleAI.
  • 2025.08 🎉🎉 One paper accepted to PRCV 2025.
  • 2025.07 🎉🎉 One paper accepted to ICCV 2025.
  • 2024.10 🎉🎉 I joined the HPC-AI Lab at NUS as a Research Assistant.

Experience

Alibaba Group
2026.05 - Present
MLLM Engineer Intern advised by Liang Ding and Xintong Wang
Institute of Artificial Intelligence of China Telecom
2026.01 - 2026.03
Started my journey in LLMs
National University of Singapore
2024.10 - 2025.10
Research Assistant at HPC-AI Lab advised by Yang You, Wangbo Zhao and Pengfei Zhou
Xidian University
2023.09 - Present
B.E. at the School of Telecommunication Engineering (rank 5/99); Research Assistant advised by Xiumei Wang

Publications

(* equal contribution · † corresponding author · ‡ project leader)

GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs
Weidong Tang, Jierui Li, Yueling Hou, Zihan Mei, Zhigang Tian, Weicheng Jiao, Can Zhang, Xinyan Wan, Zhiyuan Liang, Pengfei Zhou†, Yang You, Wangbo Zhao†
We propose GroupToM-Bench and show that current models fail at nonlinear group reasoning despite strong individual-level ToM, exposing a clear group cognitive gap.
ACL 2026 Oral   [arXiv] [code]
MagicBench: Diagnosing Visual Agency Loss and Semantic Dependency in Multimodal LLMs
Tang Da Huang, Weidong Tang‡, Wen Qi Xu, Xianpeng Guo†
We introduce MagicBench, a video dataset of magic tricks, to test whether multimodal LLMs actually rely on visual physics or depend too heavily on language narratives, which in magic tricks are deliberately deceptive.
ACL 2026   [arXiv] [code]
Efficient Video Object Segmentation and Tracking with Recurrent Dynamic Submodel
Weidong Tang, Zhiyuan Liang, Xinyan Wan, Chen Zhu, Zhaopan Xu, Pengfei Zhou†, Yan Song, Yang You, Wangbo Zhao†
We propose a Recurrent Dynamic Submodel for efficient video object segmentation and tracking. By integrating temporal-prior-guided global dynamic routing with Importance-aware LoRA, it achieves a strong trade-off between performance and speed while using minimal trainable parameters and training data.
CVPR 2026   [Poster] [arXiv] [code]
Inference-Time Scaling for Visual AutoRegressive modeling by Searching Representative Samples
Weidong Tang, Xinyan Wan, Xiumei Wang†
We explore inference-time scaling in discrete spaces by mapping them to continuous spaces to obtain density distributions, thereby optimizing the sampling of early coarse-scale features.
PRCV 2025   [Poster] [arXiv] [code]

Projects

WowPage
Weidong Tang, Yue Su.
In collaboration with Yue Su, I refined and improved his original homepage template. A clean standalone template version is coming soon.
Project   [code]

Awards

  • 2025.09, Second-Class Academic Scholarship, Xidian University (Ranked 5/99)

Services

  • 2025.06 – Present, AI Team Lead, 🌊Xidian–Inspur Club.
  • Reviewer for ICIC 2026.
  • Reviewer for PRCV 2025, 2026.

Talks