I am a second-year Ph.D. student in the Visual Intelligence Lab at Nanyang Technological University (NTU), supervised by Prof. Shijian Lu. Prior to joining NTU, I obtained my B.S. degree in Computing Science from the University of Alberta. I work closely with Dr. Lidong Bing at MiroMind.ai, and previously worked with Dr. Song Bai during his time at ByteDance. My research centers on the long-standing quest to build video-centric multimodal intelligence, spanning controllable generation, temporal reasoning, agentic tool use, and long-term memory.
I enjoy collaborating with self-motivated researchers at LMMs-Lab, a non-profit open-source organization led by Bo Li and Prof. Ziwei Liu. Our mission is to advance large multimodal models with a shared vision of Feeling the AGI. We are actively looking for like-minded individuals to contribute to the community together!
🔥 Exciting News
- 2025.10 - Four papers were released, focusing on multimodal reasoning (OpenMMReasoner), multimodal agentic tool use (LongVT), and visual token redundancy in both MLLMs (ToDRE) and diffusion-based MLLMs.
- 2025.10 - One paper was accepted by SIGGRAPH Asia 2025.
- 2025.08 - One paper was accepted by EMNLP 2025.
- 2025.06 - Two papers were accepted by ICCV 2025.
- 2025.05 - Two papers were accepted by ACL 2025.
- 2023.09 - One paper was accepted by NeurIPS 2023.
📝 Selected Publications (Full List)

Zuhao Yang*, Sudong Wang*, Kaichen Zhang*, Keming Wu, Sicong Leng, Yifan Zhang, Chengwei Qin, Bo Li, Shijian Lu, Xingxuan Li, Lidong Bing
Preprint 2025
paper / bibtex / code

Kaichen Zhang*, Keming Wu*, Zuhao Yang, Bo Li, Kairui Hu, Bin Wang, Ziwei Liu, Xingxuan Li, Lidong Bing
Preprint 2025
paper / bibtex / code

Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
paper / bibtex

Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
paper / bibtex

Zuhao Yang, Yingchen Yu, Yunqing Zhao, Shijian Lu, Song Bai
ICCV 2025
paper / bibtex / webpage

Zuhao Yang, Jiahui Zhang, Yingchen Yu, Shijian Lu, Song Bai
ICCV 2025
paper / bibtex / webpage

Tan Yue, Rui Mao, Xuzhao Shi, Shuo Zhan, Zuhao Yang, Dongyan Zhao
ACL 2025
paper / bibtex / code

Zuhao Yang*, Yingfang Yuan*, Yang Xu*, Shuo Zhan, Huajun Bai, Kefan Chen
NeurIPS 2023
paper / bibtex / code
📖 Educational Background
- 2024.01 - Present: Doctor of Philosophy, College of Computing and Data Science, Nanyang Technological University
- 2022.08 - 2024.01: Master in Artificial Intelligence, College of Computing and Data Science, Nanyang Technological University
- 2017.09 - 2021.06: Bachelor of Science in Computing Science, Department of Computing Science, University of Alberta
🧑⚖️ Work Experience
- 2025.04 - Present: AI Scientist Intern, Shanda AI Research Institute & MiroMind.ai, Singapore
- 2023.11 - 2025.03: AI Research Intern, ByteDance Inc. & TikTok, Singapore
- 2021.05 - 2022.06: NLP Algorithm Engineer, TMI Robotics Technology, Shanghai
💻 Academic Services
Conference Reviewer
- CVPR 24/25/26, ECCV 24, ACM MM 24, NeurIPS 24/25, ICLR 25, AISTATS 25/26, ICML 25, ICCV 25
Journal Reviewer
- IEEE TPAMI, Pattern Recognition, Journal of Electronic Imaging
Workshop PC Member
- SyntaGen: Harnessing Generative Models for Synthetic Visual Datasets (CVPR 24/25)
- Neural Rendering Intelligence (CVPR 24)
Teaching Assistant
- AI6121 - Computer Vision, NTU, 2025 Fall
🏆 Patents & Awards
- Method, Device, and Medium for Video Temporal Grounding with Mixture-of-Experts, US Patent, 2025
- Method, Device, and Medium for Generating Transition Videos with Diffusion Model, SG Patent, 2024
- Method, Device, and Medium for Automatic Question-Answering, CN Patent, 2022
- Outstanding Graduate, University of Alberta, 2021
- Dean’s Honor Roll Award, University of Alberta, 2018 - 2020
- International Student Scholarship, University of Alberta, 2017 - 2019
