Homepage

I am a second-year Ph.D. student in the Visual Intelligence Lab at Nanyang Technological University (NTU), supervised by Prof. Shijian Lu. Prior to joining NTU, I obtained my B.S. degree in Computing Science from the University of Alberta. I also work closely with Dr. Lidong Bing at MiroMind, and I previously worked with Dr. Song Bai during his time at ByteDance. My research centers on the long-standing quest to build video-centric multimodal intelligence, spanning temporal grounding, complex reasoning, agentic tool use, and self-evolving multi-agent systems.

I enjoy vibe building with other researchers and developers at LMMs-Lab, a non-profit open-source organization led by Bo Li and Prof. Ziwei Liu. Our mission is to advance large multimodal models (LMMs) with a shared vision of Feeling the AGI. We are actively looking for like-minded individuals to join us and contribute to the community!

🔥 Exciting News

2025.10 - Four papers were released, focusing on multimodal reasoning (OpenMMReasoner), long-video tool use (LongVT), and visual token redundancy in both autoregressive LMMs (ToDRE) and diffusion-based LMMs.
2025.10 - One paper was accepted by SIGGRAPH Asia 2025.
2025.08 - One paper was accepted by EMNLP 2025.
2025.06 - Two papers were accepted by ICCV 2025.
2025.05 - Two papers were accepted by ACL 2025.
2023.09 - One paper was accepted by NeurIPS 2023.

📝 Selected Publications (Full List)

Preprint

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
Zuhao Yang*, Sudong Wang*, Kaichen Zhang*, Keming Wu, Sicong Leng, Yifan Zhang, Bo Li, Chengwei Qin, Shijian Lu, Xingxuan Li, Lidong Bing
Preprint 2025
paper / bibtex / code

Preprint

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Kaichen Zhang*, Keming Wu*, Zuhao Yang, Bo Li, Kairui Hu, Bin Wang, Ziwei Liu, Xingxuan Li, Lidong Bing
Preprint 2025
paper / bibtex / code

Preprint

A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models
Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
paper / bibtex

Preprint

ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance
Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
paper / bibtex

ICCV

TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
Zuhao Yang, Yingchen Yu, Yunqing Zhao, Shijian Lu, Song Bai
ICCV 2025
paper / bibtex / webpage

ICCV

Versatile Transition Generation with Image-to-Video Diffusion
Zuhao Yang, Jiahui Zhang, Yingchen Yu, Shijian Lu, Song Bai
ICCV 2025
paper / bibtex / webpage

ACL

QAEval: Mixture of Evaluators for Question-Answering Task Evaluation
Tan Yue, Rui Mao, Xuzhao Shi, Shuo Zhan, Zuhao Yang, Dongyan Zhao
ACL 2025
paper / bibtex / code

NeurIPS

FACE: Evaluating Natural Language Generation with Fourier Analysis of Cross-Entropy
Zuhao Yang*, Yingfang Yuan*, Yang Xu*, Shuo Zhan, Huajun Bai, Kefan Chen
NeurIPS 2023
paper / bibtex / code

📖 Educational Background

2024.01 - Present: Doctor of Philosophy, College of Computing and Data Science, Nanyang Technological University
2022.08 - 2024.01: Master of Artificial Intelligence, College of Computing and Data Science, Nanyang Technological University
2017.09 - 2021.06: Bachelor of Science in Computing Science, Department of Computing Science, University of Alberta

💼 Working Experience

2025.04 - Present: AI Scientist Intern, Shanda AI Research Institute & MiroMind.ai, Singapore
2023.11 - 2025.03: CV Research Intern, ByteDance Inc. & TikTok, Singapore
2021.05 - 2022.06: NLP Research Engineer, TMI Robotics Technology, Shanghai

✍️ Academic Services

Conference Reviewer

CVPR 24/25/26, ECCV 24/26, ACM MM 24/25, NeurIPS 24/25, ICLR 25, AISTATS 25/26, ICML 25, ICCV 25

Journal Reviewer

IEEE TPAMI, Pattern Recognition, Journal of Electronic Imaging

Workshop PC Member

Teaching Assistant

AI6121 - Computer Vision, NTU, Fall 2025

🏆 Patents & Awards

Method, Device, and Medium for Video Temporal Grounding with Mixture-of-Experts, US Patent, 2025
Method, Device, and Medium for Generating Transition Videos with Diffusion Model, SG Patent, 2024
Method, Device, and Medium for Automatic Question-Answering, CN Patent, 2022
Outstanding Graduate with Distinction, University of Alberta, 2021
Dean’s Honor Roll Award, University of Alberta, 2018 - 2020
International Student Scholarship, University of Alberta, 2017 - 2019

Zuhao Yang (杨祖豪)
