Publications
LongVT: Incentivizing “Thinking with Long Videos” via Native Tool Calling
Zuhao Yang*, Sudong Wang*, Kaichen Zhang*, Keming Wu, Sicong Leng, Yifan Zhang, Chengwei Qin, Bo Li, Shijian Lu, Xingxuan Li, Lidong Bing
Preprint 2025
[paper] [bibtex] [code]
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Kaichen Zhang*, Keming Wu*, Zuhao Yang, Bo Li, Kairui Hu, Bin Wang, Ziwei Liu, Xingxuan Li, Lidong Bing
Preprint 2025
[paper] [bibtex] [code]
A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models
Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
[paper] [bibtex]
ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance
Duo Li*, Zuhao Yang*, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2025
[paper] [bibtex]
SVAgent: Storyline-guided Long Video Understanding via Cross-modal Multi-agent Collaboration
Zhongyu Yang, Zuhao Yang, Shuo Zhan, Tan Yue, Wei Pang, Yingfang Yuan
Under Review
ReChar: Revitalising Characters with Structure-Preserved and User-Specified Aesthetic Enhancements
Zhongyu Yang, Junhao Song, Zhang Luo, Zuhao Yang, Yang Xu, Jingfen Lan, Yonghan Zhang, Wei Pang, Siyang Song, Yingfang Yuan
SIGGRAPH Asia 2025
[bibtex]
Evaluating Text Generation Quality Using Spectral Distances of Surprisal
Zhichen Liu, Yongyuan Li, Yang Xu, Yu Wang, Yingfang Yuan, Zuhao Yang
EMNLP 2025
[paper] [bibtex] [code]
TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
Zuhao Yang, Yingchen Yu, Yunqing Zhao, Shijian Lu, Song Bai
ICCV 2025
[paper] [bibtex] [webpage]
Versatile Transition Generation with Image-to-Video Diffusion
Zuhao Yang, Jiahui Zhang, Yingchen Yu, Shijian Lu, Song Bai
ICCV 2025
[paper] [bibtex] [webpage]
QAEval: Mixture of Evaluators for Question-Answering Task Evaluation
Tan Yue, Rui Mao, Xuzhao Shi, Shuo Zhan, Zuhao Yang, Dongyan Zhao
ACL 2025
[paper] [bibtex] [code]
AI-Generated Images as Data Sources: The Dawn of Synthetic Era
Zuhao Yang, Fangneng Zhan, Kunhao Liu, Muyu Xu, Xiaoqin Zhang, Ling Shao, Shijian Lu
Preprint 2023
[paper] [bibtex] [webpage]
FACE: Evaluating Natural Language Generation with Fourier Analysis of Cross-Entropy
Zuhao Yang*, Yingfang Yuan*, Yang Xu*, Shuo Zhan, Huajun Bai, Kefan Chen
NeurIPS 2023
[paper] [bibtex] [code]
PaCaNet: A Study on CycleGAN with Transfer Learning for Diversifying Fused Chinese Painting and Calligraphy
Zuhao Yang*, Huajun Bai*, Zhang Luo, Yang Xu, Wei Pang, Yue Wang, Yisheng Yuan, Yeqi Hu, Yingfang Yuan
Preprint 2023
[paper] [bibtex]
