I am a third-year Ph.D. candidate in the Visual Intelligence Lab at Nanyang Technological University (NTU), supervised by Prof. Shijian Lu. Prior to joining NTU, I obtained my B.S. degree in Computing Science from the University of Alberta. I am currently working on native multimodal foundation models at Kimi (Moonshot AI), advised by Dr. Haoning Wu and Xinyu Zhou. Previously, I worked closely with Dr. Lidong Bing at MiroMind and Dr. Song Bai at ByteDance. I also enjoy vibe building with other researchers at LMMs-Lab, a non-profit open-source organization led by Dr. Bo Li and Prof. Ziwei Liu. My research centers on the long-standing quest to build video-centric multimodal intelligence, spanning temporal grounding, agentic reasoning, long-horizon tool use, and self-evolving multi-agent systems.

I am open to any interesting ideas, questions, and future opportunities. Feel free to contact me via WeChat: 17310143309.

Exciting News

  • 2026.05 - ParaVT, PRISM, and WorldReasonBench were released. Several papers on Audio-Visual Captioning, Multimodal Evaluation, and Database Agents are coming soon.
  • 2026.04 - Evolving Visual Generation was released.
  • 2026.03 - MiroThinker-1.7 & H1 were released.
  • 2026.02 - Four papers were accepted by CVPR 2026.
  • 2025.10 - One paper was accepted by SIGGRAPH Asia 2025.
  • 2025.08 - One paper was accepted by EMNLP 2025.
  • 2025.06 - Two papers were accepted by ICCV 2025.
  • 2025.05 - Two papers were accepted by ACL 2025.
  • 2023.09 - One paper was accepted by NeurIPS 2023.

Selected Publications (Full List)

Academic Services

Conference Reviewer

  • CVPR 24/25/26, ECCV 24/26, ACM MM 24/25/26, NeurIPS 24/25/26, ICLR 25, AISTATS 25/26, ICML 25, ICCV 25, BMVC 26

Journal Reviewer

  • IEEE TPAMI, Pattern Recognition, Journal of Electronic Imaging

Workshop PC Member

Teaching Assistant

  • AI6121 - Computer Vision, NTU, Fall 2025 / Spring 2026

Invited Talks

Technical Blogs

Chinese Blogs

English Blogs

Patent & Awards

  • Method, Device, and Medium for Video Temporal Grounding with Mixture-of-Experts, US Patent, 2025
  • Method, Device, and Medium for Generating Transition Videos with Diffusion Model, SG Patent, 2024
  • Method, Device, and Medium for Automatic Question-Answering, CN Patent, 2022
  • Outstanding Graduate with Distinction, University of Alberta, 2021
  • Dean’s Honor Roll Award, University of Alberta, 2018 - 2020
  • International Student Scholarship, University of Alberta, 2017 - 2019