Hi there, welcome to my homepage!

Hi, I am Yutong Shen from the Faculty of Information Technology, Beijing University of Technology. I am an undergraduate student affiliated with the Comprehensive Robotics Laboratory, supervised by Prof. Xiaoyan Li. I am currently seeking PhD/MPhil opportunities.

Feel free to reach out if you are interested in collaboration or potential opportunities.

News

  • 2026.05 🚀🚀 Won an Honorable Mention in MCM/ICM.
  • 2026.04 🚀🚀 Submitted 3 papers to ACM MM'26.
  • 2026.03 🚀🚀 Submitted a paper to ECML-PKDD'26.
  • 2026.03 🚀🚀 Submitted a paper to IROS'26.
  • 2026.02 🎉🎉 The project secured support and an academic collaboration with USTC.
  • 2026.02 😍😍 Our work was accepted by CVPR'26 ViSCALE.
  • 2026.01 😍😍 Our work was accepted by ICLR'26 Life Long Agent.
  • 2025.12 🎉🎉 I began my research collaboration with the University of Hamburg and Agile Robots SE.
  • 2025.11 😍😍 Our work was accepted by AAAI'26.
  • 2025.10 😍😍 Our work was accepted by ICVRV'25.
  • 2025.08 🚀🚀 Submitted a paper to AAAI'26.
  • 2025.08 🚀🚀 Won the National 2nd Prize in China Robot Competition and RoboCup China Open.
  • 2025.08 🚀🚀 Won the National 2nd Prize in China University Intelligent Robot Creativity Competition.
  • 2025.08 🚀🚀 Won 1st Prize in the group 400-meter race at WHRG 2025.
  • 2025.07 😍😍 Our work was accepted by ACM MM'25.
  • 2025.04 🎉🎉 Got a research internship with THU Media Lab.
  • 2025.04 🚀🚀 Submitted a paper to ACM MM'25.
  • 2024.11 🚀🚀 I won the National 1st Prize at China Intelligent Robot Fighting and Gymnastics Competition.
  • 2023.09 🎉🎉 I began my studies at BJUT.

Experience

Tsinghua University
2025.04 - 2025.08
Embodied AI Research Intern, advised by Tongtong Feng
Main contribution: Conducted research on cross-domain generalization and online world models for robotics.
University of Hamburg
2025.12 - 2026.03
Humanoid Robotics and RL, advised by Lei Zhang
Humanoid robot control and world models.

Publications

(* equal contribution · † corresponding author · ‡ project leader)

MetaWorld-X: Hierarchical World Modeling via VLM-Orchestrated Experts for Humanoid Loco-Manipulation
Yutong Shen*, Hangxu Liu*, Penghui Liu, Jiashuo Luo, Yongkang Zhang, Rex Morvley, Chen Jiang, Jianwei Zhang and Lei Zhang†.
We present MetaWorld-X, a hierarchical world modeling framework with VLM-orchestrated expert policies for humanoid loco-manipulation tasks.
IROS 2026, under review  [arXiv] [Page]
ALAS: Adaptive Long-Horizon Action Synthesis via Disentangled Environment and Self-State Representations
Yutong Shen, Hangxu Liu, Lei Zhang†, Penghui Liu, Yinqi Liu, Liuxiang Yang, Tongtong Feng†.
We propose a novel framework for long-horizon action synthesis that generates long-horizon actions in a few-shot manner.
2026, under review   [arXiv]
DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts
Yutong Shen*, Hangxu Liu, Penghui Liu, Ruizhe Xia and Tongtong Feng†.
We present a disentanglement method that decouples observation from self-state to improve cross-domain generalization.
ICLR 2026 LLA  [arXiv]
MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer
Penghui Liu, Jiangshan Wang, Yutong Shen*, Shanhui Mo*, Chenyang Qi†, Jack Ma†.
We propose a DiT-based framework for multi-subject video motion transfer, using Mask-aware AMF to disentangle object motions and RectPC for stable sampling, and build the MultiMotionEval benchmark for evaluation.
AAAI 2026   [arXiv]
MetaWorld: Skill Transfer and Composition in a Hierarchical World Model for Grounding High-Level Instructions
Yutong Shen, Hangxu Liu, Kailin Pei, Yinqi Liu, Ruizhe Xia, and Tongtong Feng†.
We present MetaWorld, a hierarchical world model that unifies VLM semantic planning, expert policy transfer, and latent dynamics control to enable humanoid loco-manipulation, significantly boosting reward and sample efficiency.
CVPR 2026 ViSCALE   [arXiv] [code]
ManiCoG: Dynamic Concept Graphs on Action Manifolds for Long-Horizon Robot Manipulation
Haidong Huang, Yutong Shen, Xingwei Chen, Xiyuan Li, Yaohua Zhou, Jun Ma, Haiyue Zhu and Xiaocong Li†.
We propose ManiCoG, which dynamically extracts manifold-based concepts, builds transition graphs and guides diffusion policy. It achieves superior long-horizon robot manipulation performance on LIBERO with stable concept transitions.
RSS 2026 ExWBC   [arXiv] [code]
SxPxC: Transforming Multi-Agent Social Simulation from Isolated Personas to Socialized Intelligence via Structural Decoupling
Yutong Shen, Yinqi Liu, Yinzhi Qin, Yaomin Wang, Hongda Sun, Qinghao Shao, Liyang Gao, Runzhuo Li and Zhi Zheng†.
We propose S×P×C, which prioritizes social structure above persona and context via dual-chain inference, boosting multi-agent structural compliance on our BanG-Struct benchmark.
2026, under review   [arXiv] [code]
MVSS: A Unified Framework for Multi-View Structured Survey Generation
Yinqi Liu, Yueqi Zhu, Yongkang Zhang, Feiran Liu, Yutong Shen, Yufei Sun, Xin Wang, Renzhao Liang, Yidong Wang and Cunxiang Wang†.
We propose a unified framework for multi-view structured survey generation, which can be applied to various survey generation tasks.
2026, under review   [arXiv]
MESS: Deep Reasoning and Multimodal Agentic Framework for Evolutionary Survey Synthesis
Yinqi Liu, Yongkang Zhang, Yueqi Zhu, Ruyi Feng, Yutong Shen, Yidong Wang, Cunxiang Wang†, Yinghui Li.
We propose a framework that mines literature evolution and generates professional academic diagrams. It outperforms existing methods and approaches expert-level performance.
ACM MM 2026   [arXiv]
The Eye of Sherlock Holmes: Uncovering User Private Attribute Profiling via Vision-Language Model Agentic Framework
Feiran Liu, Yuzhe Zhang, Xinyi Huang, Yinan Peng, Xinfeng Li†, Lixu Wang, Yutong Shen, Ranjie Duan, Simeng Qin, Xiaojun Jia, Qingsong Wen, Wei Dong.
We built the PAPI dataset and proposed HolmesEye, using VLMs and LLMs to infer user private attributes from images, exposing new multimodal privacy risks.
ACM MM 2025   [arXiv]

Projects

Awards

  • 2026.02, Honorable Mention in MCM/ICM 2026.
  • 2025.09, National 2nd Prize in China Robot Competition and RoboCup China Open.
  • 2025.09, Provincial 2nd Prize in China Robot Competition and RoboCup China Open.
  • 2025.08, 1st Prize in the group 400-meter race at WHRG 2025.
  • 2025.08, National 2nd Prize in China University Intelligent Robot Creativity Competition.
  • 2025.06, Provincial 2nd Prize in China University Intelligent Robot Creativity Competition.
  • 2024.11, National 1st Prize in China Intelligent Robot Fighting and Gymnastics Competition.

Services

  • Reviewer for MM 2026.
  • Reviewer for IROS 2026.
  • Reviewer for MMM 2025.
  • Reviewer for ICVRV 2025.
  • Reviewer for AAAI 2026.

Talks