Publications

EponaV2: Driving World Model with Comprehensive Future Reasoning

arXiv, 2026

Jiawei Xu, Zhizhou Zhong, Zhijian Shu, Mingkai Jia, Mingxiao Li, Jia-Wang Bian, Qian Zhang, Kaicheng Zhang, Jin Xie, Jian Yang, Wei Yin

EponaV2 is a perception-free driving world model that achieves state-of-the-art trajectory planning. It forecasts comprehensive future 3D geometry and semantic representations, and uses an LLM-inspired policy optimization mechanism to improve real-world reasoning and scene understanding.

Paper Arxiv Code

VGGT-Long: Chunk it, Loop it, Align it, Pushing VGGT’s Limits on Kilometer-scale Long RGB Sequences

ICRA, 2026

Kai Deng, Zexin Ti, Jiawei Xu, Jian Yang, Jin Xie

To overcome the memory limitations of 3D vision foundation models, VGGT-Long processes long sequences in chunks, aligns the overlapping chunks, and applies loop closure optimization. This enables accurate, kilometer-scale monocular 3D reconstruction in unbounded outdoor environments without camera calibration or depth supervision.

Paper Arxiv Code

AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving

ICCV, 2025

Jiawei Xu, Kai Deng, Zexin Fan, Shenlong Wang, Jin Xie, Jian Yang

AD-GS is a self-supervised framework for high-quality rendering of dynamic urban driving scenes that eliminates the need for manual annotations. It combines a learnable motion model, simplified segmentation, and dynamic Gaussians to achieve performance competitive with state-of-the-art, annotation-dependent approaches.

Website Paper Arxiv Code Slides Poster

Grid4D: 4D Decomposed Hash Encoding for High-Fidelity Dynamic Gaussian Splatting

NeurIPS, 2024

Jiawei Xu, Zexin Fan, Jian Yang, Jin Xie

To overcome the limitations of plane-based methods in Gaussian-based dynamic scene rendering, Grid4D introduces an explicit 4D decomposed hash encoding with directional attention and smooth regularization, achieving state-of-the-art visual quality and rendering speed.