Tianchen Zhao

Tianchen Zhao 赵天辰 (Ziu Tinsan)

PhD Student at Tsinghua University


Interests
  • Foundation Model for Agent & VisualGen
  • EfficientML & AI Infra
Education
  • PhD in Electrical Engineering, 2023

    Tsinghua University

  • MS in Electrical Engineering, 2020

    Beihang University

  • BSc in Electrical Engineering, 2016

    Beihang University

Biography

Tianchen Zhao is a PhD student in the NICS-EFC Lab (EffAlg) at the Dept. of EE, Tsinghua University, supervised by Prof. Yu Wang and Dr. Xuefei Ning. He received his bachelor's and master's degrees from the Dept. of EE, Beihang University, in 2020 and 2023, respectively. His primary research focus is EfficientML algorithms and AI infrastructure for building foundation models.

I expect to graduate in June 2027 and am currently seeking postdoc positions and industry opportunities. Please contact me if you are interested 👋✨ You can find my CV and Chinese résumé.

News

Research Timeline

Q: How can efficient foundation models be built efficiently?

Line 2: Infra for Agentic RL: Long-tail Rollout

NAS/AutoML

- [ECCV'20] DSA: Differentiable Structure Pruning for CNNs

- [CVPR'24] FlashEval: AutoML-based Efficient Data Selection for Evaluation

3D

- [ICCV'23] Ada3D: Efficient adaptive dynamic architecture for 3D point cloud understanding

- [CVPR'22] CodedVTR: Novel Codebook-based 3D Attention for 3D Transformer Backbone Design

EfficientML for Model Arch.

(Algorithm-System Co-optimization for Sparse/Quant)

Intern@Infinigence

- [ECCV'24] MixDQ: Mixed-precision Quantization for GEMMs in VisualGen

- [ICLR'25] ViDiT-Q: Quantization for Diffusion Transformers in VisualGen

- [DAC'25] PARO: Accelerator for Mixed-precision Quantization for Attention in VisualGen

Intern@ByteDance

- [NeurIPS'25] PAROAttention: Sparse & Quant for Attention for VisualGen

- [MLSys'25] dp-SP: Multi-GPU Load Balancing for Sparse Attention in VisualGen

- [Ongoing] TideQuant: Algo-level improvement for FP4 Quantization.

Agentic RL Post-train

(Multi-Agent GRPO)

Intern@MiroMind

- RL with VeRL: multi-agent GRPO and context-management RL post-training.

RL Infra: Tile-based Engine & Backend Schedule

(Low-latency Tile-based Runtime)

- [Ongoing] Tile-based Runtime: developing a tile-based runtime for latency-sensitive, long-tail agentic rollout.

Efficient Sampling for Diffusion

(Improved Flow Matching for Better Sampling Efficiency)

- [ECCV'26 Sub.] StreamingVLA: Streaming flow matching for async execution for VLAs.

- [Ongoing] Streaming Forcing: Streaming flow matching for frame-wise AR video gen models.

Line 1: EfficientML & Sampling for VisualGen Foundation Model Design

Timeline axis: 2020, 2023, 2026

Research Framework


- Orchestration: Agent Loop Orchestration

- Scheduling: Multi-Request Scheduling

- Sampling: How Sampling Produces Tokens (MTP)

- Engine: Per-Model Engine for Single-Token Decoding

- Operator: Per-Operator Path (e.g. Attn/MoE)

Agentic RL post-training algorithms (GRPO) and infrastructure practice (VeRL)

Line 2: Infra for Agentic RL: Long-tail Rollout

Novel flow matching formulation for streaming VLA and videogen

Line 1: EfficientML & Sampling for VisualGen

Sparsity and quantization, from algorithm to kernel design (mixed-precision quantization and sparse attention for image and video DiTs).

[MLSys'25] dp-SP: Multi-GPU sparse attention load balancing

Tile-based, megakernel-like runtime targeting low-latency inference for small-batch, long-sequence rollouts.


Publications

Recent Events

Visit at KIT
Visited the ITIV Lab at KIT (Karlsruhe Institute of Technology).

Gallery