Zhiyao Li 李之尧

PhD, Researcher

ByteDance Seed

Biography

Zhiyao Li is currently an MLSys researcher at ByteDance Seed. He obtained his PhD from the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University, advised by Prof. Mingyu Gao. He worked on 3D-stacked-memory-based dataflow accelerators as a research intern at the Alibaba DAMO Academy Computing Technology Lab. He then worked with Prof. Kunle Olukotun as a visiting student researcher at Stanford University on dataflow AI accelerator design and optimization. After his PhD, he joined the Huawei Network Technology Lab as a researcher on scale-up AI network systems. In late 2025, he joined ByteDance Seed as a Machine Learning System Research Scientist. He received his bachelor's degree from the Excellent Engineer Class of Computer Science and Technology at Chongqing University. His research interests include LLM training and serving on rack-scale AI clusters, low-latency AI accelerator design, high-performance scale-up/scale-out networks, and dynamic and sparse computing.

Interests
  • Rack-scale AI training/serving systems
  • Computer architecture
  • High-performance network systems
  • Sparse computing
Education
  • PhD in Computer Systems and Architecture, 2019 – 2024

    Tsinghua University

  • BSE in Computer Science and Technology, 2015 – 2019

    Chongqing University

Recent Publications

(2025). HYTE: Flexible Tiling for Sparse Accelerators via Hybrid Static-Dynamic Approaches. Appeared in the International Symposium on Computer Architecture (ISCA) 2025.

(2025). KAPLA: Scalable NN Accelerator Dataflow Design Space Structuring and Fast Exploring. Appeared in the Asia and South Pacific Design Automation Conference (ASPDAC) 2025.

(2025). Adyna: Accelerating Dynamic Neural Networks with Adaptive Scheduling. Appeared in the Symposium on High Performance Computer Architecture (HPCA) 2025.

(2025). FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving. Submitted to the USENIX Annual Technical Conference (ATC) 2025.

(2023). Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow. Appeared in the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2023.

Experience

Researcher
Huawei Network Technology Lab
July 2024 – September 2025, Beijing

Visiting Student Researcher
Stanford University
July 2023 – December 2023, Palo Alto

Research Intern
Alibaba DAMO Academy
August 2022 – July 2023, Beijing

Research Intern
Huawei
July 2018 – November 2018, Chengdu