Zhou, Yaoyang

Architect of LLM DSA; Maintainer of the u-arch simulator for Xiangshan; PhD in Computer Architecture

Beijing Institute of Open Source Chip

Biography

I am interested in LLM inference and CPU micro-architecture.

For LLM inference, I am interested in

From Oct. 2024 to Oct. 2025, I worked on XSAI (XSAI slides here, XSAI repo here). We hope to provide hardware support for modern LLM kernels in a CPU paradigm on Xiangshan, and to hide memory latency automatically with out-of-order execution and prefetching. See XSAI’s roadmap here.

For CPU performance, I am experienced in

  • Prefetchers
  • Workload characterization
  • Performance counter architecture
  • Performance evaluation framework

From 2022 to 2024, I led the performance analysis and modeling team for the Xiangshan processor at the Beijing Institute of Open Source Chip (BOSC). Our team played a significant role in the design of the third-generation Xiangshan architecture, which achieved a SPECint2k6 score of 15/GHz on both the C++ simulator and RTL.

My hobbies include badminton and investing. I obtained my Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences, and my B.Sc. degree from Nanjing University.

Interests
  • LLM inference
  • CPU micro-architecture
  • Investment
  • Badminton
Education
  • PhD in Computer Architecture, 2017 - 2023

    Institute of Computing Technology, CAS

  • BSc in Computer Science, 2013 - 2017

    Nanjing University

Recent Posts

Publications

(2021). Omegaflow: a high-performance dependency-based architecture. In ICS 2021.

(2019). QoSMT: supporting precise performance control for simultaneous multithreading architecture. In ICS 2019.
