Zhou, Yaoyang

Architect of LLM DSA; Maintainer of the micro-architecture simulator for Xiangshan; PhD in Computer Architecture

Beijing Institute of Open Source Chip

Biography

I am interested in LLM inference and CPU micro-architecture.

For LLM inference, I am working on

  • Next-generation LLM inference chip
  • Speculative decoding

In Oct. 2024, we initiated a new project on LLM inference at the Beijing Institute of Open Source Chip (BOSC). We will build RISC-V cores equipped with RISC-V matrix extensions and a high-bandwidth network-on-chip (NoC) to support LLM inference. I am looking to collaborate on LLM-DSA projects with people who have strong expertise in CPU/accelerator design, software-hardware co-design, and algorithm-hardware co-design. To join us, please e-mail me: archshinez AT outlook DOT com or shinezyy AT qq DOT com.

For CPU performance, I am experienced in

  • Prefetchers
  • Workload characterization
  • Performance counter architecture
  • Performance evaluation framework

From 2022 to 2024, I led the performance analysis and modeling team for the Xiangshan processor at the Beijing Institute of Open Source Chip (BOSC). Our team played a significant role in the design of the third-generation Xiangshan architecture, which achieved a SPECint2k6 score of 15/GHz on both the C++ simulator and RTL. I will continue to maintain the micro-architecture simulator for Xiangshan and contribute to the open-source community while leading the LLM inference project at BOSC.

My hobbies include playing badminton, investing, and reading papers on health and biology. I obtained my Ph.D. from the Institute of Computing Technology, Chinese Academy of Sciences, and my B.Sc. from Nanjing University.

Interests
  • LLM inference
  • CPU micro-architecture
  • Investment
  • Badminton
Education
  • PhD in Computer Architecture, 2017 - 2023

    Institute of Computing Technology, CAS

  • BSc in Computer Science, 2013 - 2017

    Nanjing University

Publications

(2021). Omegaflow: a high-performance dependency-based architecture. In ICS 2021.

(2019). QoSMT: supporting precise performance control for simultaneous multithreading architecture. In ICS 2019.
