Artificial Intelligence and Integrated Circuits Research

KuanNet
Knowledge-Unified Attention Neural Network — Multi-Agent Reinforcement Learning with Echo State Networks for Chiplet TSV Assignment
Published — IEEE TVLSI 2026
Xiaomeng Wang¹,*, Zhen Zhou², Yang Yi¹
¹Bradley Dept. of ECE & Institute for Advanced Computing, Virginia Tech  ·  ²Intel Corporation, Chandler, AZ  ·  *Corresponding author
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2026
Accepted: 30 March 2026

Abstract

Chiplet-based architectures require efficient Through-Silicon Via (TSV) assignment to optimize interconnect performance and system integration. Unlike traditional 3D integrated circuits, heterogeneous chiplet systems demand coordination across dies with varying sizes and functionalities, creating exponentially complex solution spaces that challenge existing optimization methods.

This paper introduces KuanNet (Knowledge-Unified Attention Neural Network), a multi-agent reinforcement learning framework integrating Echo State Networks (ESN) with attention mechanisms for chiplet TSV assignment. The key innovation is a knowledge-unified architecture with temporal-static decomposition: temporal features shared across agents are processed through both ESN reservoirs and skip connections, while static features remain agent-private, enabling coordinated decisions with temporal memory and spatial awareness.

Building on multi-agent deep deterministic policy gradient (MADDPG) with K-head attention critics, KuanNet demonstrates superior optimization performance over the state-of-the-art baseline across standard benchmark circuits of varying scale and complexity. Ablation studies validate individual component contributions of the KuanNet architecture.

Keywords

Chiplet design · Heterogeneous integration · Through-silicon via (TSV) · TSV assignment · Placement optimization · Multi-agent reinforcement learning · Echo state network · Attention mechanism

Key Results

Evaluated on five industry-standard benchmarks (MCNC ami33, ami49; GSRC n100, n200, n300) across 3-tier and 4-tier configurations for both homogeneous 3D IC and heterogeneous chiplet topologies — 20 benchmark-design combinations in total.

20×–223×: larger wirelength reductions than the state-of-the-art ATT-TA baseline on 3-tier 3D IC configurations (average: 76×).
12×–266×: larger wirelength reductions on 4-tier configurations (average: 82×).
4–6×: fewer trainable temporal parameters than LSTM / GRU alternatives, because fixed ESN reservoirs require only a linear readout.
Rank 1.50: ESN achieves the best overall average rank across four design configurations against LSTM, GRU, state-history concatenation, and EMA baselines.

Why it works — in one paragraph

Off-policy multi-agent RL with a shared replay buffer makes backpropagation-through-time (BPTT) infrastructurally impractical: each agent needs temporal reasoning, but you can't afford to store trajectories and run truncated BPTT across a MADDPG replay batch. KuanNet sidesteps the whole problem by using a fixed Echo State Network reservoir for the temporal pathway — only the linear readout is trained. This drops a temporal module into any feedforward-based multi-agent architecture with zero changes to the training loop, loss functions, or replay buffer, while still providing history-dependent reasoning that plain feedforward networks lack.
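
To make the fixed-reservoir idea concrete, here is a minimal, dependency-free Python sketch; it is not the authors' implementation. The class name `FixedReservoir`, the weight ranges, and the 0.9 row-sum bound are assumptions chosen for illustration. The sketch only shows that the reservoir weights are sampled once and never updated, while the state still depends on the input history.

```python
import math
import random


class FixedReservoir:
    """Echo State Network reservoir with frozen random weights (illustrative)."""

    def __init__(self, n_inputs, n_units, seed=0):
        rng = random.Random(seed)
        # Assumption: recurrent rows are bounded so their absolute sum stays
        # below 0.9, a simple sufficient condition for fading memory.
        bound = 0.9 / n_units
        # Both matrices are sampled once here and never trained afterwards.
        self.w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                     for _ in range(n_units)]
        self.w_res = [[rng.uniform(-bound, bound) for _ in range(n_units)]
                      for _ in range(n_units)]
        self.state = [0.0] * n_units

    def step(self, u):
        """Advance one time step: x <- tanh(W_in u + W_res x)."""
        new_state = []
        for row_in, row_res in zip(self.w_in, self.w_res):
            pre = sum(w * ui for w, ui in zip(row_in, u))
            pre += sum(w * xi for w, xi in zip(row_res, self.state))
            new_state.append(math.tanh(pre))
        self.state = new_state
        return new_state


# Feeding the same input twice in a row gives two different states,
# because the second state also depends on the first one.
reservoir = FixedReservoir(n_inputs=2, n_units=8)
first = reservoir.step([0.1, 0.3])
second = reservoir.step([0.1, 0.3])
```

Since nothing inside the reservoir is trained, its state can be computed in the forward pass like any other input feature, which is consistent with the claim that no stored trajectories or BPTT are required.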

Method at a glance

Problem — what's being optimized

For each net crossing a die-to-die interface, routing distance is computed via a minimum-spanning-tree (MST) heuristic over the TSV positions and net pins. The objective sums MST wirelength contributions across every net and every interface; each agent perturbs its assigned TSV location to minimise the shared total.

Figure 1. Minimum spanning tree (MST) for TSV routing distance computation — 35 TSV locations connected via MST structure (red edges) to minimize total wirelength. [PDF]
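
The MST-based objective described above can be sketched as follows. This is a hedged illustration only: the Manhattan distance metric, the function names `manhattan`, `mst_wirelength`, and `total_wirelength`, and the representation of each net as a list of pin coordinates plus one TSV coordinate are all assumptions, not details taken from the paper.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points (assumed metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_wirelength(points):
    """Total edge length of a minimum spanning tree, via Prim's algorithm."""
    if len(points) < 2:
        return 0
    # best[i] is the cheapest known distance from point i to the growing tree.
    best = {i: manhattan(points[0], points[i]) for i in range(1, len(points))}
    total = 0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for i in best:
            d = manhattan(points[nxt], points[i])
            if d < best[i]:
                best[i] = d
    return total


def total_wirelength(nets):
    """Sum MST wirelength over nets; each net is (pins, tsv) (hypothetical layout)."""
    return sum(mst_wirelength(list(pins) + [tsv]) for pins, tsv in nets)


# Moving the TSV onto the segment between the two pins shortens the tree.
pins = [(0, 0), (4, 0)]
before = total_wirelength([(pins, (2, 3))])  # 9
after = total_wirelength([(pins, (2, 0))])   # 4
```

In this toy case a single agent's move changes the shared total from 9 to 4, which is the kind of signal each agent would receive when perturbing its TSV location.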

Temporal-static decomposition

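Observations are split into two streams: temporal features, which are shared across agents and feed both the ESN reservoir and a skip connection to the readout, and static features, which remain agent-private.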

Figure 2. Knowledge-unified neural network architecture. Temporal features connect to the ESN reservoir via curved pathways (left) while also providing skip connections to the readout layer. The ESN reservoir state, temporal features, and static features are unified through concatenation before the final readout transformation. [PDF]
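
The concatenation step in Figure 2 can be illustrated with a small sketch. The function name `unified_readout` and the toy weights are hypothetical; the only parts taken from the text are that the reservoir state, the temporal features (the skip connection), and the static features are concatenated before a linear readout.

```python
def unified_readout(reservoir_state, temporal, static, weights, bias):
    """Linear readout over [reservoir state ; temporal ; static] (illustrative)."""
    # The temporal features appear here directly, which is the skip connection:
    # they reach the readout both through the reservoir state and unchanged.
    features = list(reservoir_state) + list(temporal) + list(static)
    if any(len(row) != len(features) for row in weights):
        raise ValueError("each weight row must match the concatenated length")
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


# Toy example: 2 reservoir units, 1 temporal feature, 2 static features.
out = unified_readout(
    reservoir_state=[0.5, -0.5],
    temporal=[1.0],
    static=[2.0, 0.0],
    weights=[[1, 1, 1, 1, 1], [0, 0, 1, 0, 0]],
    bias=[0.0, 0.5],
)
# out == [3.0, 1.5]
```

Only `weights` and `bias` would be trainable in such a setup; the reservoir that produced `reservoir_state` stays fixed.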

Training backbone

Multi-Agent Deep Deterministic Policy Gradient (MADDPG) with K-head attention critics. 20,000 episodes × 50 steps per episode. Gumbel-Softmax exploration with temperature annealing from 3 → 0.01. PyTorch on Apple Silicon (M4, Metal Performance Shaders).
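
As a rough illustration of Gumbel-Softmax exploration with an annealed temperature, the sketch below uses only the Python standard library rather than PyTorch. The exponential shape of the schedule is an assumption (the text gives only the endpoints 3 and 0.01), and the function names `temperature` and `gumbel_softmax` are chosen here for illustration.

```python
import math
import random


def temperature(episode, total_episodes=20_000, t_start=3.0, t_end=0.01):
    """Exponential decay from t_start to t_end (schedule shape is assumed)."""
    frac = min(max(episode / max(total_episodes - 1, 1), 0.0), 1.0)
    return t_start * (t_end / t_start) ** frac


def gumbel_softmax(logits, tau, rng=random):
    """Relaxed categorical sample: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    norm = sum(exps)
    return [e / norm for e in exps]


# Early episodes use a high temperature (softer, more exploratory samples);
# late episodes use a low temperature (samples close to one-hot).
rng = random.Random(0)
early = gumbel_softmax([1.0, 2.0, 0.5], tau=temperature(0), rng=rng)
late = gumbel_softmax([1.0, 2.0, 0.5], tau=temperature(19_999), rng=rng)
```

In a PyTorch training loop the sampling step would more likely use a built-in such as `torch.nn.functional.gumbel_softmax`; the version above just spells out the arithmetic.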

Figure 3. KuanNet multi-agent architecture with knowledge-unified processing. Each agent maintains an actor–critic pair with ESN + MLP dual-pathway input layers. The attention mechanism (zoomed view) processes all agents' actions for coordination. [PDF]

Action space

Each agent chooses between a local 8-neighborhood move or one of 3 randomly-sampled distant candidate locations — combining local refinement with global exploration. Sensitivity sweeps confirm 8-neighborhood × 3 distant candidates as the robust operating point.

Figure 4. Action space for each agent: the current TSV location (center), an 8-connected neighborhood for local perturbation, and a small set of distant empty-space candidates for global exploration. Neighborhood size and distant-candidate count are tunable — see sensitivity analysis in the paper. [PDF]
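
One way to build the candidate set described above is sketched below. It assumes integer grid coordinates, a set of empty cells, and that local moves are restricted to empty neighbours; the name `candidate_moves` and these details are illustrative assumptions rather than the paper's definition.

```python
import random

# The eight offsets surrounding a grid cell.
NEIGHBOR_OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]


def candidate_moves(current, empty_cells, n_distant=3, rng=random):
    """Return (local, distant) candidate locations for one agent."""
    x, y = current
    neighborhood = {(x + dx, y + dy) for dx, dy in NEIGHBOR_OFFSETS}
    # Local refinement: neighbouring cells that are free (assumed constraint).
    local = sorted(c for c in neighborhood if c in empty_cells)
    # Global exploration: a few random free cells outside the neighbourhood.
    far_pool = sorted(c for c in empty_cells
                      if c not in neighborhood and c != current)
    distant = rng.sample(far_pool, min(n_distant, len(far_pool)))
    return local, distant


# 5x5 grid with the TSV at the centre; every other cell is free.
empty = {(x, y) for x in range(5) for y in range(5)} - {(2, 2)}
local, distant = candidate_moves((2, 2), empty, rng=random.Random(0))
# len(local) == 8 and len(distant) == 3
```

Both the neighbourhood size and `n_distant` are parameters here because the text describes them as tunable.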

Benchmarks & Setup

Benchmark   Source   Blocks   Configurations
ami33       MCNC     33       3-tier / 4-tier 3D IC + heterogeneous chiplet
ami49       MCNC     49       3-tier / 4-tier 3D IC + heterogeneous chiplet
n100        GSRC     100      3-tier / 4-tier 3D IC + heterogeneous chiplet
n200        GSRC     200      3-tier / 4-tier 3D IC + heterogeneous chiplet
n300        GSRC     300      3-tier / 4-tier 3D IC + heterogeneous chiplet

Initial floorplans generated with FlexPlanner. Initial TSV placement: greedy centroid-based allocation to the closest valid empty grid location, random fallback. Chiplet benchmarks available at github.com/xmwa/placement_datasets.
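
The initial placement rule can be sketched roughly as follows. The text does not say what triggers the random fallback, so the `max_radius` parameter is a hypothetical stand-in: here the fallback fires when the nearest empty cell is farther than `max_radius` from the centroid. The Manhattan metric and the name `initial_tsv_location` are also assumptions.

```python
import random


def initial_tsv_location(pins, empty_cells, max_radius=None, rng=random):
    """Greedy centroid-based pick with a random fallback (illustrative)."""
    cells = sorted(empty_cells)  # sorted for deterministic tie-breaking
    if not cells:
        raise ValueError("no empty grid location available")
    if not pins:
        raise ValueError("a net needs at least one pin")
    cx = sum(x for x, _ in pins) / len(pins)
    cy = sum(y for _, y in pins) / len(pins)

    def dist(cell):
        return abs(cell[0] - cx) + abs(cell[1] - cy)

    nearest = min(cells, key=dist)
    # Hypothetical fallback trigger: nothing empty close enough to the centroid.
    if max_radius is not None and dist(nearest) > max_radius:
        return rng.choice(cells)
    return nearest


# Centroid of the pins is (2, 1); the closest free cell is (2, 2).
pins = [(0, 0), (4, 0), (2, 3)]
spot = initial_tsv_location(pins, {(2, 2), (0, 4), (4, 4)})
```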

Figure 5. Structural comparison of 3D IC (left) vs. Chiplet (right) architectures using the GSRC n300 benchmark. The 3D IC stacks four identical dies vertically, while the Chiplet architecture features a horizontally-split top layer (5 total chiplets). Coloured rectangles represent modules; white regions show empty spaces for TSV placement. [PDF]

Downloads

Paper PDF · Benchmarks · Code (coming soon) · Slides (coming soon) · Awesome list

Reproducibility. Experiments were run on an Apple Mac mini M4 (16 GB unified memory) with Python 3.12, PyTorch using the Metal Performance Shaders (MPS) backend, and Hydra for configuration management. Training uses 20,000 episodes of 50 steps each with a 1M-entry replay buffer.

How to cite

Plain text

Xiaomeng Wang, Zhen Zhou, and Yang Yi. "Finetune Chiplet Design Floorplan via KuanNet," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2026.

BibTeX

@article{wang2026kuannet,
  title   = {Finetune Chiplet Design Floorplan via {KuanNet}},
  author  = {Wang, Xiaomeng and Zhou, Zhen and Yi, Yang},
  journal = {IEEE Transactions on Very Large Scale Integration (VLSI) Systems},
  year    = {2026},
  doi     = {10.1109/TVLSI.2026.3681746},
  url     = {https://doi.org/10.1109/TVLSI.2026.3681746}
}

Authors

Xiaomeng Wang ORCID 0000-0001-8822-003X

Ph.D. Candidate, Bradley Department of ECE & Institute for Advanced Computing, Virginia Tech. Research focus: ML-driven chiplet / 3D-IC physical design, reservoir computing for EDA. Corresponding author. Contact: scholar@wangxm.com · www.wangxm.com · github.com/xmwa

Zhen Zhou ORCID 0000-0002-3014-8167 · Senior Member, IEEE

Intel Corporation, Chandler, AZ.

Yang Yi ORCID 0000-0002-1354-0204 · Senior Member, IEEE

Professor, Bradley Department of ECE & Institute for Advanced Computing, Virginia Tech. BRICCS Lab.

More from the authors

R2CTA: Reinforcement Learning and Reservoir Computing based Chiplets TSV Assignment · Paper

Xiaomeng Wang and Yang Yi. In Proc. 26th International Symposium on Quality Electronic Design (ISQED), pp. 1–7, IEEE, 2025.   Project page →

Transforming AI Landscape with Neuromorphic Computing and Chiplets · Book chapter

Xiaomeng Wang, Zhen Zhou, and Yang Yi. In Energy-Efficient Devices and Circuits for Neuromorphic Computing, pp. 405–428, Elsevier, 2026.

Practical Tips for Machine Learning Research and Development · Blog

X. Wang. (2025). Practical Tips for Machine Learning Research and Development. [Online]. Available: blog.wangxm.com/2025/02/practical-tips-for-machine-learning-research-and-development/

Acknowledgments

This work was supported in part by the U.S. National Science Foundation (NSF) under Grants CCF-1750450, ECCS-1731928, ECCS-2128594, ECCS-2314813, and CCF-1937487.

The authors thank the BRICCS Lab at Virginia Tech for computational resources and technical discussions.


This work is dedicated to my mother, YuKuan — the name of this framework carries hers. Thank you, Mom, for everything.