AESIP: Arch-aware ASIP-ISA Co-Design via Program Synthesis, Equality Saturation, and External Don’t Cares

Published in ISCA 2026 (Under review), 2026

Recommended citation: Haoran Jin, Nathaniel Bleier. AESIP: Arch-aware ASIP-ISA Co-Design via Program Synthesis, Equality Saturation, and External Don't Cares. ISCA 2026 (Under review).

An increasing number of applications, including implantables, IoT devices, and printed electronics, impose stringent power and area constraints. With the end of Dennard scaling, Application-Specific Instruction-set Processors (ASIPs) have emerged as a promising solution: they trade generality for lower power consumption and silicon area than general-purpose processors. However, existing approaches predominantly optimize the hardware using software profiles, potentially overlooking the optimization opportunities that rewriting the software would expose.

We propose AESIP, a hardware-software co-optimization framework for efficient ASIP design. The framework uses e-graphs to explore semantically equivalent software implementations, applying rewrite rules derived from program synthesis. To scale to real-world applications, we employ a divide-and-conquer approach: local saturation at basic-block granularity, followed by Integer Linear Programming (ILP)-based global extraction at whole-program scope. We develop a don’t-care-based hardware optimizer that automatically generates an ASIP design for each equivalent program variant, enabling agile design space exploration. We further incorporate a clustering algorithm that groups applications with similar rewrite/ISA characteristics, along with a constraint-based sharing mechanism that limits the number of ASIP variants. Together, these enable efficient reuse across workloads and balance area/power efficiency against NRE cost.
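The interaction between local rewriting and global extraction can be illustrated with a minimal sketch. It is not AESIP's implementation: the opcode costs, block names, and variants below are hypothetical, and an exhaustive search stands in for the ILP formulation. Each basic block offers several equivalent variants, and the selection minimizes the cost of the union of opcodes the resulting ASIP must support.

```python
from itertools import product

# Hypothetical per-opcode hardware cost in arbitrary area units (not from the paper).
OPCODE_AREA = {"add": 1.0, "sub": 1.0, "shl": 1.5, "mul": 8.0, "neg": 0.5}

# Hypothetical basic blocks. Each variant is the set of opcodes that one
# semantically equivalent rewrite of the block would need.
BLOCK_VARIANTS = {
    "bb0": [{"mul"}, {"shl", "add"}],  # x * 3   vs  (x << 1) + x
    "bb1": [{"sub"}, {"neg", "add"}],  # a - b   vs  a + (-b)
    "bb2": [{"mul"}, {"shl"}],         # x * 2   vs  x << 1
}


def isa_area(opcodes):
    """Area of an ISA, modeled here as the sum of its opcode costs."""
    return sum(OPCODE_AREA[op] for op in opcodes)


def extract_global(block_variants):
    """Pick one variant per block so the union of required opcodes is cheapest.

    Exhaustive search replaces the ILP solver for the sake of a small,
    dependency-free example; it is exponential in the number of blocks.
    """
    names = sorted(block_variants)
    best_choice, best_isa, best_cost = None, None, float("inf")
    for combo in product(*(range(len(block_variants[n])) for n in names)):
        isa = set().union(*(block_variants[n][i] for n, i in zip(names, combo)))
        cost = isa_area(isa)
        if cost < best_cost:
            best_choice, best_isa, best_cost = dict(zip(names, combo)), isa, cost
    return best_choice, best_isa, best_cost


choice, isa, cost = extract_global(BLOCK_VARIANTS)
print(choice, sorted(isa), cost)
```

In this toy instance, `sub` is the cheaper choice for `bb1` in isolation, yet the whole-program optimum rewrites it to `neg` plus `add`, because `add` is already required by `bb0`. That is the kind of opportunity a purely per-block, profile-driven selection would miss.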

To our knowledge, AESIP is the first systematic hardware-software co-design framework for agile ASIP development. Experimental evaluation on two widely used embedded benchmark suites, MiBench and Embench, demonstrates up to 29.7% area reduction and 8.3% power savings compared to state-of-the-art ASIP generators. Moreover, AESIP exploits generalization opportunities across diverse workloads without compromising specialization. Our trade-off analysis shows that with only four shared ASIPs, AESIP achieves a 14.17% area reduction at just 0.23% latency overhead, whereas even full per-application specialization attains only a 19.11% area reduction at the same latency target.