Scalable Matrix Multiplication with Hybrid CMOS-RSFQ Digital Signal Processor
Paper in proceedings, 2007

We report an RSFQ digital signal processor (DSP) design, based on hybrid RSFQ-CMOS memory, that is suitable for a general matrix-by-matrix multiplication algorithm. The design consists of an RSFQ multiply-accumulate (MAC) unit, memory caches, and a synchronization block, partitioned across multiple chips, together with a large CMOS memory. In terms of complexity, the RSFQ DSP comprises 10x10 multiplication with rounding to 14 bits, an 18-bit accumulator, and a 4.4 Kb memory cache. The maximum simulated clock frequency is 24 GHz for the HYPRES 4.5 kA/cm² process, and the optimum communication bandwidth with the CMOS memory is 2 Gbps. A simplified version of the RSFQ DSP, consisting of a 4x4 MAC with rounding to 5 bits and 17x6 memory caches, has been designed for the HYPRES 4.5 kA/cm² process and fabricated.
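The arithmetic described in the abstract can be illustrated with a small software model. The sketch below is not the authors' hardware design: it only mimics the stated word widths (10x10 multiplication, rounding to 14 bits, 18-bit accumulator). The use of unsigned operands, round-half-up on the discarded bits, and wrap-around on accumulator overflow are assumptions made for illustration, and the function names are hypothetical.

```python
# Bit-width model of a multiply-accumulate (MAC) unit driving a
# matrix-by-matrix multiplication. Widths come from the abstract; the
# number format, rounding mode and overflow behaviour are assumptions.

MULT_BITS = 10    # operand width (10x10 multiplication)
ROUND_BITS = 14   # product is rounded to 14 bits
ACC_BITS = 18     # accumulator width


def rounded_product(a, b):
    """Multiply two unsigned 10-bit values and keep the top 14 bits.

    Assumption: round-half-up on the 6 discarded low-order bits.
    """
    if not (0 <= a < (1 << MULT_BITS) and 0 <= b < (1 << MULT_BITS)):
        raise ValueError("operands must fit in 10 unsigned bits")
    drop = 2 * MULT_BITS - ROUND_BITS          # 20-bit product -> 14 bits
    return (a * b + (1 << (drop - 1))) >> drop


def mac(acc, a, b):
    """One MAC step. Assumption: the 18-bit accumulator wraps on overflow."""
    return (acc + rounded_product(a, b)) & ((1 << ACC_BITS) - 1)


def matmul(A, B):
    """Matrix-by-matrix product computed one MAC step at a time."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for t in range(inner):
                acc = mac(acc, A[i][t], B[t][j])
            C[i][j] = acc
    return C


if __name__ == "__main__":
    A = [[64, 128], [0, 512]]
    B = [[1, 2], [4, 8]]
    print(matmul(A, B))
```

Under these assumptions, an 18-bit accumulator can sum 16 rounded 14-bit products without overflowing, since 16 x (2^14 - 1) < 2^18.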


Multiply-Accumulate Unit.

hybrid memory



Irina Kataeva

Solid State Electronics

Henrik Engseth

Solid State Electronics

Samuel Intiso

Chalmers, Microtechnology and Nanoscience (MC2)

Anna Kidiyarova-Shevchenko

Solid State Electronics

Oral presentation at the Applied Superconductivity Conference (ASC), September 2006, Seattle; to be published in IEEE Transactions on Applied Superconductivity, June 2007.

Vol. June