Energy-Efficient Computation of TensorFloat32 Numbers on an FP32 Multiplier
Paper in proceedings, 2025

Several new, shorter floating-point formats have been proposed to match the requirements of emerging application workloads. To simplify hardware development in the face of a growing number of formats, one practical design option is to reuse preexisting hardware, such as standard 32-bit IEEE-754 (FP32) floating-point units, as much as possible to handle emerging, less complex formats. We evaluate the case of running Nvidia TensorFloat32 data on an FP32 multiplier. While the FP32 multiplier's area is larger than that of a dedicated TensorFloat32 multiplier, we show that energy per operation scales well with the mantissa-width reduction, and that smart pin assignment can exploit uneven input-vector switching activities to significantly decrease energy at reduced precisions.
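The key idea can be illustrated in software. TensorFloat32 keeps FP32's sign and 8-bit exponent but only 10 of its 23 mantissa bits, so a TF32 operand is an FP32 number whose 13 low mantissa bits are zero; when two such operands enter an FP32 multiplier, the low partial-product columns see no switching activity, which is where the energy saving comes from. The sketch below (a software emulation only, not the paper's hardware; `tf32_truncate` is a hypothetical helper name, and it truncates rather than rounds) shows the bit manipulation:

```python
import struct

def tf32_truncate(x: float) -> float:
    """Reduce an FP32 value to TensorFloat32 precision by zeroing the
    13 low mantissa bits (TF32 keeps 10 of FP32's 23 mantissa bits).
    Real hardware may round-to-nearest instead of truncating."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # FP32 bit pattern
    bits &= ~((1 << 13) - 1)                             # clear 13 low mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Both multiplier inputs now carry zeros in their low mantissa bits,
# so the corresponding partial products in an FP32 multiplier stay idle.
a = tf32_truncate(1.2345678)
b = tf32_truncate(3.1415926)
product = a * b
```

Values already representable in TF32 (e.g. 1.0, with an all-zero mantissa) pass through unchanged, while the truncation error for values near 1.0 is bounded by one TF32 unit in the last place, about 2^-10.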

Author

Per Larsson-Edefors

Chalmers, Microtechnology and Nanoscience, Microwave Electronics

IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC

2324-8432 (ISSN), 2324-8440 (eISSN)

IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), Chile

classIC - Chalmers Lund Center for Advanced Semiconductor System Design

Swedish Foundation for Strategic Research (SSF) (CSS22-0003), 2023-06-01 -- 2029-05-31.

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2025)

Embedded Systems

Computer Systems

More information

Created

2025-10-06