
Program Systems: Theory and Applications, 2015 Volume 6, Issue 4, Pages 227–241 (Mi ps198)

This article is cited in 1 paper

Hardware and Software for Supercomputers

Fused multiply-adders using in vector dataflow processor

N. I. Dikarev, B. M. Shabanov, A. S. Shmelev

Joint Super Computer Center

Abstract: A dataflow processor can issue up to 16 instructions per clock, in contrast to 4–6 instructions per clock for the best von Neumann processor designs. Simulation of our vector dataflow processor shows that matrix multiplication performance reaches 256 flops per clock at an issue rate of fewer than eight instructions per clock, and that near-peak performance is sustained on much smaller matrix dimensions than on a traditional processor. The advantages and disadvantages of using floating-point fused multiply-add execution units in our vector dataflow processor design are also analyzed. (In Russian).

Key words and phrases: supercomputer; vector processor; dataflow architecture; performance evaluation; fine-grained parallelism; fused multiply-adders.

UDC: 004.27

Received: 16.11.2015
Accepted: 07.12.2015


