Program Systems: Theory and Applications, 2019, Volume 10, Issue 4, Pages 201–217 (Mi ps359)

This article is cited in 2 papers

Hardware, software and distributed supercomputer systems

Fine-grained parallelism and higher core performance: advantages of vector dataflow processor

N. I. Dikarev, B. M. Shabanov, A. S. Shmelev

Joint Supercomputer Center of RAS

Abstract: The reserves for increasing the performance of modern processors are almost exhausted. This stagnation is evidenced by the lack of growth in both the clock frequency and the number of instructions executed per clock, which together determine the scalar performance of a processor core. In the vector dataflow processor under development, core performance is projected to reach up to 256 flops per clock, eight times higher than that of the latest Intel Xeon processors, owing to a higher fraction of vector execution. We show that the vector dataflow processor achieves a higher ratio of real to peak performance than the best traditional-architecture processors on programs such as bitonic sorting, matrix multiplication, and 2D stencil.
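
The peak-throughput figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the values 256 flops per clock and the factor of eight come from the abstract, while the breakdown of the Xeon baseline into two FMA units with eight double-precision lanes each is an assumption about an AVX-512 core, not something the abstract states. The function names are hypothetical.

```python
# Peak-throughput arithmetic behind the abstract's "eight times" claim.
# Values taken from the abstract:
VDF_FLOPS_PER_CLOCK = 256   # vector dataflow processor core, flops per clock
CLAIMED_RATIO = 8           # stated advantage over the latest Intel Xeon cores


def implied_baseline(flops_per_clock: int, ratio: int) -> float:
    """Per-core flops per clock of the baseline implied by the claimed ratio."""
    return flops_per_clock / ratio


def simd_peak(fma_units: int, lanes: int) -> int:
    """Peak flops per clock of a SIMD core; one fused multiply-add = 2 flops."""
    return fma_units * lanes * 2


def efficiency(real_flops: float, peak_flops: float) -> float:
    """Ratio of real (measured) performance to peak performance."""
    return real_flops / peak_flops


baseline = implied_baseline(VDF_FLOPS_PER_CLOCK, CLAIMED_RATIO)

# ASSUMPTION (not from the abstract): 2 FMA units x 8 double-precision lanes,
# as in a 512-bit-wide SIMD core.
assumed_xeon_peak = simd_peak(fma_units=2, lanes=8)

print(f"baseline implied by the abstract: {baseline:.0f} flops/clock")
print(f"assumed AVX-512 core peak:        {assumed_xeon_peak} flops/clock")
```

Under that assumption the two numbers agree, which is consistent with the abstract's factor of eight. The `efficiency` helper only expresses the "real performance to peak" ratio the abstract compares; no measured values are supplied here.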

Key words and phrases: vector processor, dataflow architecture, shared-memory multiprocessor, performance evaluation.

UDC: 004.272.25:004.272.44
BBK: З971.32-043:22.151.511

MSC: Primary 65Y05; Secondary 68Q10, 08-04

Received: 19.11.2019
Accepted: 26.12.2019

DOI: 10.25209/2079-3316-2019-10-4-201-217


