Abstract:
Computational intensity is traditionally the focus of large-scale computing system designs, generally leaving such designs ill-equipped to efficiently handle throughput-oriented workloads. In addition, cost and energy consumption considerations for large-scale computing systems in general remain a source of concern. A potential solution involves using low-cost, low-power ARM processors in large arrays in a manner which provides massive parallelisation and high rates of data throughput (relative to existing large-scale computing designs). Giving greater priority to both throughput-rate and cost considerations increases the relevance of primary memory performance and design optimisations to overall system performance. Using several primary memory performance benchmarks to evaluate various aspects of RAM and cache performance, we provide characterisations of the performances of four different models of ARM-based system-on-chip, namely the Cortex-A9, Cortex-A7, Cortex-A15 r3p2 and Cortex-A15 r3p3. We then discuss the relevance of these results to high volume computing and the potential for ARM processors.