Researchers investigate Tenstorrent Grayskull performance and efficiency
"In this manuscript, we evaluated the performance and efficiency of the Tenstorrent Grayskull e75 RISC-V accelerator in executing matrix-matrix multiplication (MatMul), a fundamental operation in deep learning. Our analysis characterized its execution model, revealing significant differences between initial and subsequent runs due to compilation and data movement overheads. We examined the impact of processor grid size, matrix dimensions, data formats and numerical fidelity on computational performance. The results demonstrate that Grayskull achieves competitive performance in terms of TFLOPs per Watt relative to SoA architectures, such as two NVIDIA GPUs (A100 and V100) and an Intel Sapphire Rapids processor. Whilst GPUs deliver higher raw throughput, Grayskull provides a promising alternative with a strong balance between performance and energy efficiency."
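For readers unfamiliar with the efficiency metric the authors cite, the following is a minimal sketch of how TFLOPs per Watt is commonly derived for a MatMul, assuming the usual count of 2·M·N·K floating-point operations for an (M×K) by (K×N) product. The function names and the timing and power figures are hypothetical and chosen only for illustration; they are not measurements or methods taken from the manuscript.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """Operation count for an (m x k) @ (k x n) product.

    Each of the m*n output elements needs k multiplies and k adds,
    which gives the conventional 2*m*n*k figure.
    """
    return 2 * m * n * k


def tflops_per_watt(m: int, n: int, k: int, seconds: float, watts: float) -> float:
    """Throughput in TFLOP/s divided by average power draw in watts."""
    tflops = matmul_flops(m, n, k) / seconds / 1e12
    return tflops / watts


# Hypothetical inputs for illustration only (not results from the paper):
# a 1024 x 1024 x 1024 MatMul taking 1 ms at an average draw of 50 W.
example = tflops_per_watt(1024, 1024, 1024, seconds=1e-3, watts=50.0)
print(f"{example:.4f} TFLOP/s per watt")
```

Because the abstract reports that first runs differ markedly from later ones owing to compilation and data-movement overheads, a figure like this would normally be computed from steady-state timings rather than from the initial run.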