In mesh-based many-core architectures, processor cores and memories reside in different locations (center, corner, edge, etc.), so memory accesses experience different latencies depending on their communication distances. This latency difference makes memory access unfair and leaves some accesses with very high latencies, degrading system performance. However, improving the latency of one memory access can worsen the latency of another, since memory accesses contend in the network. The goal should therefore be memory access fairness: balancing the latencies of memory accesses while keeping the average latency low. In this paper, we address this goal by predicting the round-trip latencies of memory-access-related packets and using the predicted latencies to prioritize the packets. We design a router that supports fair memory access and report its hardware cost. Experiments across a variety of network sizes and packet injection rates show that our approach outperforms classic round-robin arbitration in terms of average latency and latency standard deviation (LSD). In the experiments, the maximum improvements in average latency and LSD are 16% and 48%, respectively.
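The idea of prioritizing packets by predicted round-trip latency can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Packet` fields, the linear prediction model (elapsed cycles plus remaining hops times a fixed per-hop cost), the `cycles_per_hop` value, and the policy of granting the packet with the largest predicted latency. The helper `latency_stats` computes the two metrics the abstract reports, average latency and the standard deviation of latency.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Packet:
    packet_id: int
    elapsed_cycles: int   # cycles already spent in the network (hypothetical field)
    remaining_hops: int   # hops left on the request and reply path (hypothetical field)


def predict_round_trip(pkt, cycles_per_hop=3):
    """Assumed linear model: time already spent plus a fixed cost per remaining hop."""
    return pkt.elapsed_cycles + pkt.remaining_hops * cycles_per_hop


def arbitrate(contenders, cycles_per_hop=3):
    """Grant the contender with the largest predicted round-trip latency.

    Ties are broken in favor of the lower packet_id. This policy is an
    assumption for illustration; a round-robin arbiter would ignore the
    prediction and rotate among the inputs instead.
    """
    return max(
        contenders,
        key=lambda p: (predict_round_trip(p, cycles_per_hop), -p.packet_id),
    )


def latency_stats(latencies):
    """Return (average latency, population standard deviation of latency)."""
    return mean(latencies), pstdev(latencies)


if __name__ == "__main__":
    contenders = [Packet(0, 10, 2), Packet(1, 4, 6), Packet(2, 12, 1)]
    winner = arbitrate(contenders)
    print("granted packet:", winner.packet_id)
    print("stats:", latency_stats([10, 20, 30]))
```

Under these assumptions, a packet that still has a long path ahead is served before one that is nearly done, which tends to pull the longest latencies toward the mean and so reduces the latency spread.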