The performance gap between the memory subsystem and high-performance processors keeps widening. Prefetching is one method of bridging this gap. It has been proposed for both array-based and pointer-based applications, typically as a software technique applied with the help of the compiler. Prefetching has disadvantages, however: it increases memory traffic, increases the number of executed instructions, and in some cases increases memory requirements. In this paper, we investigate software caching for applications that perform searches or sorted insertions. For data structures larger than the processor data cache, such a search or sorted insert may incur multiple cache misses before the correct value is found. In the software caching approach, a small software buffer records the most recently added values (the values that are later searched for) along with their addresses, and this buffer is consulted during a search or an insert. We present the results of our initial experiments. For applications involving a search, software caching performs up to 30% better than the original application. It also executes up to 14% fewer instructions, and the number of cache accesses decreases by up to about 18%. Compared against software prefetching, the performance improvement was as high as 40% in some cases. The technique has the added advantage of using less memory than prefetching; all of the improvements were obtained with very small software caches. These initial results encourage a more in-depth study of this technique.
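The buffer described above can be illustrated with a minimal C sketch. This is not the implementation evaluated in the paper; it is one possible reading of the idea under stated assumptions: the data structure is a sorted singly linked list, nodes are never deleted (so cached addresses stay valid), and the buffer uses round-robin replacement. All names (`SWCACHE_SIZE`, `List`, `cache_start`, `list_insert`, `list_search`) and the buffer size of 8 are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SWCACHE_SIZE 8   /* hypothetical; the abstract only says "very small" */

typedef struct Node {
    int key;
    struct Node *next;
} Node;

/* One buffer entry: a recently added value and the address of its node. */
typedef struct {
    int key;
    Node *addr;
} CacheEntry;

typedef struct {
    Node head;                       /* sentinel; its key is never compared */
    CacheEntry cache[SWCACHE_SIZE];  /* the software cache */
    int used;                        /* number of valid entries */
    int next_slot;                   /* round-robin replacement cursor */
} List;

static void list_init(List *l) {
    l->head.key = 0;
    l->head.next = NULL;
    l->used = 0;
    l->next_slot = 0;
}

/* Record a newly added node, overwriting the oldest entry when full. */
static void cache_record(List *l, Node *n) {
    assert(n != NULL);
    l->cache[l->next_slot].key = n->key;
    l->cache[l->next_slot].addr = n;
    l->next_slot = (l->next_slot + 1) % SWCACHE_SIZE;
    if (l->used < SWCACHE_SIZE) l->used++;
}

/* Consult the buffer: return the cached node with the largest key <= key,
 * or the sentinel if no cached key qualifies. Only the small buffer is
 * read here, not the list nodes themselves. */
static Node *cache_start(List *l, int key) {
    Node *best = &l->head;
    int best_key = 0, found = 0;
    for (int i = 0; i < l->used; i++) {
        int k = l->cache[i].key;
        if (k <= key && (!found || k > best_key)) {
            best = l->cache[i].addr;
            best_key = k;
            found = 1;
        }
    }
    return best;
}

/* Sorted insert that starts walking from the buffer's hint, not the head. */
static Node *list_insert(List *l, int key) {
    Node *prev = cache_start(l, key);
    while (prev->next != NULL && prev->next->key < key)
        prev = prev->next;
    Node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->key = key;
    n->next = prev->next;
    prev->next = n;
    cache_record(l, n);
    return n;
}

/* Search: an exact hit in the buffer returns immediately; otherwise the
 * walk starts from the nearest cached predecessor. */
static Node *list_search(List *l, int key) {
    Node *cur = cache_start(l, key);
    if (cur != &l->head && cur->key == key) return cur;
    for (cur = cur->next; cur != NULL && cur->key <= key; cur = cur->next)
        if (cur->key == key) return cur;
    return NULL;
}
```

In this sketch a buffer hit replaces a traversal from the head of the list with a scan over a handful of contiguous entries, which is where the reduction in cache misses would be expected to come from. Supporting deletion would additionally require invalidating any buffer entry that points at a freed node.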