We have developed an embedded processor that supports asymmetric multiprocessing (AMP), symmetric multiprocessing (SMP), and an AMP/SMP hybrid system. It contains four SH-X3 cores, which extend the SH-X2 core with support for cache coherency. In this paper, we evaluate three techniques that improve processing performance and reduce power consumption during parallel processing on this processor. The first technique is a snoop controller (SNC) that improves cache coherency performance. The performance overhead due to snooping is reduced to as little as 0.1% when SPLASH-2 is executed. The second technique is hardware detection and resolution of synonym problems, so that page coloring is not needed for page table management. Process handling time in Linux is reduced by 29.4% compared with resolving the problem in software. The third technique combines individual core clock frequency control with a light sleep mode, which maintains cache coherency even when cores are stopped, to reduce power consumption. These reduce energy by 5.2% and 4.5%, respectively. As a result, the SH-X3 core achieves performance that scales at 0.72-0.93 times the number of cores and a power saving of 4.5-44.0% without increasing execution time.
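As a rough illustration of the scalability figure, the reported range can be read as a linear speedup model. The symbols S, k, and n below are introduced here for illustration and do not appear in the original text; treating the relation as linear in the core count is an interpretation of the stated result, not a formula given by the authors.

```latex
% Assumed notation: n = number of cores, S(n) = speedup over one core,
% k = scalability factor reported as 0.72-0.93.
\[
  S(n) \approx k\,n, \qquad 0.72 \le k \le 0.93
\]
% For the four-core configuration described above (n = 4):
\[
  0.72 \times 4 = 2.88 \;\le\; S(4) \;\le\; 0.93 \times 4 = 3.72
\]
```

Under this reading, the four-core processor would run between roughly 2.88 and 3.72 times faster than a single core, depending on the workload.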