Performance gain from data and control dependency elimination in embedded processors

This paper presents a way of increasing overall performance in embedded processors by introducing a multithreading interleaved execution model that can be applied to any Instruction Set Architecture. Usual acceleration techniques as superpipeline or branch prediction are not suited for embedded machines due to their inherent inefficiency. We will show that by removing dependencies within a processor and thus eliminating the need for extra hardware required for keeping the overall coherence, there will be a noticeable increase in performance (up to 450%) and also a decrease in size and power consumption. Also, this approach will maintain the backwards compatibility with the software legacy in order to keep the software changes to a minimum.