Rate-optimal fully-static multiprocessor scheduling of data-flow signal processing programs

The authors introduce the notion of a perfect-rate data-flow program and show that such programs can always be executed in minimum time without any unfolding or retiming. They show that unfolding a data-flow program beyond a certain factor yields no further reduction in execution time. This optimum unfolding factor is the least common multiple of the loop delay counts in the data-flow program graph. The authors show that unfolding by the optimum unfolding factor reduces any iterative data-flow program to an equivalent perfect-rate data-flow program. They also obtain upper bounds on the number of processors needed to achieve minimum-time schedules.
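
As an illustration of the least-common-multiple rule stated in the abstract, the following minimal sketch computes an optimum unfolding factor from a list of loop delay counts. The function name `optimum_unfolding_factor` and the plain-list input format are assumptions made for this example and do not come from the paper; enumerating the loops of the data-flow graph and counting their delays is assumed to have been done already.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b


def optimum_unfolding_factor(loop_delay_counts):
    """Return the LCM of the delay counts of the loops in a data-flow graph.

    loop_delay_counts: one positive integer per loop (hypothetical input
    format; the loops are assumed to have been enumerated beforehand).
    """
    counts = list(loop_delay_counts)
    if not counts:
        raise ValueError("at least one loop delay count is required")
    if any(d <= 0 for d in counts):
        raise ValueError("loop delay counts must be positive integers")
    return reduce(lcm, counts)


# Hypothetical graph with three loops holding 2, 3, and 4 delays.
print(optimum_unfolding_factor([2, 3, 4]))  # 12
```

In this hypothetical case the factor is 12, so by the result summarized above, unfolding by more than 12 would not reduce the execution time further.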