High performance VLSI modules for division and square root

Abstract: This paper presents a new fully parallel circuit for square root extraction based on a modified nonrestoring algorithm. The modifications eliminate the need for auxiliary circuitry to detect exceptions such as a zero partial remainder. A combined division/square root circuit that incurs no latency cycles when the operation mode changes is also proposed. Both architectures are structured as pipelined cellular arrays that use carry-select adders to improve performance. Because non-redundant arithmetic is used, no additional conversion circuitry is required. The achievable performance makes the proposed architectures suitable for high-speed digital signal processors.
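
For orientation, the following is a minimal software sketch of the classic, unmodified nonrestoring integer square root recurrence. It is not the modified algorithm or the circuit proposed in this paper; it only illustrates the baseline digit recurrence, in which each step performs one addition or one subtraction depending on the sign of the partial remainder. The function name `nonrestoring_isqrt` and the parameter `n_bits` are illustrative assumptions, not names taken from the paper.

```python
def nonrestoring_isqrt(d, n_bits):
    """Textbook nonrestoring square root (illustrative sketch).

    d      -- non-negative integer radicand, d < 2**n_bits
    n_bits -- even bit width of the radicand (hypothetical parameter)
    Returns (q, r) with q = floor(sqrt(d)) and r = d - q*q.
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even")
    if not 0 <= d < (1 << n_bits):
        raise ValueError("d out of range for n_bits")

    q = 0  # root bits produced so far
    r = 0  # partial remainder; allowed to become negative
    for i in range(n_bits // 2 - 1, -1, -1):
        pair = (d >> (2 * i)) & 3  # next two radicand bits
        if r >= 0:
            # Non-negative remainder: subtract the trial value 4q + 1.
            r = 4 * r + pair - (4 * q + 1)
        else:
            # Negative remainder: add 4q + 3 instead of restoring first.
            r = 4 * r + pair + (4 * q + 3)
        # The sign of the new remainder selects the next root bit.
        q = 2 * q + 1 if r >= 0 else 2 * q

    # A single final correction recovers the true remainder if needed.
    if r < 0:
        r += 2 * q + 1
    return q, r
```

For example, `nonrestoring_isqrt(10, 4)` returns `(3, 1)`, since 3 * 3 + 1 = 10. Each iteration of the loop corresponds loosely to one row of an adder/subtractor array in a hardware realization; how the proposed circuit maps the recurrence onto pipelined cells and carry-select adders is not described by this sketch.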