A new approach to the design of optimal parallel prefix circuits
Parallel prefix is one of the fundamental algorithms in computer science. Parallel prefix networks are used to compute carries in fast addition circuits, and have a number of other applications, including the computation of linear recurrences and loop parallelization. A new construction, called Slices, for fan-out-constrained depth size optimal (DSO) parallel prefix circuits is presented. The construction encompasses the largest possible number of inputs for given depth and fan-out. It improves on previous approaches that produce DSO networks with constrained fan-out by encompassing more inputs for a given depth. Even when compared with parallel prefix circuits with unbounded fan-out, the construction provides a new family of circuits that are both small and reasonably shallow. We present the construction, which is composed of recursively defined blocks, and derive a recurrence for the maximum number of inputs that can be processed for a given fan-out and depth.
We also show how a DSO network built according to our construction can be cropped, to produce a new DSO network with the same depth and fan-out, but fewer inputs. Thus, we can produce a DSO network for given depth, fan-out and number of inputs, provided such a network exists. We believe that we are the first to be able to do this. The resulting networks are compared to others with both bounded and unbounded fan-out.
depth size optimal parallel prefix circuits
parallel prefix networks