Efficient Implementation and Analysis of CMOS Arithmetic Circuits
Doctoral thesis, 2003
Judging by the latter part of the last century, it is not hard to imagine that the demand for increased data-processing capability will continue for the foreseeable future. In addition, public demand will grow for versatile mobile devices, such as a combined computer, phone, and camera, that offer many hours of operation. In such a world, a deep understanding of speed-limiting and power-dissipating phenomena will be crucial for meeting design targets within reasonable time (= money). Designers of these future circuits will face many implementation choices, so means of guiding them to fast and correct choices should be sought.
Arithmetic circuits are power-hungry and performance-limiting in applications such as microprocessors and DSPs. Improving the arithmetic circuits can therefore significantly improve overall performance. This thesis presents efficient implementations of CMOS adders and multipliers as well as an investigation of the voltage scaling of glitches in future CMOS process technologies. An extension of the logical-effort delay model (an example of a guideline for the designer) to include the gate-input-to-output transition of pass transistors in complementary pass-transistor logic circuits is also suggested.
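As background for the logical-effort model mentioned above, the following is a minimal sketch of the standard textbook formulation only; it does not reproduce the pass-transistor extension proposed in the thesis. The function names are hypothetical, and the inverter values (logical effort 1, parasitic delay 1) are the usual normalized textbook assumptions, not figures from this work.

```python
# Sketch of the basic logical-effort delay model (normalized units).
# Assumption: textbook form d = g*h + p; names below are illustrative only.

def stage_delay(g, h, p):
    """Delay of one stage: logical effort g, electrical effort h, parasitic delay p."""
    return g * h + p

def min_path_delay(logical_efforts, parasitics, path_electrical_effort, branching=1.0):
    """Minimum delay of an N-stage path when every stage bears equal effort."""
    n = len(logical_efforts)
    path_logical_effort = 1.0
    for g in logical_efforts:
        path_logical_effort *= g
    # Path effort F = G * B * H; the optimum stage effort is F ** (1/N).
    path_effort = path_logical_effort * branching * path_electrical_effort
    return n * path_effort ** (1.0 / n) + sum(parasitics)

# Three inverters (g = 1, p = 1 each) driving a load 64 times the input capacitance.
three_inverters = min_path_delay([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], 64.0)
```

In this example the optimum stage effort is 4, so the path delay is 3 * 4 + 3 = 15 normalized delay units.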
Three efficient adder implementations are proposed. The first is a Manchester adder in which repeaters and an optimized bypass structure are employed to obtain an area- and energy-efficient design. The second is an efficient implementation of the dot-operator cells of parallel-prefix adders. The third is, strictly speaking, an accumulator for a special application in which lower power dissipation can be achieved at the cost of extra noise.
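For readers unfamiliar with the dot operator of parallel-prefix adders, the following behavioral sketch shows what the cell computes at the logic level. It says nothing about the circuit-level implementation proposed in the thesis. The Kogge-Stone arrangement and the names `dot` and `prefix_add` are assumptions chosen for illustration.

```python
# Behavioral sketch of the dot operator in a parallel-prefix adder.
# Assumption: a Kogge-Stone prefix network; names are illustrative only.

def dot(upper, lower):
    """Combine (generate, propagate) of an upper bit group with the adjacent lower group."""
    g_up, p_up = upper
    g_lo, p_lo = lower
    return (g_up | (p_up & g_lo), p_up & p_lo)

def prefix_add(a, b, width=8):
    """Add two unsigned integers of the given bit width using a prefix network."""
    # Bitwise generate (a AND b) and propagate (a XOR b) signals.
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(width)]
    propagate = [p for _, p in gp]
    # Each level combines groups that are `dist` bit positions apart.
    dist = 1
    while dist < width:
        gp = [dot(gp[i], gp[i - dist]) if i >= dist else gp[i] for i in range(width)]
        dist *= 2
    # The group generate of bits i..0 is the carry out of bit i (carry-in is 0).
    carries = [0] + [g for g, _ in gp]
    total = carries[width] << width
    for i in range(width):
        total |= (propagate[i] ^ carries[i]) << i
    return total
```

The operator is associative, which is what allows the carries to be computed in a logarithmic number of levels rather than by rippling through every bit position.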
A comprehensive comparison of ten different multipliers has been conducted in terms of delay, power, area, and design effort. Among these ten multipliers were new structures, e.g. the RRT (which facilitates the use of bypass circuitry throughout the reduction tree) and a modified Dadda multiplier. Well-known multipliers such as array, overturned-stairs, and TDM multipliers were also included in the comparison. To find critical delays in the multipliers, a new way of finding bad-case test vectors was suggested and used.