The canonical binary adder is a cascade of elements known as
*full adders,* so called to distinguish them from their constituent
*half adders*. Each full adder is a circuit with three inputs and
two outputs. Customarily two of the inputs are connected to the two
input bits of a particular weight. The two outputs give the unsigned
representation of the sum of the inputs. The low-order output becomes
the adder ouput bit of the corresponding weight and the high-order
(carry out) output is fed back to the third (carry in) input of the
next full adder in the cascade.

Low-latency adders use a variety of techniques to compute the
carry bits more quickly than a cascade of full adders could, but
when the result is going to go through further sum reduction there
is no pressing need to compute the high-order bits fully. Breaking
the feedback path results in a set of independent full adders in
place of an integrated binary adder. Each full adder can quickly
reduce up to three input bits to two output bits
*representing the same quantity*, passing the
result to the next stage.

While it is not hard to see that a pipeline stage of independent full adders preserves the interpretation value of the vector passing through it, it is not obvious that the process terminates in a single binary number without special arrangements. This turns out to be the case for a generalized full adder; it is possible to clock the entire pipeline at a rate limited only by the delay in a full adder.

The generalization that is needed is to allow control inputs to mask the arithmetic inputs -- the generalized full adder computes a sum of the form where the are ones if not physically connected. For the remainder of this paper we assume as our unit of hardware cost a logic cell that is a slight simplification of the Xilinx FPGA logic cell. Each cell computes an arbitrary two-bit function of up to four input bits and optionally latches the result. The generalized full adder, constrained to a total of four inputs, maps directly onto the simplified Xilinx logic cell.

The initial AND stage of the sum-of-products pipeline is obviously
accomodated by the generalized full adder model. The control signals
are also essential for the feedback arrangement, where they are used
to implement rules corresponding to that for *k*<0
in equation (9).
It is occasionally useful to allow inputs of different weights as well; as
long as the total value of the inputs cannot exceed three times the
base weight for the cell this is valid.

Mon Dec 11 17:02:42 CST 2000