The canonical binary adder is a cascade of elements known as full adders, so called to distinguish them from their constituent half adders. Each full adder is a circuit with three inputs and two outputs. Customarily two of the inputs are connected to the two input bits of a particular weight. The two outputs give the unsigned representation of the sum of the inputs. The low-order output becomes the adder ouput bit of the corresponding weight and the high-order (carry out) output is fed back to the third (carry in) input of the next full adder in the cascade.
Low-latency adders use a variety of techniques to compute the carry bits more quickly than a cascade of full adders could, but when the result is going to go through further sum reduction there is no pressing need to compute the high-order bits fully. Breaking the feedback path results in a set of independent full adders in place of an integrated binary adder. Each full adder can quickly reduce up to three input bits to two output bits representing the same quantity, passing the result to the next stage.
While it is not hard to see that a pipeline stage of independent full adders preserves the interpretation value of the vector passing through it, it is not obvious that the process terminates in a single binary number without special arrangements. This turns out to be the case for a generalized full adder; it is possible to clock the entire pipeline at a rate limited only by the delay in a full adder.
The generalization that is needed is to allow control inputs to mask the arithmetic inputs -- the generalized full adder computes a sum of the form where the are ones if not physically connected. For the remainder of this paper we assume as our unit of hardware cost a logic cell that is a slight simplification of the Xilinx FPGA logic cell. Each cell computes an arbitrary two-bit function of up to four input bits and optionally latches the result. The generalized full adder, constrained to a total of four inputs, maps directly onto the simplified Xilinx logic cell.
The initial AND stage of the sum-of-products pipeline is obviously accomodated by the generalized full adder model. The control signals are also essential for the feedback arrangement, where they are used to implement rules corresponding to that for k<0 in equation (9). It is occasionally useful to allow inputs of different weights as well; as long as the total value of the inputs cannot exceed three times the base weight for the cell this is valid.