Partitioned Add

the principle described here also applies to subtract and negate (-)

the basic principle is: the partition bits, when inverted, can be inserted into an (expanded) add, and, if the inserted bit is set (i.e. there is no partition at that point), it has the side-effect of "rolling through" the carry bit of the MSB from the previous partition.

this is a really neat trick, basically, that allows the use of a straight "add" (DSP in an FPGA, add in a simulator) where the alternative would be extraordinarily complex, CPU-intensive and take up a large amount of resources.

partition:     P    P    P     (3 bits)
a        : .... .... .... .... (32 bits)
b        : .... .... .... .... (32 bits)
exp-a    : ....P....P....P.... (32+3 bits, P=0 if no partition)
exp-b    : ....0....0....0.... (32 bits plus 3 zeros)
exp-o    : ....xN...xN...xN... (32+3 bits - x to be discarded)
o        : .... N... N... N... (32 bits - x ignored, N is carry-over)
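
the layout above can be sketched in plain python integers. this is only an illustration, under assumptions that are not in the original: the function name `partitioned_add` is hypothetical, lanes are 8 bits wide, and partition bit i (bit 0 lowest) sits between lane i and lane i+1. the bit inserted into exp-a is the inverted partition bit, and exp-b gets a zero in the same place.

```python
def partitioned_add(a, b, partition, lane_bits=8, lanes=4):
    """Add a and b lane by lane using one wide integer add.

    Assumption: partition bit i set means "break the carry chain
    between lane i and lane i+1".
    """
    mask = (1 << lane_bits) - 1
    stride = lane_bits + 1          # each lane is followed by one extra bit
    exp_a = 0
    exp_b = 0
    for i in range(lanes):
        shift = i * stride
        exp_a |= ((a >> (i * lane_bits)) & mask) << shift
        exp_b |= ((b >> (i * lane_bits)) & mask) << shift
        if i < lanes - 1:
            # inverted partition bit: 1 lets the lane's carry roll through,
            # 0 absorbs it (the matching bit of exp_b is left at 0)
            roll = ((partition >> i) & 1) ^ 1
            exp_a |= roll << (shift + lane_bits)
    exp_o = exp_a + exp_b           # the single "straight" add
    out = 0
    for i in range(lanes):
        # pick the lanes back out, discarding the extra (x) bits
        out |= ((exp_o >> (i * stride)) & mask) << (i * lane_bits)
    return out
```

with no partition bits set this behaves as one 32-bit add, e.g. `partitioned_add(0x00FFFFFF, 1, 0b000)` gives `0x01000000`; with all three set it behaves as four separate 8-bit adds and the same inputs give `0x00FFFF00`.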

new version:

partition:      P    P    P       (3 bits)
carry-in :      c    c    c    c (4 bits)
C = c & P:      C    C    C    c (4 bits)
I = P=>c :      I    I    I    c (4 bits)
a        :  AAAA AAAA AAAA AAAA  (32 bits)
b        :  BBBB BBBB BBBB BBBB  (32 bits)
exp-a    : 0AAAACAAAACAAAACAAAAc (32+3+2 bits, C=0 if no partition)
exp-b    : 0BBBBIBBBBIBBBBIBBBBc (32+3+2 bits, I=1 if no partition)
exp-o    : o....oN...oN...oN...x (32+3+2 bits - x to be discarded)
o        :  .... N... N... N...  (32 bits - x ignored, N is carry-over)
carry-out: o    o    o    o      (4 bits)

the new version

  • brings in the carry-in (c) bits which are ANDed with the partition bits (P) to create "C = c & P".
  • the carry-in is positioned twice (in both the A and B intermediates, as C and I respectively), which has the effect of preserving carry-out, yet only performing a carry-over if the carry-in bit (c) is set and this is part of a partition
  • o (carry-out) must be "cascaded" down to the relevant partition start-point. this can be done with a Mux-cascade.
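
the new layout can be sketched the same way. again this is an illustration under assumptions, not the actual implementation: `partitioned_add_carry` is a hypothetical name, lanes are 8 bits, partition bit i sits between lane i and lane i+1, carry-in bit i belongs to lane i, and "P=>c" is read as logical implication (not-P or c). the returned carry bits are the raw "o" bits of exp-o, before any cascade, so only the top bit and the bits at real partition points are meaningful.

```python
def partitioned_add_carry(a, b, partition, carry_in, lane_bits=8, lanes=4):
    """Partitioned add with per-lane carry-in and raw per-lane carry-out.

    Returns (o, carry_tmp). carry_tmp bit i is the bit just above lane i
    in exp-o; it is only a valid carry-out where a partition ends there.
    """
    mask = (1 << lane_bits) - 1
    stride = lane_bits + 1
    # lowest extra bit: c in both operands, so c + c carries c into lane 0
    c0 = carry_in & 1
    exp_a = c0
    exp_b = c0
    for i in range(lanes):
        base = 1 + i * stride
        exp_a |= ((a >> (i * lane_bits)) & mask) << base
        exp_b |= ((b >> (i * lane_bits)) & mask) << base
        if i < lanes - 1:
            p = (partition >> i) & 1
            c = (carry_in >> (i + 1)) & 1   # carry-in for the lane above
            big_c = c & p                   # C = c & P
            big_i = (p ^ 1) | c             # I = P => c
            exp_a |= big_c << (base + lane_bits)
            exp_b |= big_i << (base + lane_bits)
    exp_o = exp_a + exp_b                   # top bit of both operands is 0
    out = 0
    carry_tmp = 0
    for i in range(lanes):
        base = 1 + i * stride
        out |= ((exp_o >> base) & mask) << (i * lane_bits)
        carry_tmp |= ((exp_o >> (base + lane_bits)) & 1) << i
    return out, carry_tmp
```

subtract falls out of this directly: invert b and set every carry-in bit, so each lane computes a + ~b + 1. for example `partitioned_add_carry(0x05050505, ~0x01020304 & 0xFFFFFFFF, 0b111, 0b1111)` gives `(0x04030201, 0b1111)`.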

carry-out-cascade example:

partition:      1    0    0    1      (4 bits)
actual   : <--->|<------------>|<---> actual numbers
carryotmp: o4   o3   o2   o1   o0     (5 bits)
cascade  : |    |    x    x    |      o2 and o1 ignored
carry-out: o4   \->  -->  o3   o0     (5 bits)

because the partitions subdivide the 5-wide input (five 8-bit lanes, 40 bits) into 8-24-8, o4 is already in "both" the MSB-and-LSB position for the top 8-bit result; o3 is the carry-out for the 24-bit result and must be cascaded down to the beginning of the 24-bit partitioned result (the LSB), and o0, like o4, is already in position because its partition is only 1 lane wide.
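
the selection that the cascade performs can be sketched as a simple loop. this is a software illustration only, not a description of the Mux-cascade hardware; `cascade_carry_out` is a hypothetical name, partition bit i is assumed to sit between lane i and lane i+1, and the positions that the cascade passes through are simply left at zero here.

```python
def cascade_carry_out(carry_tmp, partition, lanes):
    """Move each group's carry-out down to the group's lowest lane.

    carry_tmp bit i is the raw carry bit found above lane i.
    Assumption: unused positions in the result are left as 0.
    """
    out = 0
    start = 0                       # lowest lane of the current group
    for i in range(lanes):
        # lane i is the top of a group if a partition sits just above it,
        # or if it is the topmost lane of all
        is_top = (i == lanes - 1) or ((partition >> i) & 1) == 1
        if is_top:
            out |= ((carry_tmp >> i) & 1) << start
            start = i + 1           # the next group begins above this lane
    return out
```

for the 8-24-8 example (partition 0b1001, 5 lanes), a raw value with only o3 set, `cascade_carry_out(0b01000, 0b1001, 5)`, gives `0b00010`: o3 has moved down to the LSB position of the 24-bit result, while o2 and o1 are ignored.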