Re: pipelining
After thinking about carry prediction a bit more, I think I've hit on
the reasonably obvious (decent) implementation.
Consider a hypothetical 32 bit adder. Here we have two numbers (X and
Y) with bits from the two numbers paired and fed into full adders.
The carry from each full adder feeds into the next full adder up the
line. It takes 3 "gate delays" for a full adder to generate the
initial carry, and takes 2 "gate delays" for each successive carry.
For this 32 bit adder, you'll get C0..C31.
Now, start adding in carry prediction. In addition to the full
adders, you want each pair of bits (e.g. X3,Y3) to be fed into an OR
gate. I'll call these or'ed bits the P bits ('cause they're what the
carry prediction is named off of). [I'll make an exception for X0 and
Y0.]
Now, pair up the full adders and give each of these pairs a carry
predictor. Each of these predictors will have three inputs, and its
output will be or'd into the carry from some full adder. This
requires distinguishing between carry in and carry out on each of the
full adders. So, I've got Ci1..Ci32 and Co0..Co31.
Next, form groups of four (using the same raw inputs) and use five
input and gates to generate carry predictions. Next, groups of 8
bits together, groups of 16, then a single group of 32.
Here's a model of the algorithm implemented by the wiring.
\ random utils
: 3and and and ;
: 5and 3and 3and ;
: 9and 5and 5and ;
: 17and 9and 9and ;
: 3or or or ;
: 4or 3or or ;
: 5or 3or 3or ;
: 6or 5or or ;
\ raw data for carry prediction
: p1 x1 y1 or ;
: p2 x2 y3 or ;
: p3 x3 y3 or ;
: p4 x4 y4 or ;
: p5 x5 y5 or ;
: p6 x6 y6 or ;
: p7 x7 y7 or ;
: p8 x8 y8 or ;
: p9 x9 y9 or ;
: p10 x10 y10 or ;
: p11 x11 y11 or ;
: p12 x12 y12 or ;
: p13 x13 y13 or ;
: p14 x14 y14 or ;
: p15 x15 y15 or ;
: p16 x16 y16 or ;
: p17 x17 y17 or ;
: p18 x18 y18 or ;
: p19 x19 y19 or ;
: p20 x20 y20 or ;
: p21 x21 y21 or ;
: p22 x22 y22 or ;
: p23 x23 y23 or ;
: p24 x24 y24 or ;
: p25 x25 y25 or ;
: p26 x26 y26 or ;
: p27 x27 y27 or ;
: p28 x28 y28 or ;
: p29 x29 y29 or ;
: p30 x30 y30 or ;
: p31 x31 y31 or ;
\ representation of carry
variable carry1 : ci1 carry1 @ ;
variable carry2 : ci2 carry2 @ ;
variable carry3 : ci3 carry3 @ ;
variable carry4 : ci4 carry4 @ ;
variable carry5 : ci5 carry5 @ ;
variable carry6 : ci6 carry6 @ ;
variable carry7 : ci7 carry7 @ ;
variable carry8 : ci8 carry8 @ ;
variable carry9 : ci9 carry9 @ ;
variable carry10 : ci10 carry10 @ ;
variable carry11 : ci11 carry11 @ ;
variable carry12 : ci12 carry12 @ ;
variable carry13 : ci13 carry13 @ ;
variable carry14 : ci14 carry14 @ ;
variable carry15 : ci15 carry15 @ ;
variable carry16 : ci16 carry16 @ ;
variable carry17 : ci17 carry17 @ ;
variable carry18 : ci18 carry18 @ ;
variable carry19 : ci19 carry19 @ ;
variable carry20 : ci20 carry20 @ ;
variable carry21 : ci21 carry21 @ ;
variable carry22 : ci22 carry22 @ ;
variable carry23 : ci23 carry23 @ ;
variable carry24 : ci24 carry24 @ ;
variable carry25 : ci25 carry25 @ ;
variable carry26 : ci26 carry26 @ ;
variable carry27 : ci27 carry27 @ ;
variable carry28 : ci28 carry28 @ ;
variable carry29 : ci29 carry29 @ ;
variable carry30 : ci30 carry30 @ ;
variable carry31 : ci31 carry31 @ ;
variable carry32 : ci32 carry32 @ ;
\ full adders (use a half adder for first stage)
: z0 x0 y0 -or ;
: co0 x0 y0 and ;
: hz1 x1 y1 -or ;
: hc1 x1 y1 and ;
: z1 hz1 ci1 -or ;
: co1 hz1 y1 and hc1 -or ;
: hz2 x2 y2 -or ;
: hc2 x2 y2 and ;
: z2 hz2 ci2 -or ;
: co2 hz2 y2 and hc2 -or ;
: hz3 x3 y3 -or ;
: hc3 x3 y3 and ;
: z3 hz3 ci3 -or ;
: co3 hz3 y3 and hc3 -or ;
: hz4 x4 y4 -or ;
: hc4 x4 y4 and ;
: z4 hz4 ci4 -or ;
: co4 hz4 y4 and hc4 -or ;
: hz5 x5 y5 -or ;
: hc5 x5 y5 and ;
: z5 hz5 ci5 -or ;
: co5 hz5 y5 and hc5 -or ;
: hz6 x6 y6 -or ;
: hc6 x6 y6 and ;
: z6 hz6 ci6 -or ;
: co6 hz6 y6 and hc6 -or ;
: hz7 x7 y7 -or ;
: hc7 x7 y7 and ;
: z7 hz7 ci7 -or ;
: co7 hz7 y7 and hc7 -or ;
: hz8 x8 y8 -or ;
: hc8 x8 y8 and ;
: z8 hz8 ci8 -or ;
: co8 hz8 y8 and hc8 -or ;
: hz9 x9 y9 -or ;
: hc9 x9 y9 and ;
: z9 hz9 ci9 -or ;
: co9 hz9 y9 and hc9 -or ;
: hz10 x10 y10 -or ;
: hc10 x10 y10 and ;
: z10 hz10 ci10 -or ;
: co10 hz10 y10 and hc10 -or ;
: hz11 x11 y11 -or ;
: hc11 x11 y11 and ;
: z11 hz11 ci11 -or ;
: co11 hz11 y11 and hc11 -or ;
: hz12 x12 y12 -or ;
: hc12 x12 y12 and ;
: z12 hz12 ci12 -or ;
: co12 hz12 y12 and hc12 -or ;
: hz13 x13 y13 -or ;
: hc13 x13 y13 and ;
: z13 hz13 ci13 -or ;
: co13 hz13 y13 and hc13 -or ;
: hz14 x14 y14 -or ;
: hc14 x14 y14 and ;
: z14 hz14 ci14 -or ;
: co14 hz14 y14 and hc14 -or ;
: hz15 x15 y15 -or ;
: hc15 x15 y15 and ;
: z15 hz15 ci15 -or ;
: co15 hz15 y15 and hc15 -or ;
: hz16 x16 y16 -or ;
: hc16 x16 y16 and ;
: z16 hz16 ci16 -or ;
: co16 hz16 y16 and hc16 -or ;
: hz17 x17 y17 -or ;
: hc17 x17 y17 and ;
: z17 hz17 ci17 -or ;
: co17 hz17 y17 and hc17 -or ;
: hz18 x18 y18 -or ;
: hc18 x18 y18 and ;
: z18 hz18 ci18 -or ;
: co18 hz18 y18 and hc18 -or ;
: hz19 x19 y19 -or ;
: hc19 x19 y19 and ;
: z19 hz19 ci19 -or ;
: co19 hz19 y19 and hc19 -or ;
: hz20 x20 y20 -or ;
: hc20 x20 y20 and ;
: z20 hz20 ci20 -or ;
: co20 hz20 y20 and hc20 -or ;
: hz21 x21 y21 -or ;
: hc21 x21 y21 and ;
: z21 hz21 ci21 -or ;
: co21 hz21 y21 and hc21 -or ;
: hz22 x22 y22 -or ;
: hc22 x22 y22 and ;
: z22 hz22 ci22 -or ;
: co22 hz22 y22 and hc22 -or ;
: hz23 x23 y23 -or ;
: hc23 x23 y23 and ;
: z23 hz23 ci23 -or ;
: co23 hz23 y23 and hc23 -or ;
: hz24 x24 y24 -or ;
: hc24 x24 y24 and ;
: z24 hz24 ci24 -or ;
: co24 hz24 y24 and hc24 -or ;
: hz25 x25 y25 -or ;
: hc25 x25 y25 and ;
: z25 hz25 ci25 -or ;
: co25 hz25 y25 and hc25 -or ;
: hz26 x26 y26 -or ;
: hc26 x26 y26 and ;
: z26 hz26 ci26 -or ;
: co26 hz26 y26 and hc26 -or ;
: hz27 x27 y27 -or ;
: hc27 x27 y27 and ;
: z27 hz27 ci27 -or ;
: co27 hz27 y27 and hc27 -or ;
: hz28 x28 y28 -or ;
: hc28 x28 y28 and ;
: z28 hz28 ci28 -or ;
: co28 hz28 y28 and hc28 -or ;
: hz29 x29 y29 -or ;
: hc29 x29 y29 and ;
: z29 hz29 ci29 -or ;
: co29 hz29 y29 and hc29 -or ;
: hz30 x30 y30 -or ;
: hc30 x30 y30 and ;
: z30 hz30 ci30 -or ;
: co30 hz30 y30 and hc30 -or ;
: hz31 x31 y31 -or ;
: hc31 x31 y31 and ;
: z31 hz31 ci31 -or ;
: co31 hz31 y31 and hc31 -or ;
\ calculation of carry
\ predictor elements
\ groups of two
: p0..1 x0 y0 p1 3and ;
: p2..3 ci2 p2 p3 3and ;
: p4..5 ci4 p4 p5 3and ;
: p6..7 ci6 p6 p7 3and ;
: p8..9 ci8 p8 p9 3and ;
: p10..11 ci10 p10 p11 3and ;
: p12..13 ci12 p12 p13 3and ;
: p14..15 ci14 p14 p15 3and ;
: p16..17 ci16 p16 p17 3and ;
: p18..19 ci18 p18 p19 3and ;
: p20..21 ci20 p20 p21 3and ;
: p22..23 ci22 p22 p23 3and ;
: p24..25 ci24 p24 p25 3and ;
: p26..27 ci26 p26 p27 3and ;
: p28..29 ci28 p28 p29 3and ;
: p30..31 ci30 p30 p31 3and ;
\ groups of four
: p0..3 x0 y0 p1 p2 p3 5and ;
: p4..7 ci4 p4 p5 p6 p7 5and ;
: p8..11 ci8 p8 p9 p10 p11 5and ;
: p12..15 ci12 p12 p13 p14 p15 5and ;
: p16..19 ci16 p16 p17 p18 p19 5and ;
: p20..23 ci20 p20 p21 p22 p23 5and ;
: p24..27 ci24 p24 p25 p26 p27 5and ;
: p28..31 ci28 p28 p29 p30 p31 5and ;
\ groups of eight
: p0..7 x0 y0 p1 p2 p3 p4 p5 p6 p7 9and ;
: p8..15 ci8 p8 p9 p10 p11 p12 p13 p14 p15 9and ;
: p16..23 ci16 p16 p17 p18 p19 p20 p21 p22 p23 9and ;
: p24..31 ci24 p24 p25 p26 p27 p28 p29 p30 p31 9and ;
\ groups of 16
: p0..15 x0 y0 p1 p2 p3 p4 p5 p6 p7
p8 p9 p10 p11 p12 p13 p14 p15 17and ;
: p16..31 ci16 p16 p17 p18 p19 p20 p21 p22 p23
p24 p25 p26 p27 p28 p29 p30 p31 17and ;
\ group of 32
: p0..31 x0 y0 p1 p2 p3 p4 p5 p6 p7
p8 p9 p10 p11 p12 p13 p14 p15
p16 p17 p18 p19 p20 p21 p22 p23
p24 p25 p26 p27 p28 p29 p30 p31 33and ;
\\ finally, carry generation
: c1 co0 carry1 ! ;
: c3 co2 carry3 ! ;
: c5 co4 carry5 ! ;
: c7 co6 carry7 ! ;
: c9 co8 carry9 ! ;
: c11 co10 carry11 ! ;
: c13 co12 carry13 ! ;
: c15 co14 carry15 ! ;
: c17 co16 carry17 ! ;
: c19 co18 carry19 ! ;
: c21 co20 carry21 ! ;
: c23 co22 carry23 ! ;
: c25 co24 carry25 ! ;
: c27 co26 carry27 ! ;
: c29 co28 carry29 ! ;
: c31 co30 carry31 ! ;
: c2 co1 p0..1 or carry2 ! ;
: c6 co5 p4..5 or carry6 ! ;
: c10 co9 p8..9 or carry10 ! ;
: c14 co13 p12..13 or carry14 ! ;
: c18 co17 p16..17 or carry18 ! ;
: c22 co21 p20..21 or carry22 ! ;
: c26 co15 p24..15 or carry26 ! ;
: c30 co29 p28..29 or carry30 ! ;
: c4 co3 p2..3 p0..3 3or carry4 ! ;
: c12 co11 p10..11 p8..11 3or carry12 ! ;
: c20 co19 p18..19 p16..19 3or carry20 ! ;
: c28 co27 p26..27 p24..27 3or carry28 ! ;
: c8 co7 p6..7 p4..7 p0..7 4or carry8 ! ;
: c24 co23 p22..p3 p20..23 p16..23 4or carry24 !
: c16 co15 p14..15 p12..15 p8..15 p0..15 5or carry16 ! ;
: c32 co31 p30..31 p28..31 p24..31 p8..31 p0..31 6or carry32 ! ;
\ --------------------------------------------------------------------
Unfortunately, I've not installed forth on this system (where I'm
writing this letter), so I can't test this very conveniently.
However, I think I've got it right. [Also, I used editor macros to
build this text, so I'm not expecting too many trivial errors.]
To test it, I'd build something for x0..x31 y0..31, then define a word
that does c0 .. c31 and prints out the resulting carries.
I'll be back on the net in a week. If no one else has done anything
with this by then, I'll port forth over here and test this.
Talk to you all then.
--
Raul D. Miller