Binary Scaling - Re-scaling After Multiplication

The B16 multiplication example above is simplified. In general, re-scaling depends on both the B scale value and the word size. B16 is often used in 32-bit systems because re-scaling then needs only a multiplication or division by 65536 (a shift of 16 bits).
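The shift-based re-scaling mentioned above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the scale factor of 65536 stated above, and the names `to_b16`, `mul_b16` and `from_b16` are hypothetical helpers introduced only for this example.

```python
B16_ONE = 1 << 16  # 65536, the scale factor quoted in the text (assumption)


def to_b16(x: float) -> int:
    """Scale a real value by 65536 and round to the nearest integer."""
    return round(x * B16_ONE)


def mul_b16(a: int, b: int) -> int:
    """Multiply two scaled values and re-scale the product.

    The raw product is scaled by 65536 * 65536, so shifting right by
    16 bits brings it back to a single factor of 65536.
    """
    return (a * b) >> 16


def from_b16(n: int) -> float:
    """Convert a scaled integer back to a real value."""
    return n / B16_ONE


product = mul_b16(to_b16(1.5), to_b16(2.25))
print(from_b16(product))  # 3.375
```

Python integers do not overflow, so the intermediate product is safe here. In a language with fixed-width integers the product would need a 64-bit intermediate.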

Consider the possible positions of the binary point in a signed 32-bit word:

  0 1 2 3 4 5 6 7 8 9
 S X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X

where S is the sign bit, X are the other bits, and the numbers mark positions of the binary point between adjacent bits.

Placing the binary point at

0 gives a range of -1.0 to 0.999999.
1 gives a range of -2.0 to 1.999999.
2 gives a range of -4.0 to 3.999999, and so on.
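As a rough check of how the range grows with the scaling, the sketch below computes the limits of a signed word for a given B value. It assumes the scaling formula 2 ^ (wordsize - B - 1) that is used later in this section. The function name `b_range` is hypothetical.

```python
def b_range(b: int, wordsize: int = 32) -> tuple[float, float]:
    """Return (lowest, highest) real value representable at scaling B.

    Assumes a two's-complement signed word in which the real value is
    the stored integer divided by 2 ** (wordsize - b - 1).
    """
    scale = 2 ** (wordsize - b - 1)
    lowest = -(2 ** (wordsize - 1)) / scale
    highest = (2 ** (wordsize - 1) - 1) / scale
    return lowest, highest


for b in range(3):
    low, high = b_range(b)
    # Each step in B doubles the range; the upper bound stays just
    # below the next power of two.
    print(f"B{b}: {low} to {high:.9f}")
```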

When using different B scalings, the complete B scaling formula must be used: a real value is multiplied by 2 ^ (wordsize - B - 1) to obtain its scaled integer.

Consider a 32 bit word size, and two variables, one with a B scaling of 2 and the other with a scaling of 4.

1.4 @ B2 is 1.4 * (2 ^ (wordsize-2-1)) == 1.4 * 2 ^ 29 == 0x2CCCCCCD
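The conversion above can be reproduced with a short sketch. The function name `to_bscaled` and the range check are illustrative additions, not part of the original text. Rounding to the nearest integer is assumed because it matches the hexadecimal value shown.

```python
def to_bscaled(value: float, b: int, wordsize: int = 32) -> int:
    """Scale a real value to an integer at scaling B for a signed word."""
    scaled = round(value * 2 ** (wordsize - b - 1))
    limit = 2 ** (wordsize - 1)
    # Reject values that do not fit in the signed word at this scaling.
    if not -limit <= scaled < limit:
        raise OverflowError(f"{value} does not fit at B{b} in {wordsize} bits")
    return scaled


print(hex(to_bscaled(1.4, 2)))  # 0x2ccccccd
```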

Note that here the value 1.4 is very well represented, with 29 fraction bits. A 32-bit floating-point number (IEEE 754 single precision) has 23 bits to store the fraction in. This is why B scaling can be more accurate than floating point of the same word size, provided the values stay within the range of the chosen scaling. This is especially useful in integrators or in repeated summing of small quantities, where rounding error in floating point can be a subtle but very dangerous problem.
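To illustrate the precision claim for this particular value only, the sketch below compares the error of 1.4 stored at B2 in a 32-bit word with the error of 1.4 rounded to IEEE 754 single precision. The helper `as_float32` is a made-up name. Round-tripping through the standard `struct` module is just one way to obtain the single-precision value.

```python
import struct


def as_float32(x: float) -> float:
    """Round a Python float (double precision) to single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


SCALE_B2 = 2 ** 29  # 2 ** (32 - 2 - 1), the B2 scale factor in a 32-bit word

stored_b2 = round(1.4 * SCALE_B2) / SCALE_B2  # value recovered from B2
stored_f32 = as_float32(1.4)                  # value recovered from float32

err_b2 = abs(stored_b2 - 1.4)
err_f32 = abs(stored_f32 - 1.4)

print(f"B2 error:      {err_b2:.3e}")
print(f"float32 error: {err_f32:.3e}")
print("B2 is closer:", err_b2 < err_f32)
```

The comparison holds for this value and scaling. A value much smaller than the range of the chosen scaling would use fewer significant bits and could come out worse than floating point.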

Now a larger number 15.2 at B4.

15.2 @ B4 is 15.2 * (2 ^ (wordsize-4-1)) == 15.2 * 2 ^ 27 == 0x7999999A

Here the number of bits available to store the fraction is 27. Multiplying these 32-bit numbers gives the 64-bit result 0x1547AE14A51EB852

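This result is at B7 in a 64-bit word: the operands were scaled by 2 ^ 29 and 2 ^ 27, so the product is scaled by 2 ^ 56, and 64 - 56 - 1 = 7. Shifting it down by 32 bits gives the result at B7 in 32 bits: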

0x1547AE14

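To convert back to floating point, divide this by 2 ^ (wordsize-7-1) == 2 ^ 24, which gives approximately 21.27999997. Dividing the full 64-bit product by 2 ^ 56 instead gives approximately 21.28000001. The exact product of 1.4 and 15.2 is 21.28.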
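The whole worked example can be traced with the sketch below. The helper names `to_b` and `from_b` are hypothetical. Python's unbounded integers stand in for the 64-bit intermediate that a fixed-width language would need.

```python
WORDSIZE = 32


def to_b(value: float, b: int, wordsize: int = WORDSIZE) -> int:
    """Scale a real value to an integer at scaling B."""
    return round(value * 2 ** (wordsize - b - 1))


def from_b(n: int, b: int, wordsize: int = WORDSIZE) -> float:
    """Convert a scaled integer at scaling B back to a real value."""
    return n / 2 ** (wordsize - b - 1)


x = to_b(1.4, 2)    # 1.4 at B2
y = to_b(15.2, 4)   # 15.2 at B4

wide = x * y                # product at B7 in a 64-bit word
narrow = wide >> WORDSIZE   # product at B7 in a 32-bit word

print(hex(x), hex(y))       # 0x2ccccccd 0x7999999a
print(hex(wide))            # 0x1547ae14a51eb852
print(hex(narrow))          # 0x1547ae14
print(from_b(narrow, 7))    # close to 21.28
```

The shift discards the low 32 bits of the product, so the 32-bit result is truncated rather than rounded.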

Various scalings may be used. B0, for instance, can be used to represent any number between -1 and 0.999999999.
