Assignment #04: rounding errors¶

Today's exercises deal with the pitfalls of floating point arithmetics.

Exercise #04-01: a new format based on fixed point binary numbers¶

Write a function which converts binary strings to decimal numbers. The function should handle unsigned (positive) numbers only. Examples:

'101010' $\rightarrow$ 42
'10000.1' $\rightarrow$ 16.5
'1011101101.10101011' $\rightarrow$ 749.66796875

Now let's develop a new standard based on this representation. Dots cannot be represented by 0s and 1s, so that if we want the position of the dot to be flexible we need an additional memory slot to store this position. Let's define our new format as a 32 bits long sequence of bits, the first 5 bits (starting from the left) being used to give the position of the dot, and the remaining 27 bits used to represent the number. Examples:

'10101010101010101010101010101010' $\rightarrow$ 699050.65625.
'00000001100110011001100110011001' $\rightarrow$ 0.19999999552965164.

Explanation for example 1: the first five digits are '10101' which gives the number 21. The second part of the string therefore becomes a dot at position 21: '010101010101010101010.101010'. This binary number is then converted to decimal.

Let's name this standard "BSE" (for "best standard ever"), and try to convince the Institute of Electrical and Electronics Engineers to adopt it in place of the old IEEE 754 standard. We have to answer the following questions:

what is the smallest number the BSE can represent? The largest?
what is the maximal accuracy of the BSE? (in other words, what is the difference between the smallest positive number and zero?)
what is the lowest accuracy of our standard? (in other words, what is the difference between the largest number we can represent and the second largest?)
does the difference between two nearest representable change, when the dot position doesn't?
now compute the precision of our format for a range of possible values of the BSE
for these values, compare the BSE to the IEEE754 binary32 format (or its numpy equivalent np.float32) using numpy.nextafter.
(optional: you can also use matplotlib and a log-log plot to produce something similar to the wikipedia page on IEEE 754)

Conclude. Do you think we should try to convince the Institute of Electrical and Electronics Engineers and present them our results?

Final note: the BSE format is not the IEEE754 format. I'm just saying, because some people got confused during the exam and remembered the BSE better than the real floating point representation...

Exercise #04-02: exponential error growth¶

The number e can be defined as the sum of the infinite series:

$$e = \sum_{n=0}^{\infty} \frac{1}{n!}$$

We are going to approximate this number by truncating the sum to a finite value. We use the standard library and it's math module:

import math
n = 100

e1 = sum([1. / math.factorial(i) for i in range(n+1)])
e1

2.7182818284590455

Close enough! Now let's compute it with the same values, but summed from n=100 to n=0:

e2 = sum([1. / math.factorial(i) for i in range(n+1)][::-1])
e2

2.718281828459045

Seems reasonable too! Are they different?

e1 - e2

4.440892098500626e-16

Which of the two values is closest to the actual e? Explain why this occurs, and what we can learn from this experiment.

Back to the table of contents