[Subject home], [FAQ], [progress],   Bib', Alg's,  C , Java - L.A., Sunday, 28-Nov-2021 04:25:30 AEDT
Instructions: Topics discussed in these lecture notes are examinable unless otherwise indicated. You need to follow instructions, take more notes & draw diagrams especially as [indicated] or as done in lectures, work through examples, and do extra reading. Hyper-links not in [square brackets] are mainly for revision, for further reading, and for lecturers of the subject.

## Numerical:   Introduction

• i.e. floating point numbers   (real / float / double). Will take about two lectures.

• <exponent, mantissa>

• limited numerical accuracy
• e.g. 1/3 and 1/5 are not represented exactly

• perhaps 6 or 16 decimal digits
(cf Babbage's difference engine had (has) 30 decimal digits of precision and his Analytical Engine was to have 40 decimal digits of precision.)

• Problem: Solve equation   f(x)=0.
• Problem: Integrate function   . . .
Numerical accuracy and errors.
e.g. IEEE "single precision" floating-point  representation:
S exponent (8) mantissa or fraction (23)
0 1 2 3 4 5 6 7 8 9 1011 1213 1415 1617 1819 2021 2223 2425 2627 2829 3031
S EEEEEE EE FFFFF FFFFF FFFFF FFFFF FFF
• normal value = (-1)S . 2E-127 . (1.F)2
• +0:   E=0, F=0, S=0
• -0:   E=0, F=0, S=1
• NaN, not a number:   E=FF16=25510,   F<>0
• +oo  :   E=FF16=25510,   F=0, S=1
• -oo  :   E=FF16=25510,   F=0, S=0
• unnormalised   if E=0 & F<>0,   (-1)S . 2-126 . (0.F)2
• least +ve value   = 2-126 . (0.000000000000000000000012) = 2-149
(Also IEEE "double precision": 64-bit, S 1-bit, E 11-bits, F 52-bits.)

It is generally better to combine small numbers first before combining them with large numbers.

Consider   SUMi=1.. ( 1 / i )   -- sum to infinity is a divergent series

SUM i=1.. ( 1 / i )

Ideally the sum from 1 up to n and the sum from n down to 1 should be equal, but . . .

Also see [oneOverN.c].

There is a number   `delta'   s.t.   delta > 0   and yet apparently   1.0 = 1.0 - delta.
[lecturer: use the demo'; class: note value of delta & probable representation.]
In some cases limited numerical accuracy can cause severe errors . . .

• ? Big - ( Big' - small ) = ( Big - Big' ) + small   ?

• if Big = Big'

• then Big - ( Big' - small ) = Big - (Big - small) which may equal Big - Big = 0

• but   ( Big - Big' ) + small = 0 + small = small <> 0
e.g. consider Big = Big' = 1.0, and small = delta

### Problem: Solve f(x) = 0

• NB. requires f(x) to be continuous
• PRE: f(lo) < 0 and f(hi) > 0   or   v.v.

• loSign := sign( f( lo ) );
• repeat
• mid := (lo + hi) / 2;
• midSign := sign( f( mid ) );
• if midSign = loSign then
• lo := mid
• else if midSign = 0 then
• lo := hi := mid
• else   -- assert midSign = sign(f(hi))
• hi := mid
• end_if
• until hi - lo is "small enough"   -- [lecturer: draw illustration; class: take notes!]

Solving f(x)=0

• Binary search for f(x)=0 is "like" binary search in a lookup table

• But here we should not stop on   hi = lo   or similar

• because [______________________]

• [lecturer: use the demo'; class: take notes!]
Problem: Integration

See algorithm . . .
[lecturer: briefly or can skip; class: study at home]
```double rectangle(double f(double x), double lo, double hi, int N)
{ double width = (hi-lo)/N;
double sum = 0.0;

int i;
for(i=0; i < N; i++)
sum += f( lo+(i+0.5)*width );
*/ f() at centre of i-th interval */

return sum*width;
}
```

See algorithm . . .
[lecturer: briefly or can skip; class: study at home]
```double trapezoidal(double f(double x), double lo, double hi, int N)
{ double width = (hi-lo)/N;
double sum = (f(lo)+f(hi))/2;

int i;
for(i=1; i < N; i++)
sum += f( (lo*(N-i) + hi*i)/N );

return sum*width;
}
```
[lecturer: do derivation / proof; class: take notes]

See algorithm . . .
[lecturer: briefly or can skip; class: study at home]
```double Simpson(double f(double x), double lo, double hi, int N)
/* PRE: N is even */
{ double width = (hi-lo)/N;
double sum = f(lo)+f(hi);

int i, odd=true;
for(i=1; i < N; i++)
{ sum += f(lo+i*width) * (odd ? 4 : 2);
odd = !odd;
}

return sum*width/3;
}
```

### Integration

• [lecturer: use demo'; class: take notes!]

• [discuss results]

• [__________] rule is best

• [__________] rule is next best

• [__________] rule is worst

• (of the methods discussed)

## Numerical:   Summary

real / float and double.

• limited numerical accuracy

• errors can accumulate

• will a test for equality ever succeed?

• a strictly monotonic series may have a bound yet never reach it (re program termination).
© L.Allison, Sunday, 28-Nov-2021 04:25:30 AEDT users.monash.edu.au/~lloyd/tilde/CSC2/CSE2304/Notes/20numerical.shtml
Computer Science, Software Engineering, Science (Computer Science).