[CSE2304] [progress]

### Practical #2, semester 1, 2000 Group A: week 5, 27 March... Group B: week 6, 3 April...

The problem is to write a program which reads a sequence of `N' floating point numbers from a file, finds the `M' smallest values in the file, and prints them. M and the filename are given as command line parameters (see ..../C/Basic/). This operation might be useful if studying the tail of a probability distribution, for example.
N is potentially very large, probably too large for all of the N numbers to fit into memory.
M is expected to be much less than N, M<<N, certainly small enough for M numbers to fit in memory.
Write a program in C to carry out the task:

```   smallest M thefile
```
e.g. if thefile contains
```7.3
-3.14
10097.1
-99.0
7.3
-3.14
3.14
1000200.779
7.3
7.4```
then `smallest 4 thefile' are:
-99.0  -3.14  -3.14  3.14

You must NOT sort all N numbers and then print the first M values! Your program must not use more than O(M)-space. Efficiency is important.

• Group A: You must use an array, rather like a stack, to store the M smallest numbers seen so far, in sorted order, as they are collected. You might care to think about [Insertion Sort] although you must not use it, or any other sort, unchanged.

• Group B: You must use a priority queue, alias a Heap, as used in [Heap Sort] to store the M smallest numbers seen so far, in the Heap ordering (with the largest of the M-smallest to date at the top). You must not sort the M smallest elements into (asc | desc)ending order, they must be maintained as a `heap'. (It is therefore not necessary to print the M-smallest in ascending order.) You must not sort all N numbers and then print the first M values.

• [4 marks]

Both groups must instrument their programs so that the numbers of comparisons of numbers carried out are printed. What can you say about the best-, average- and/or worst-case time-complexity of your program as a function of M and N?
[1 mark]

Every program must be commented with appropriate assertions, pre- & post-conditions, and loop invariants, to show that the program is correct.
[1 mark]

Total: [6 marks]

© 2000, L. Allison, Comp. Sci. & SWE, Monash University, Australia