^CSE2304^

## Tutorial 3, CSE2304, CSSE, Monash, Semester 1, 2005

### Weeks 7-9,  18 April to 6 May

Class: Prepare your answers to the questions before the tutorial! It will probably not be possible to cover all questions unless the class has prepared them all in advance. (The online versions of the tutorials may contain extra information and links.)
Tutors: (i) The purpose of the tutorials is not to solve the practical exercises! (ii) The purpose is to check answers, and to discuss particular sticking points, not to simply make answers available.

Objectives: The tutorials give practice in problem solving, analysis of algorithms and data-structures, and mathematics and logic useful in the above.

1. Re linked list data structures, the foldl algorithm (fold-left) performs the cumulative operation derived from a binary function, f, and a "zero", z:

append Nil L = L
append (Cons h t) L = Cons h (append t L)

foldl f z Nil = z
foldl f z (Cons h t) = foldl f (f z h) t

e.g., foldl plus 0 [1,2,3,4] = (((0+1)+2)+3)+4 = 10, and foldl times 1 [1,2,3,4] = 4! = 24.

(i) If g is associative, that is g (g x y) v = g x (g y v), for all x, y and v, and if g z w = w and g w z = w, for all w, prove formally (as in [e.g.]) that g (foldl g z L1) (foldl g z L2) = foldl g z (append L1 L2), for all lists L1 and L2.
(ii) Give C-code for foldl(f,z,L){...}.
(For interest only, there is an article on functional programming in C++ here: [doi:10.1017/S0956796803004969], Mcnamara & Smaragdakis (2004).)

2. The [edit-distance] (ED), covered in lectures, tells how different two strings, S1 and S2, are, d(S1,S2)=0, if S1=S2. An optimal ED-alignment shows how to edit S1 into S2 for minimum cost.
Here is a related but different problem to solve: The longest common subsequence (LCS) is a similarity measure, |lcs(S1,S2)|=|S1|=|S2|, if S1=S2. An optimal LCS-alignment shows the LCS of S1 and S2.
(Given a sequence x1, x2,..., xn, a subsequence is some (or all) of its elements, in the same order, e.g., x2, x4, x5, x7, say. A common subsequence of two sequences is a subsequence of both of them. A longest common subsequence is a common subsequence that is as long as possible.)
(i) Give an example of two strings, S1 and S2, such that the optimal ED- and LCS-alignments are different.
(ii) Define an algorithm to compute the length of the LCS of S1 and S2.
(iii) Give C-code for the algorithm.
(iv) What is the time- and space-complexity of the algorithm? Why?

3. Given the following algorithm how many times are (i) "base" and (ii) "body" executed for r(10)?
```void r(int n)
{ if( n > 0 )
{ body();
r(n-1);
}
else
base();
}
```
Why?

4. The unit is about problem solving. Here is a problem:
Two intervals, [a,b] and [c,d], where a, b, c and d are floating-point numbers between 0.0 and 1.0 inclusive, a<=b and c<=d, can overlap in the following ways:
```1. [a__________b]             2.    [a_______b]
[c__________d]            [c_____d]

3. [a_________b]              4.      [a___b]
[c__d]                     [c_____________d]
```
You are working in a scientific laboratory where a long-running experiment is producing a sequence of intervals [i1,j1], [i2,j2], ..., [in,jn]. For each interval, [im,jm], your colleagues need to know which of the earlier intervals, [ik,jk] k<m, if any, overlap with [im,jm]. When m becomes large you expect the number of earlier intervals overlapping [im,jm] to be <<m (much less than m) in most, but not all, cases. Note that n can be very large indeed, in the millions or even more, and you need an efficient solution.
Is the problem easy or hard? Why? Outline a possible solution, i.e. algorithms and data-structures. (Do not give C code.) What is worst-case data for your solution? Why?

© L. Allison 2005, School of Computer Science and Software Engineering, Monash University, Australia 3800.
Created with "vi (Linux & Solaris)",   charset=iso-8859-1