CSE423 1998

From [dld] Wed Jun 9 17:37:05 1999
Subject: CSC423 MML Hons "Learning and Prediction" 1998

The entry at [web] gives what Graham and I outlined and did last year, and reads as follows:

Topics include:

elementary information theory (including noiseless coding and Huffman codes);
elementary foundations of inductive inference;
introduction to Minimum Message Length (MML) inference;
MML approaches to
- clustering,
- unsupervised classification,
- decision trees,
- causal modelling,
- data mining.
Applications to be considered include:
- image compression,
- models of protein folding,
- bushfire prediction,
- DNA alignment and the human genome project,
- authorship identification for texts, etc.
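
As a small illustration of the first topic in the list above (noiseless coding and Huffman codes), here is a minimal sketch that builds Huffman code lengths for a toy alphabet and compares the expected code length with the entropy. The function names and the example probabilities are made up for illustration and are not taken from the course material.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    if len(probs) == 1:
        # A lone symbol still needs one bit to be transmitted.
        return {next(iter(probs)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(probs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def expected_length(probs, lengths):
    """Average number of bits per symbol under the given code lengths."""
    return sum(p * lengths[s] for s, p in probs.items())


def entropy(probs):
    """Shannon entropy in bits, the lower bound on expected code length."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


if __name__ == "__main__":
    toy = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # hypothetical data
    lengths = huffman_code_lengths(toy)
    print(lengths, expected_length(toy, lengths), entropy(toy))
```

With probabilities that are all powers of 1/2, as in the toy alphabet, the expected Huffman code length equals the entropy; in general it lies within one bit of it.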

Graham gave the first 12 lectures and I gave the second 12 of the 24 lectures. Graham had about 35 pages of (©) material [GF:70p] that he went through for these 12 lectures, and he covered

some coding theory (which he put in the MML Hons syllabus before it arrived in the 3rd year Formal Methods II syllabus),
a bit about probabilistic prediction (and footy-tipping),
maybe a very little about Strict MML and
MML estimators.
GF also did the binomial and maybe also the multinomial distribution, with Maximum Likelihood and (with uniform prior) MML and posterior mean = minEKL.
He also covered the sqrt(12/F) stuff for one continuous parameter, and might have done several continuous parameters.
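
To make the binomial comparison concrete, here is a minimal sketch of the three estimators for k successes in n trials under a uniform prior, plus the sqrt(12/F) width for one continuous parameter. It assumes the MML estimate meant here is the Wallace-Freeman (1987) approximation, which gives (k + 1/2)/(n + 1) for the binomial; the notes do not say which form was used in the lectures, and the function names are illustrative only.

```python
import math


def binomial_estimates(k, n):
    """Return (ML, MML, posterior mean) estimates of p under a uniform prior."""
    ml = k / n                          # Maximum Likelihood
    mml = (k + 0.5) / (n + 1.0)         # assumed Wallace-Freeman MML form
    post_mean = (k + 1.0) / (n + 2.0)   # posterior mean, i.e. minEKL estimate
    return ml, mml, post_mean


def fisher_binomial(p, n):
    """Expected Fisher information F for p from n Bernoulli trials."""
    return n / (p * (1.0 - p))


def uncertainty_width(fisher):
    """Width sqrt(12/F) of the region to which one parameter is stated."""
    return math.sqrt(12.0 / fisher)


if __name__ == "__main__":
    k, n = 3, 10  # hypothetical data
    ml, mml, pm = binomial_estimates(k, n)
    print(ml, mml, pm, uncertainty_width(fisher_binomial(mml, n)))
```

For small n the three estimates differ noticeably (ML can return exactly 0 or 1, the other two cannot); as n grows they converge.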

DLD has (©) material covering:

Fisher info (F), interpreting F in one and many dimensions, invariance of Max L'hood and of MML, consistency (Dowe's conjecture).
Max L'hood and MML for
- binomial,
- multinomial (and posterior mean = min E K-L for these),
- Gaussian,
- Poisson,
- (briefly) von Mises.
Could also do geometric and logistic. Simulation results for von Mises distribution, philosophical and pragmatic issues re general choice of prior.
Classification, clustering, mixture modelling.
- Dowe-Allison-Dix-Hunter-Wallace-Edgoose (1996) and
- Edgoose-Allison-Dowe (1998) von Mises protein stuff.
- Inconsistency from total assignment in mixture modelling; Neyman-Scott problem.
Decision trees, decision graphs and applications.
- Binary trees (i.e., binary regressor attributes) and binary leaves.
- Binary trees (i.e., binary regressor attributes) and multinomial leaves.
- Ternary trees (i.e., ternary regressor attributes) and multinomial leaves.
- Arbitrary n-ary trees (i.e., n-ary regressor attributes) and multinomial leaves.
- Continuous-valued regressor attributes
- Beta(alpha,beta) priors on bi/multinomial leaves in decision trees
- Search problem and look-ahead
- Decision graphs
- Dowe-Krusel (93-94) Appl'n of d trees to (probabilistic) bush-fire prediction
- Dowe-Oliver-Allison-Dix-Wallace (1993) Appl'n of d trees to protein folding
Probabilistic Finite State Automata (PFSAs)

At this point, there were about 1 or 2 lectures left, so I did some glossing.

A very little bit or less of glossing about D. Loo DNA work with T.I. Dix and me
A very little bit or less of glossing about linear regression (and polynomial regression) and causal nets
A very little bit or less of glossing about factor analysis
A very little bit or less of glossing about spherical von Mises-Fisher distr'n
A very little bit or less of glossing about MDL, MML, Kolmogorov, UTMs, SMML
A very little bit or less of glossing about Efficient Markets, Turing Test.

David.