
## Mixture Modelling

Problem: Given a multivariate data-set

```
                 attributes
            @1   @2   ...  @k
things  x1  x11  x12  ...  x1k
        x2  x21  x22  ...  x2k
        ... ...  ...  ...  ...
        xn  xn1  xn2  ...  xnk
```

Does the data-set consist of one population or of two or more sub-populations (clusters, classes, components, species)?

Is it well described by a mixture of models, one per sub-population?

users.monash.edu.au/~lloyd/Seminars/2005-II/Mixture/index.shtml

## EM algorithm

```
estMixture ests dataSet =
 let
   -- [estimator] -> [dataSpace] -> model of dataSpace
   -- i.e. [estimator] -> estimator
   ...
```

Takes a list of estimators, one per component of the mixture.

```
memberships (Mix mixer components) =
 let                                           -- memberships|Mixture
   doAll (d:ds) = prepend (doOne d) (doAll ds)    -- all data
   doAll  []    = map (\x -> []) components

   doOne datum  = normalise(                      -- one datum
     zipWith (\c -> \m ->
       (pr mixer c)*(pr m datum)) [0..] components)
     -- pr(c) * pr(datum|c)  for class #c = m

 in doAll dataSet
```

Given the components of the mixture, find (fit) the fractional membership weights of things (data) in (to) the components.
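This E step can be sketched as a small runnable program. Below, the two fixed components are multistate (discrete) models over values {0,1}; `normalise`, the component representation, and the grouping of weights per datum (rather than per component, as `prepend` arranges above) are all local stand-ins, not the slides' actual helpers.

```haskell
-- Minimal sketch of the memberships (E) step for a discrete mixture.
normalise :: [Double] -> [Double]
normalise xs = map (/ sum xs) xs

-- (mixer !! c) plays the role of pr mixer c; (m !! x) of pr m datum.
membershipsOf :: [Double] -> [[Double]] -> [Int] -> [[Double]]
membershipsOf mixer components dataSet =
  [ normalise (zipWith (\w m -> w * (m !! x)) mixer components)
  | x <- dataSet ]

main :: IO ()
main = print (membershipsOf [0.5, 0.5] [[0.9, 0.1], [0.2, 0.8]] [0, 1, 0])
```

Each row sums to one: a datum's mass is shared out over the components in proportion to pr(c) * pr(datum|c).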

```
randomMemberships =
 let
   doAll seed  []    = map (\_ -> []) ests
   doAll seed (_:ds) =                   -- all data
     let
       doOne seed  []      ans = (seed, normalise ans)
       doOne seed (_:ests) ans =         -- one datum
         doOne (prng seed) ests
               ((fromIntegral(1 + seed `mod` 10)) : ans)
     in let (seed2, forDatum) = doOne seed ests []
        in prepend forDatum (doAll seed2 ds)

 in doAll 4321 dataSet
```

Allocate initial pseudo-random (prng) fractional membership weights to the data. Not very interesting in itself, but it gives the first EM step something to start from.
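A runnable sketch of the same idea: the `prng` below is an assumed linear congruential generator (the slides do not specify one), rows are per datum, and each raw weight is 1..10 before normalising, echoing the `(1 + seed `mod` 10)` above.

```haskell
-- Sketch: pseudo-random initial membership weights.
prng :: Int -> Int
prng s = (s * 1103515245 + 12345) `mod` 2147483648  -- assumed LCG

normalise :: [Double] -> [Double]
normalise xs = map (/ sum xs) xs

-- k components, n data; threads the seed through every weight drawn.
randomMems :: Int -> Int -> Int -> [[Double]]
randomMems _    _ 0 = []
randomMems seed k n = row : randomMems seed' k (n - 1)
  where
    (seed', row) = oneDatum seed k []
    oneDatum s 0 acc = (s, normalise acc)
    oneDatum s j acc =
      oneDatum (prng s) (j - 1) (fromIntegral (1 + s `mod` 10) : acc)

main :: IO ()
main = print (randomMems 4321 2 3)
```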

```
fit [] [] = []                           -- Models|memberships
fit (est:ests) (mem:mems) =
  (est dataSet mem) : (fit ests mems)

fitMixture mems =
  Mix (freqs2model (map (foldl (+) 0) mems))  -- weights
      (fit ests mems)                         -- components

```

From the membership weights, calculate mixture-weights of the components and fit components (use the given estimators) to their weighted members.
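The mixture-weight half of this M step can be sketched on its own. Here `freqs2model` is approximated by plain normalisation (omitting any MML +0.5 adjustment), and `mems` is organised per component, as in `fitMixture` above.

```haskell
-- Sketch: mixing weights = each component's total membership mass,
-- normalised over all components.
normalise :: [Double] -> [Double]
normalise xs = map (/ sum xs) xs

mixWeights :: [[Double]] -> [Double]
mixWeights mems = normalise (map (foldl (+) 0) mems)

main :: IO ()
main = print (mixWeights [[0.9, 0.2, 0.7], [0.1, 0.8, 0.3]])
```

A component that claims most of the fractional mass gets most of the mixing weight.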

```
cycle    mx = fitMixture (memberships mx) -- EM step

cycles 0 mx = mx
cycles n mx = cycles (n-1) (cycle mx)     -- n x cycle

in mixture( cycles ?? (fitMixture randomMemberships) )

```

Fit memberships to components; fit components to the memberships. Iterate, either some number of times or until convergence.
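The whole loop fits in one small, self-contained sketch for a mixture of two biased coins (Bernoulli components) on 0/1 data. All names here are local stand-ins for the slides' estimators and helpers, and the component models are just probabilities of a 1.

```haskell
import Data.List (transpose)

normalise :: [Double] -> [Double]
normalise xs = map (/ sum xs) xs

-- E step: per-datum memberships from mixing weights and each
-- component's probability of a 1.
eStep :: ([Double], [Double]) -> [Int] -> [[Double]]
eStep (mixer, ps) xs =
  [ normalise [ w * (if x == 1 then p else 1 - p)
              | (w, p) <- zip mixer ps ]
  | x <- xs ]

-- M step: refit mixing weights and component parameters to the
-- fractionally weighted data.
mStep :: [Int] -> [[Double]] -> ([Double], [Double])
mStep xs mems = (normalise totals, ps)
  where
    cols   = transpose mems                 -- one column per component
    totals = map sum cols
    ps     = [ sum [ w * fromIntegral x | (w, x) <- zip col xs ] / t
             | (col, t) <- zip cols totals ]

cyclesEM :: Int -> [Int] -> ([Double], [Double]) -> ([Double], [Double])
cyclesEM 0 _  st = st
cyclesEM n xs st = cyclesEM (n - 1) xs (mStep xs (eStep st xs))

main :: IO ()
main = print (cyclesEM 20 [1,1,1,0,1,0,0,1,0,0] ([0.5,0.5], [0.6,0.4]))
```

Note the invariant: after each M step the mixture's mean, sum of w_c * p_c, equals the data mean, because every datum's unit of membership mass is fully shared out.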

-- That's all --

## Summary

Q: What are the estimators?
A: Almost anything:

E.g. Multistate -- add up the frequencies (+0.5 per state for MML) and normalise.
E.g. Normal -- the usual mean and std dev (the latter using /(n-1) for MML).
E.g. Multivariate -- a "product" estimator over per-attribute estimators,
or a factor-model estimator if you have one, etc.
E.g. A sequence-model estimator.
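The first example above is a one-liner in practice. This sketch guesses at the `est dataSet mem` interface: it takes the number of states, the data, and the fractional membership weights, and returns a normalised probability per state.

```haskell
-- Sketch of a weighted multistate estimator: accumulate fractional
-- frequencies per state, add 0.5 per state (the MML adjustment),
-- then normalise.
multistate :: Int -> [Int] -> [Double] -> [Double]
multistate k xs ws =
  normalise [ 0.5 + sum [ w | (x, w) <- zip xs ws, x == s ]
            | s <- [0 .. k - 1] ]
  where
    normalise fs = map (/ sum fs) fs

main :: IO ()
main = print (multistate 2 [0, 1, 0, 0] [1, 1, 1, 1])
```

With all weights 1 this reduces to the ordinary frequency count; fractional weights are what let the same estimator serve inside the mixture's M step.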

Mixture complexity: Can search for 1, 2, 3, ... components for the optimal message length.
L. Allison. Models for Machine Learning and Data Mining in Functional Programming, J. Functional Programming (JFP), 15(1), pp. 15-32, January 2005; and also [II].