Lloyd Allison,
School of Computer Science & Software Engineering,
To: AIG,
Seminar's area: Inductive inference,
Research problem: What are statistical models?
I.e. What do they do?
What can be done to them?
How can you combine two or more of them?
And what do you get?
This talk describes an approach and gives practical examples.
users.monash.edu.au/~lloyd/Seminars/200410-II/index.shtmlGiven a new problem in general computing,
the probability of having an existing solution is ~0, ...
programming languages (e.g. Haskell)
exist to ease writing new solutions.
Given a new inductive inference problem,
the probability of having an existing solution is ~0, ...
<?what?> exists to ease writing new solutions?
repeat {
new postgrad starts in machine learning;
devises new model (or variation);
does some maths (not just a hacker);
implements search & estimators (practical too!);
does comparative tests v. other methods (beats them);
writes thesis, gets degree & leaves.
}
We are often left with programs that are hard to (re-) use
--
@0 -- continuous
@1 -- Boolean
@2 -- Boolean
@3 -- Boolean
@4 -- continuous.
Bayesian networks can "learn & explain" relationships between attributes (variables).
N. Friedman and M. Goldszmidt.
suggested using classification trees in the nodes of a Bayesian network.
Oh good, we just happen [1] to have:
the toy e.g. (5 attributes)
details of the network's nodes next...
{CTleaf N(1.0,0.41)(+-0.1),_,_,_,_}, --@0 {CTleaf _,mState[0.5,0.5],_,_,_}, --@1 {CTfork @0<|>=1.4[ --@2|@0,@1 {CTleaf _,_,mState[0.99,0.01],_,_}, {CTfork @1=False|True[ {CTleaf _,_,mState[0.98,0.02],_,_}, {CTleaf _,_,mState[0.02,0.98],_,_}]}]}, {CTleaf _,_,_,mState[0.5,0.5],_}, --@3 {CTfork @2=False|True[ --@4|@0,@2 {CTfork @0<|>=1.0[ {CTleaf _,_,_,_,N(0.55,0.2)(+-0.1)}, {CTfork @0<|>=1.4[ {CTleaf _,_,_,_,N(1.0,0.2)(+-0.1)}, {CTleaf _,_,_,_,N(1.45,0.2)(+-0.1)}]}]}, {CTleaf _,_,_,_,N(3.45,0.2)(+-0.1)}]}N.B. Don't get the nodes of the network and of a tree confused!
mState[p0,p1,...] -- multi-state distribution[*].
N(m,s) -- normal (Gaussian) distribution[*].
"classification" tree[*],
CTleaf -- model[*] an attribute,
CTfork -- test[*] an attribute.
([*] message lengths, i.e. complexities, direct search & prevent over-fitting.)
data Tipe = Alzheimers | Child | Despondent | Hiker | ...
type Age = Double
data Race = White | Black deriving (Eq, Enum,...
data Gender = Male | Female deriving (Eq, Enum,...
data Topography = Mountains | Piedmont | Tidewater deriving...
data Urban = Rural | Suburban | Urban deriving (Eq,Ord,...
type HrsNt = Double -- hours notified
type DistIPP = Double -- distance
...
type MissingPerson = (Maybe Tipe, Maybe Age, ...)
e0 = (estModelMaybe estMultiState) --Tipe
e1 = (estModelMaybe (estNormal 0 90 1 70 0.5)) --Age
...
e7 = (estModelMaybe (estNormal 0 50 0.5 30 0.2)) --DistIPP
estimator = estVariate15 e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14
NB. There is(n't:-) a lot of missing data.
Net:[ -- @1, Age: {CTleaf _,(Maybe 50:50,N(40.6,27.5)... }, ... etc. ... -- @6, HrsNt: {CTfork @1(<|>=62.0|?)[ {CTleaf...,(Maybe 50:50,N( 8.7, 7.6)...}, {CTleaf...,(Maybe 50:50,N(21.4,26.3))...}, {CTleaf...,(Maybe 50:50,N(20.0,..1-case)..} ]}, ... ] network: 115.1 nits + data: 5396.6 nits null: 5935.6 nits (@0..@7)
estModelMaybe estModel dataSet = let present (Just _) = True present Nothing = False m1 = uniformModelOfBool -- [*] m2 = estModel (map (\(Just x) -> x) (filter present dataSet)) in modelMaybe m1 m2[*] or if modelling missing-ness...
m1 = estMultiState (map present dataSet)
class Project t where -- as in `projection' select :: [Bool] -> t -> t selAll :: t -> [Bool] -- all True flags
A type, t, in class Project
can be projected onto a
select-ed sub-space.
E.g. A 15-dim'n estimator, such as estVariate15 e0...e14.
Also a related class for select-ive partitioning of data within trees.
print warning: May lose network's lines if shrunk.e.g. |
|
| |||||||||||||||||||||||||||||||||||||||||||||||
A Network Inferred for all 15 Attributes of `lost persons' |
Have types and type-classes for Models, Function-Models and Time-Series models. Polymorphic types, type classes, high-order functions, lazy-evaluation -- all useful.
Easy to write a new model,
Other case studies include -- mixture models (clustering), time-series, Markov models, mixtures of Markov models, function models (regressions), etc..
[1] L. Allison. Types and classes of machine learning and data mining. 26th Australasian Computer Science Conference (ACSC), Adelaide, pp.207--215,