[01]
>>
Types and Classes of
Machine Learning and
Data Mining
Lloyd Allison,
CSSE,
Monash University,
Australia 3800
In:
26th Australasian Computer Science Conference (ACSC2003),
Adelaide, South Australia, pp. 207-215,
4-7 February 2003,
Conferences in Research and Practice in Information Technology, Vol.16
(Australian Computer Science Communications,
Volume 25,
Number 1).
Abstract:
The notion of a statistical model, as inferred and used in
statistics, machine learning and data mining,
is examined from a semantic point of view.
Data types and type-classes for models are developed
that allow models to be manipulated in a type-safe yet flexible way.
The programming language Haskell-98, with its system
of polymorphic types and type-classes,
is used as the meta-language for this exercise
so one of the by-products is a running program.
This
document can be found at
users.monash.edu.au/~lloyd/Seminars/2003-ACSC/index.shtml
and includes hyper-links to other resources.
<<
[02]
>>
"... considered as a biological phenomenon,
aesthetic preferences stem from a predisposition among animals and men
to seek out experiences through which they may
learn to classify
the objects in the world about them. Beautiful `structures'
in nature or in art are those which facilitate the task of
classification by presenting evidence of the `taxonomic' relations between
things in a way which is informative and easy to grasp."
-- N. K. Humphrey.
The illusion of beauty.
Perception 2, pp. 429-439, 1972.
<<
[03]
>>
- Humphrey argues that a sense of beauty is a by-product(?)
of the (useful) ability to classify.
- Classification is about similarity and difference.
- 1. Unsupervised and supervised classification
are important problems in M.L. and D.M.
- 2. Notice the similarity of many products,
and of many activities, in M.L. and D.M. research themselves.
- Here, we want to make these similarities and differences precise.
(Efficiency can be addressed,
but is a secondary consideration today.)
<<
[04]
>>
"Model" and "Class"
Class | As in OOP |
Class | A number of individuals [...] possessing common attributes... |
Class | A division or order of society... |
Class | Natural History. One of the highest groups... |
Model Class | As in Statistics |
Model [citizen] | An exemplar |
Model | A person [...] who is employed to display clothes... |
Model | A summary, epitome, or abstract... |
Model | A description of structure... (~ Class as in OOP!) |
Some meanings, most from O.E.D.
<<
[05]
>>
- Shall use Haskell 98,
- a lazy functional programming (FP) language, with
- polymorphic types,
e.g. map :: (t->u) -> [t] -> [u]
(t and u are type parameters, [...] is a list type,
-> is a function type),
- type classes,
- a type inference algorithm
(types, slightly abused, are given here,
but are really inferred automatically),
- to describe ``statistical models'',
for want of a better term.
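The three language features listed above can be seen together in a small sketch. The names pairUp, Describable, describe and describeAll are made up for illustration and are not from the paper.

```haskell
-- One definition works for every choice of the type parameters t and u.
pairUp :: (t -> u) -> [t] -> [(t, u)]
pairUp f xs = zip xs (map f xs)

-- A type class names operations; instances supply them per type.
-- (Describable is a made-up class, for illustration only.)
class Describable a where
  describe :: a -> String

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

-- No signature is written here; the compiler infers
--   describeAll :: Describable a => [a] -> [String]
describeAll xs = map describe xs
```

Asking an interpreter for the type of describeAll should report the inferred, class-constrained type shown in the comment.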
<<
[06]
>>
(Basic) Models.
MMLFP = MML + FP
The most important property of a (class of) statistical model
is ``pr'':
class Model mdl where
  pr   :: (mdl dataSpace) -> dataSpace -> Probability
  msg2 :: (mdl dataSpace) -> dataSpace -> MessageLength   -- (2nd part)
  msg  :: . . . (mdl dataSpace) -> dataSpace -> MessageLength
  -- a minimum; maybe* a Model can also do other things.
  -- (* probably!)
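A minimal sketch of how such a class might be instantiated. The constructor MPr appears on a later slide; the field types given to it here, the use of Double, message lengths measured in nats, and the default for msg2 are all assumptions. The elided msg is left out.

```haskell
type Probability   = Double
type MessageLength = Double   -- assumed to be in nats, i.e. -log(probability)

class Model mdl where
  pr   :: mdl dataSpace -> dataSpace -> Probability
  msg2 :: mdl dataSpace -> dataSpace -> MessageLength
  -- Assumed default: the 2nd-part length of a datum is -log of its probability.
  msg2 m x = negate (log (pr m x))

-- Assumed representation: the length of the model's own description,
-- paired with its probability function.
data ModelType dataSpace = MPr MessageLength (dataSpace -> Probability)

instance Model ModelType where
  pr (MPr _ p) x = p x

-- A fixed model of a fair coin; it has no parameters to state, hence length 0.
fairCoin :: ModelType Bool
fairCoin = MPr 0 (const 0.5)
```

With these definitions, msg2 fairCoin True is log 2 nats, i.e. one bit.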
<<
[07]
>>
Examples
normal m s  :: Model of Float
freqs2model :: [Int] -> Model of [0..n-1]
bivariate   :: (Model of d1) -> (Model of d2) -> Model of (d1, d2)
etc.
NB. Slight abuse of Haskell type notation, because
`Model' is a class, not a type.
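A sketch of how two of these examples might look over a concrete model type. The bodies are assumptions: freqs2model here uses plain relative frequencies and charges nothing for stating them, and bivariate assumes the two components are independent. The helpers pr and msg1 are written as ordinary functions to keep the sketch short.

```haskell
type Probability   = Double
type MessageLength = Double

-- Assumed representation: model description length and probability function.
data ModelType dataSpace = MPr MessageLength (dataSpace -> Probability)

pr :: ModelType dataSpace -> dataSpace -> Probability
pr (MPr _ p) = p

msg1 :: ModelType dataSpace -> MessageLength
msg1 (MPr mdlLen _) = mdlLen

-- Multistate model over [0..n-1] from n observed counts.
-- Assumption: unsmoothed relative frequencies; at least one count is non-zero.
freqs2model :: [Int] -> ModelType Int
freqs2model counts = MPr 0 p   -- 0: cost of the parameters ignored in this sketch
  where
    total = fromIntegral (sum counts) :: Double
    p i | i >= 0 && i < length counts = fromIntegral (counts !! i) / total
        | otherwise                   = 0

-- Model of pairs from two component models, assumed independent:
-- probabilities multiply, description lengths add.
bivariate :: ModelType d1 -> ModelType d2 -> ModelType (d1, d2)
bivariate m1 m2 = MPr (msg1 m1 + msg1 m2) (\(x, y) -> pr m1 x * pr m2 y)
```

For example, bivariate (freqs2model [1,3]) (freqs2model [2,2]) is a model of pairs of Ints, built without writing any new probability code.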
<<
[08]
>>
Some other classes of statistical model
FunctionModels
class FunctionModel fm where
  condModel :: (fm inSpace opSpace) -> inSpace -> ModelType opSpace
  condPr    :: (fm inSpace opSpace) -> inSpace -> opSpace -> Probability
  condMsg2  :: (fm inSpace opSpace) -> inSpace -> opSpace -> MessageLength

e.g. linear a b eps :: FunctionModel of Float Float
i.e. y ~ a × x + b + (normal 0 eps)
. . .
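The linear example can be sketched as follows. The type FunctionModelType, its constructor FM and the helper normalPdf are hypothetical; Double replaces Float; condModel is omitted; and condPr returns a density rather than a true probability, since the sketch ignores measurement accuracy.

```haskell
type Probability   = Double
type MessageLength = Double

class FunctionModel fm where
  condPr   :: fm inSpace opSpace -> inSpace -> opSpace -> Probability
  condMsg2 :: fm inSpace opSpace -> inSpace -> opSpace -> MessageLength
  -- Assumed default: -log of the conditional probability (density).
  condMsg2 fm x y = negate (log (condPr fm x y))

-- Hypothetical representation: input -> output -> probability (density).
newtype FunctionModelType inSpace opSpace =
  FM (inSpace -> opSpace -> Probability)

instance FunctionModel FunctionModelType where
  condPr (FM f) = f

-- Density of a normal distribution with mean m and standard deviation s.
normalPdf :: Double -> Double -> Double -> Double
normalPdf m s x =
  exp (negate ((x - m) * (x - m)) / (2 * s * s)) / (s * sqrt (2 * pi))

-- y ~ a * x + b + (normal 0 eps)
linear :: Double -> Double -> Double -> FunctionModelType Double Double
linear a b eps = FM (\x y -> normalPdf (a * x + b) eps y)
```

An output on the line, such as y = 7 at x = 3 under linear 2 1 1, gets a higher density than one off the line.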
<<
[09]
>>
. . . and TimeSeries
class TimeSeries tsm where
  predictors :: (tsm dataSpace) -> [dataSpace] -> [ModelType dataSpace]
  prs        :: (tsm dataSpace) -> [dataSpace] -> [Probability]
  msg2s      :: (tsm dataSpace) -> [dataSpace] -> [MessageLength]

e.g. markov n :: TimeSeries of someDiscreteType
More? Surely!
(Slight abuse of Haskell type notation.)
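A sketch of prs for a simple first-order chain. The type TimeSeriesType, its constructor TS and the example stickyCoin are hypothetical; this is not the paper's markov n, and predictors and msg2s are left out.

```haskell
import Data.List (inits)

type Probability = Double

class TimeSeries tsm where
  prs :: tsm dataSpace -> [dataSpace] -> [Probability]

-- Hypothetical representation: given the history so far,
-- the probability of the next element.
newtype TimeSeriesType dataSpace = TS ([dataSpace] -> dataSpace -> Probability)

instance TimeSeries TimeSeriesType where
  -- Pair each element with the prefix that precedes it.
  prs (TS next) xs = zipWith next (inits xs) xs

-- Two-state, first-order chain: the next symbol repeats the previous one
-- with probability stay; the very first symbol is 50:50.
-- (Using last on the history is quadratic overall; fine for a sketch.)
stickyCoin :: Probability -> TimeSeriesType Bool
stickyCoin stay = TS next
  where
    next []   _ = 0.5
    next hist x | x == last hist = stay
                | otherwise      = 1 - stay
```

The product of the list returned by prs is the probability of the whole series under the model.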
<<
[10]
>>
SuperModels
Our classes have some common properties;
we need a super-class. Obviously...
class SuperModel sMdl where
  prior   :: sMdl -> Probability
  msg1    :: sMdl -> MessageLength
  mixture :: (Mixture mx, SuperModel (mx sMdl)) => mx sMdl -> sMdl

class Mixture mx where
  mixer      :: (SuperModel t) => mx t -> ModelType Int
  components :: (SuperModel t) => mx t -> [t]

instance SuperModel (ModelType dataSpace) where
  msg1 (MPr mdlLen p) = mdlLen
  . . . etc.
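A reduced sketch of the super-class and of mixing ModelType components. The default for prior, the field types of MPr and the helper mixModels are assumptions; mixModels is a plain function standing in for the class method mixture, and it ignores the cost of stating the weights.

```haskell
type Probability   = Double
type MessageLength = Double

-- Assumed representation: model description length and probability function.
data ModelType dataSpace = MPr MessageLength (dataSpace -> Probability)

pr :: ModelType dataSpace -> dataSpace -> Probability
pr (MPr _ p) = p

class SuperModel sMdl where
  msg1  :: sMdl -> MessageLength
  prior :: sMdl -> Probability
  -- Assumed default: prior probability and 1st-part length (in nats)
  -- are related by prior = exp (-msg1).
  prior m = exp (negate (msg1 m))

instance SuperModel (ModelType dataSpace) where
  msg1 (MPr mdlLen _) = mdlLen

-- Hypothetical helper: a weighted mixture of component models.
-- The weights are assumed to be non-negative and to sum to 1.
mixModels :: [Probability] -> [ModelType dataSpace] -> ModelType dataSpace
mixModels weights comps = MPr len p
  where
    len = sum (map msg1 comps)
    p x = sum (zipWith (\w m -> w * pr m x) weights comps)
```

The result of mixModels is itself a ModelType, so a mixture can be used wherever a single model is expected.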
<<
[11]
>>
conversion functions
<<
[12]
>>
Mixture modelling
(clustering, unsupervised classification,
Snob,...)
estMixture ests dataSet = let
    ...
    ... (22 lines of code)
    ...
  in mixture( ... .)

estMixture
  :: [ [dataSpace] -> [Float] -> Model of dataSpace ]   -- estimators
  -> [ dataSpace ]                                      -- training data
  -> (Mixture) Model of dataSpace

Yes, it works.
<<
[13]
>>
estCTree estLeafMdl splits ipSet opSet = let
    ...
    ... (32 lines of code)
    ...
  in ...

estCTree
  :: ( [opSpace] -> Model of opSpace )    -- leaf model estimator
  -> ( ipSpace -> [ ipSpace -> Int ] )    -- partitioning
  -> [ipSpace] -> [opSpace]               -- training data
  -> CTree ipSpace opSpace                -- an instance of FunctionModel ipSpace opSpace

-- roughly (and it works)
<<
[14]
>>
Generality
E.g. CTree is more than a (C5) classification-tree....
estFunctionModel2estModel estFn ipOpPairs =
  functionModel2model (uncurry estFn (unzip ipOpPairs))

ft = estCTree (estFunctionModel2estModel estFiniteFunction)   -- e.g.
              splits
              trainingIp trainingOp

-- in effect a FunctionModel-tree, i.e. a regression-tree,
-- automatically, for little effort.

- estFunctionModel2estModel turns
an estimator for a FunctionModel into
an estimator for a Model, for use with estCTree.
NB. Can use estimators other than estFiniteFunction!!
- (E.g. similarly, FunctionModel-mixtures, etc.)
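The conversion can be sketched over concrete types. Only the body of estFunctionModel2estModel follows the slide; the representations, the body of functionModel2model (which treats inputs as given, at no cost) and the toy estimator estConstant are assumptions made for illustration.

```haskell
type Probability   = Double
type MessageLength = Double

-- Assumed representations, as simple as possible.
data ModelType dataSpace = MPr MessageLength (dataSpace -> Probability)
newtype FunctionModelType ip op = FM (ip -> op -> Probability)

pr :: ModelType dataSpace -> dataSpace -> Probability
pr (MPr _ p) = p

-- Assumption: a function model viewed as a model of (input, output) pairs
-- scores only the output given the input; 0 is a placeholder length.
functionModel2model :: FunctionModelType ip op -> ModelType (ip, op)
functionModel2model (FM f) = MPr 0 (uncurry f)

-- As on the slide: split the pairs, estimate, then convert.
estFunctionModel2estModel
  :: ([ip] -> [op] -> FunctionModelType ip op) -> [(ip, op)] -> ModelType (ip, op)
estFunctionModel2estModel estFn ipOpPairs =
  functionModel2model (uncurry estFn (unzip ipOpPairs))

-- Hypothetical toy estimator: ignores the inputs and uses the relative
-- frequency of each output value (assumes non-empty training data).
estConstant :: Eq op => [ip] -> [op] -> FunctionModelType ip op
estConstant _ ops = FM (\_ y -> count y / fromIntegral (length ops))
  where
    count y = fromIntegral (length (filter (== y) ops))
```

Any estimator with the same shape as estConstant can be passed in, which is the point of the NB. above.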
<<
[15]
Conclusions
A good summer collection
- Models, e.g. probability distributions,
mixtures (unsupervised classification).
- FunctionModels, e.g. curve fitting, regressions,
classification trees (supervised classification),
regression trees.
- TimeSeries, e.g. Markov models.
- Operators and conversion functions on the above.
- General, e.g. estimate a mixture of FunctionModels,
estimate a FunctionModel-tree (regression-tree), etc.
- Have a model of modelling:
a theory, usable in its own right (it runs),
and a rapid prototype for a data mining platform.