The Types of Models
[01]
>>
The Types of Models
Lloyd Allison,
School of Computer Science and
Software Engineering,
Monash University,
Australia 3800.

[20+5]
[more (click)]

Abstract:
The specification of various kinds of statistical model
from machine learning and data mining
is examined formally using the type and class system of the
functional programming language Haskell as a metalanguage.
Types and classes (in the programming sense) of models, and
operations on models, are defined; many are naturally polymorphic.
Convenient conversion functions map between
the classes of models and extend their range of usefulness.
The result is a kind of theory of programming with models,
not only of using them.
The ``theory'' can run as an executable Haskell program or
can throw light on the foundations of platforms for programming
with statistical models.
Presented at the
Second Hawaii International
Conference on Statistics and
Related Fields (HICS03),
Honolulu, 2003 June 58.
This
document can be found at
users.monash.edu.au/~lloyd/Seminars/2003HICS/index.shtml
and includes hyperlinks to other resources.
The Types of Models
<<
[02]
>>
The problem & questions

 What are the products of machinelearning research?
AI, data mining, computational statistics,...

 How [ do  could ] they behave?

 What can you do to them?

 How can two or more of them be combined?


 What are their types and classes?


c.f. SPlus, R, Weka, etc..
Also see [rules].
The Types of Models
<<
[03]
>>
Inspiration
 Functional Programming
i.e. programming with functions

 lambdacalculus,
Lisp, ..., SML, and
Haskell98

 highorder functions (e.g. map, fold, zip,... etc.)

 lazyevaluation
(e.g. ints = 0 : (map ((+) 1) ints))

 polymorphic types
(e.g. map :: (t>u) > [t] > [u])

 automatic typeinference

 typeclasses

 type checked & compiled (fast)

So, trying to create ``programming with (statistical) models''
here
(using Haskell98) ...
The Types of Models
<<
[04]
>>
... programming with models.
Conclusion (i)
 we get:

 highorder (functions of) models _{ }
 e.g. estimate ClassificationTree parameterized by (estimate Leaf Model)
 e.g.^{ }ClassificationTree > RegressionTree

 polymorphic typed (& checked) models _{ }
 e.g. a Model of some Bounded Enum <dataspace>
 e.g.^{ }FunctionModel of <inspace> to <opspace>

 classes of statistical model _{ }
 e.g. (basic) Model, FunctionModel, TimeSeries, ¿other?

 (and lazy evaluation is useful in some algorithms)
NB. Some slight abuse of the type system.
The Types of Models
<<
[05]
>>
Example : (i) Mixture modelling
estMixture ests dataSet = let
...
... (22 lines of (E.M.) algorithm)
...
in mixture( ... .)
estMixture ::
 [ [dataSpace] > [Float] > Model of dataSpace ] ^{ }
 wtd estimators
 > [ dataSpace ] ^{ } _{ }
 training data
 > (Mixture) Model of dataSpace
unsupervised classification, clustering,
yes it works (slight abuse of types above).
NB. Mixture modelling chosen as an example `only'
because it is important in AI, ML, DM, CS.
Explain list [ ], function >.
The Types of Models
<<
[06]
>>
Example : (ii) Classification (decision) Trees
estCTree estLeafMdl splits ipSet opSet = let
...
... (32 lines of code)
...
in ...
estCTree ::
 ( [opSpace] > Model of opSpace ) ^{ }
 leaf model est'
 > ( ipSpace > [ ipSpace > Int ] ) ^{ }
 partitioning
 > [ipSpace] > [opSpace] ^{ }
 training data
 > CTree ipSpace opSpace ^{ }
 an instance of
FunctionModel ipSpace opSpace

  roughly (& it works),
types are inferred,
supervised classification,
tree chosen as `just' another standard AI example.
The Types of Models
<<
[07]
>>
turn a classificationtree into a FunctionModel (regression) tree

 ft = estCTree
(estFunctionModel2estModel
estLinearModel)
 e.g.
 splits ^{ }
 trainingIp ^{ } trainingOp


 estFunctionModel2estModel
_{ } estFM
ipOpPairs =
 functionModel2model
(uncurry estFM (unzip ipOpPairs))


 and the types are inferred and checked.


 (E.g. Similarly, FunctionModelmixtures, etc..)
The Types of Models
<<
[08]
>>
How (i) : (basic) Models
 The most important property of a (class of) statistical model
is ``pr'':


 class Model mdl where

 pr
:: (mdl dataSpace) > dataSpace > Probability
 msg2
:: (mdl dataSpace) > dataSpace > MessageLength
 (datamodel)

 msg :: . . .
(mdl dataSpace)
> dataSpace
> MessageLength
 model + datamodel


  a minimum; maybe^{[*]} a Model
can also do other things.
(^{[*]} probably!)
Explain: must do something about overfitting, use
minimum message length (MML: msg, msg2),
but could use any compositional criterion.
The Types of Models
<<
[09]
>>
example : a Model of non ve Ints
n  code_{2}  bits  cat'n  cum' cat'n 
0  0  1  1  1 
1  100  3  1  2 
2  10100  5  2  4 
3  11000  5 
4  1010100  7  5  9 
...  ...  ... 
wallaceIntModel =  1 3 5 5 7 ...
let
catalans =  1 1 2 5 14 ...
let cats last n =
let next = last*2*(2*n1) `div` n
in (next `div` (n+1)):(cats next (n+1))
in 1 : (cats 1 1)
cumulativeCatalans = scanl1 (+) catalans
in
MMsg 0 (\n > (find n 0 cumulativeCatalans)*2+1)
Laziness useful.
MMsg a constructor of a type which is an instance of Model.
No parameters so msg length is zero.
The Types of Models
<<
[10]
>>
How (ii) : other classes of statistical models
 class FunctionModel fm where

 condModel
:: (fm inSpace opSpace) > inSpace
> ModelType opSpace

 condPr
:: (fm inSpace opSpace) > inSpace
> opSpace
> Probability

 . . .


 class TimeSeries tsm where

 predictors
:: (tsm dataSpace) > [dataSpace]
> [ModelType dataSpace]

 prs
:: (tsm dataSpace) > [dataSpace]
> [Probability]

 . . .
(Also message lengths.)
E.g. regressions, linear regressions, and Markov models, etc..
One day even more classes.
The Types of Models
<<
[11]
>>
SuperModels
Our classes have some common properties;
we need a superclass. Obviously...
 class SuperModel sMdl where
 prior :: sMdl > Probability
 mixture
:: (Mixture mx, SuperModel (mx sMdl)) =>
mx sMdl > sMdl
 msg1 :: sMdl > MessageLength


 class Mixture mx where
 mixer :: (SuperModel t) => mx t > ModelType Int
 components :: (SuperModel t) => mx t > [t]


 instance SuperModel (ModelType dataSpace) where
 msg1 (MPr mdlLen p) = mdlLen
 . . . etc.
The Types of Models
<<
[12]
>>
The Types of Models
<<
[13]
>>
Conversion functions ~ generality
e.g.
 functionModel2model fm ::
(FunctionModel fm) =>
 fm ipSpace opSpace
^{ }> Model of (ipSpace, opSpace)

 &

 estFunctionModel2estModel ::
 ^{ }( ipSpace > opSpace
> FunctionModel of ipSpace opSpace )
> ( [ (ipSpace, opSpace) ]
^{ }> Model of (ipSpace, opSpace) )

(Also recall use with [tree]models.)
(slight abuse of types)
The Types of Models
<<
[14]
>>
Rules of the game
(self imposed
 recall [problem])
 Not:
 _{ }a library of heavyweight machine learning algorithms,
 _{ }external subroutine calls
 _{ }a library of `interpreted' structures, or
 another specialpurpose statistical language.

 Rather:
 _{ }use an existing language (Haskell98)
 _{ }create a ``standard prelude'' of
models (c.f. Lists)
 _{ }so that estimators & models are
wellintegrated with functions, lists, etc.,
 _{ }small building blocks,
e.g. bivariate, model2timeSeries, etc.,
 _{ }use the language's standard features 
 typeclasses, polymorphic types, static typechecking,
highorder functions, lazyevaluation, lists, etc.
because: want some sort of theory/ model/ semantics/ prelude
of models and modelling.
The Types of Models
<<
[15]
Conclusion (ii)
 (basic) Models
 e.g. univariate, multivariate, probability distributions

 FunctionModels
 e.g. regressions, classification trees, regression trees

 TimeSeries
 e.g. Markov models

 Mixtures of the above

 Conversion functions on classes of models and estimators.

 Polymorphic types, statically checked.
Also recall slide <04<.
 Allison, L., Powell, D., Dix, T. I.:
Compression and Approximate Matching,
Computer Journal 42(1) (1999) pp.110 (now generalized)
 Farr, G. E., Wallace, C. S.:
The Complexity of Strict Minimum Message Length Inference,
Computer Journal 45(3) (2002) pp.285292
 Peyton Jones, S. {et al}:
Report on the Programming Language Haskell 98,
http://www.haskell.org/ (19992003), also,
Haskell98 Language and Libraries, the Revised Report,
Cambridge U.P. (2003)
 Wallace, C. S., Freeman, P. R.:
Estimation and Inference by Compact Coding,
Journal of the Royal Statistical Society series B.
49(3) (1987) pp.240265
 Wallace, C. S., Patrick, J. D.: Coding Decision Trees.
Machine Learning 11 (1993) pp.722
6/2003 ©
L. Allison,
School of Computer Science and Software Engineering,
Monash University, Australia 3168.
Created with "vi (Linux & Solaris)", charset=iso88591