H's essay is a beautiful classic on beauty.
It has been scanned for the web; a search for the title should find it.
- A functional programming (FP) language
- functions are first-class values
- we want statistical models to be first-class values too,
- lazy
- can compute with some infinite structures,
- polymorphic types
- for uniform polymorphism,
- type-classes
- for overloading (ad-hoc polymorphism),
- type inference algorithm (so the programmer rarely states types).
- e.g.
- map f [] = [] -- base case, empty list
- map f (x:xs) = (f x):(map f xs) -- general case
- map :: (t->u) -> [t] -> [u] -- type inferred automatically
- posInts = 1 : (map (\x -> x+1) posInts) -- [1, 2, 3, ...]
-
- See: S. Peyton Jones et al., Report on the Programming Language Haskell 98, 1 Feb 1999, and www.haskell.org
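The list above illustrates laziness and uniform polymorphism with map and posInts, but gives no example of type-classes. A minimal sketch of overloading follows; the class Sized and the function firstSquares are hypothetical names invented for this illustration, not taken from the paper.

```haskell
-- Hypothetical class for illustration only: anything whose size can be counted.
class Sized a where
  size :: a -> Int

-- One overloading of size per instance (ad-hoc polymorphism).
instance Sized Bool where
  size _ = 1

instance Sized a => Sized [a] where
  size xs = sum (map size xs)

instance (Sized a, Sized b) => Sized (a, b) where
  size (x, y) = size x + size y

-- Laziness: only the demanded prefix of the infinite list is computed.
firstSquares :: Int -> [Int]
firstSquares n = take n (map (\x -> x * x) [1 ..])
```

The compiler selects the instance from the argument type, so size (True, [False, True]) combines the pair, list and Bool instances without any type annotations.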
- Minimum Message Length (MML) framework,
- use of this framework is "orthogonal" to the main points of the paper, but
- MML is invariant, consistent & resistant to over-fitting, and
- is (very) compatible with composing sub-models to form models.
- A Model is "like" a value,
-
- a FunctionModel is like a function (->)
- (e.g. a linear regression),
-
- a TimeSeries is like a list ([...])
- (e.g. a hidden Markov model).
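One way to picture the three analogies is as three type-classes. The sketch below is an assumption made for illustration: the method names (nlPr, condNlPr, stepNlPr) and the example types (Uniform, Ignoring, Memoryless) are hypothetical, and the definitions in the paper may well differ.

```haskell
-- A Model is like a value: it assigns a cost (negative log probability) to a datum.
class Model m where
  nlPr :: m d -> d -> Double

-- A FunctionModel is like a function: the cost of an output given an input.
class FunctionModel fm where
  condNlPr :: fm ip op -> ip -> op -> Double

-- A TimeSeries is like a list: the cost of the next datum given the history.
class TimeSeries ts where
  stepNlPr :: ts d -> [d] -> d -> Double

-- Hypothetical example: a uniform model over n equally likely values.
newtype Uniform d = Uniform Int

instance Model Uniform where
  nlPr (Uniform n) _ = log (fromIntegral n)

-- A trivial FunctionModel that ignores its input.
newtype Ignoring m ip op = Ignoring (m op)

instance Model m => FunctionModel (Ignoring m) where
  condNlPr (Ignoring m) _ y = nlPr m y

-- A trivial TimeSeries that ignores the history.
newtype Memoryless m d = Memoryless (m d)

instance Model m => TimeSeries (Memoryless m) where
  stepNlPr (Memoryless m) _ x = nlPr m x

-- Total cost of a sequence: each element is costed given its predecessors.
seqNlPr :: TimeSeries ts => ts d -> [d] -> Double
seqNlPr ts xs = sum [ stepNlPr ts (take i xs) x | (i, x) <- zip [0 ..] xs ]
```

Ignoring and Memoryless show how a plain Model can be lifted into the other two classes, which is the kind of composition the analogies are meant to suggest.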
- parameter 1
- A list [...] of weighted estimators, one per component;
- an estimator takes a data-set, i.e. [dataSpace], and returns a Model.
- A weighted estimator takes a data-set and weights (for fractional assignment, which gives unbiased results), and produces a Model,
- i.e. [[dataSpace] -> [Float] -> Model of dataSpace]
- -- roughly (Model is a class not a type).
- parameter 2
- a data-set, i.e. [dataSpace].
- Result
- a mixture-model of the dataSpace
-
- (Types inferred by the compiler in any case.)
-
- Algorithm - see paper.
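To make the shape of a weighted estimator concrete, here is a minimal sketch. It is not the algorithm from the paper: Normal, estNormalWeighted, pdf and memberships are hypothetical names, Double is used where the slides say Float, and the estimates are plain weighted maximum-likelihood values rather than MML estimates.

```haskell
-- Hypothetical stand-in for "Model of Double": a normal distribution.
data Normal = Normal { mean :: Double, sd :: Double } deriving Show

-- A weighted estimator: a data-set and one weight per datum in, a model out.
type WeightedEstimator d m = [d] -> [Double] -> m

-- Weighted mean and standard deviation (maximum likelihood, for illustration).
estNormalWeighted :: WeightedEstimator Double Normal
estNormalWeighted xs ws = Normal mu (sqrt var)
  where
    total = sum ws
    mu    = sum (zipWith (*) ws xs) / total
    var   = sum (zipWith (\w x -> w * (x - mu) * (x - mu)) ws xs) / total

-- Probability density of a datum under a Normal.
pdf :: Normal -> Double -> Double
pdf (Normal mu s) x =
  exp (negate ((x - mu) * (x - mu)) / (2 * s * s)) / (s * sqrt (2 * pi))

-- Fractional assignment: a datum's membership of each component,
-- given the mixing proportions and the component models.
memberships :: [Double] -> [Normal] -> Double -> [Double]
memberships props comps x = map (/ sum joint) joint
  where
    joint = zipWith (\p c -> p * pdf c x) props comps
```

The memberships of every datum in one component form the weight list passed to that component's weighted estimator, so a datum lying between two components contributes partly to each.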
- parameter 1
- an estimator for a leaf Model
- i.e. ( [opSpace] -> Model of opSpace )
- parameter 2
- a function that produces a list of "ways of partitioning" the input space,
- i.e. ipSpace -> [ . . . ],
- a way of partitioning being a function
- ipSpace -> [Int] -- roughly.
- parameters 3 and 4
- the input and output training data
- [ipSpace] & [opSpace] respectively.
- Result
- a CTree (which is a FunctionModel) of ipSpace and opSpace.
-
- (Types inferred by the compiler in any case.)
-
- Algorithm - see paper.
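The second parameter can be pictured with a deliberately simplified sketch. It assumes, for illustration only, that a way of partitioning maps one input to the index of the part it falls in; Splitter, thresholdSplits and partitionBy are hypothetical names, and the types used in the paper are only given "roughly" above.

```haskell
import Data.List (nub, sort)

-- Hypothetical simplification: a splitter sends an input to the index of its part.
type Splitter ip = ip -> Int

-- Candidate two-way threshold splits drawn from the training inputs.
-- The smallest value is dropped because it would leave the first part empty.
thresholdSplits :: Ord ip => [ip] -> [Splitter ip]
thresholdSplits xs =
  [ \x -> if x < t then 0 else 1 | t <- drop 1 (sort (nub xs)) ]

-- Divide paired training data among nParts parts according to a splitter.
partitionBy :: Int -> Splitter ip -> [(ip, op)] -> [[(ip, op)]]
partitionBy nParts split pairs =
  [ [ p | p@(x, _) <- pairs, split x == k ] | k <- [0 .. nParts - 1] ]
```

A tree estimator could then cost each candidate split by estimating a leaf model on the outputs in each part, and recurse on the parts of the preferred split.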
- estFunctionModel2estModel
- takes an (unweighted) estimator for a FunctionModel ip op and
- returns an estimator for a Model (ip, op),
- i.e. takes a [ip] -> [op] -> FunctionModel of (ip,op)
- and returns [ (ip, op) ] -> Model of (ip, op).
-
- NB. unzip turns a list of pairs into a pair of lists
- (i.e. it rearranges the training data),
- and uncurry f (x, y) = f x y .
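A minimal sketch of how such a conversion might look, using the standard Prelude functions unzip and uncurry. The concrete types FunctionModel and Model, the helpers functionModel2model and cost, and the toy estimator estOffset are all assumptions made for this illustration (the slides note that Model is really a class), and the Double is a toy cost standing in for a negative log probability.

```haskell
-- Hypothetical stand-in types: each wraps a cost function.
newtype FunctionModel ip op = FunctionModel (ip -> op -> Double)
newtype Model d = Model (d -> Double)

-- View a FunctionModel as a Model of (input, output) pairs.
-- Assumption: the input is treated as given, so only the output is costed.
functionModel2model :: FunctionModel ip op -> Model (ip, op)
functionModel2model (FunctionModel f) = Model (uncurry f)

-- The two-line conversion: unzip the pairs, then apply the original estimator.
estFunctionModel2estModel
  :: ([ip] -> [op] -> FunctionModel ip op) -> [(ip, op)] -> Model (ip, op)
estFunctionModel2estModel estFn ipOpPairs =
  functionModel2model (uncurry estFn (unzip ipOpPairs))

-- Toy estimator for demonstration: predicts op = ip + c, cost = squared error.
estOffset :: [Double] -> [Double] -> FunctionModel Double Double
estOffset ips ops = FunctionModel (\x y -> (y - (x + c)) * (y - (x + c)))
  where
    c = sum (zipWith (-) ops ips) / fromIntegral (length ips)

-- Cost of a datum under a Model.
cost :: Model d -> d -> Double
cost (Model f) = f
```

For example, estFunctionModel2estModel estOffset [(1, 3), (2, 4)] learns the offset 2 from the paired data, after which any estimator of FunctionModels can be reused wherever an estimator of Models over pairs is expected.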
The important point is that
estCTree + a two-line function = a new application area.