The estimation of Bayesian networks makes
a case-study to illustrate
A Bayesian network can be over discrete or continuous attributes (variables), or over discrete and continuous (mixed) attributes.
The main interests here are in
the types and classes of the networks, and
the expressiveness of I.P.
The estimator below is given a permutation of the variables,
but a search over permutations could also be programmed
without change to the types or classes.
estNetwork perm estMV dataSet =
  let
    n = length perm
    search _  []     = []                                  -- done
    search ps (c:cs) =
      let -- use parents ps to predict children c:cs
        opFlag  = ints2flags [c] n                         -- identify the child c ...
        ipFlags = ints2flags ps  n                         -- ... and the parents ps.
        cTree   = estCTree (estAndSelect estMV opFlag)     -- leaf est
                           (splitSelect ipFlags)           -- allowed tests
                           dataSet dataSet                 -- !
      in cTree : (search (c:ps) cs)                        -- i.e. list of class'n trees
    trees  = search [] perm                                -- network
    msgLen = sum (map msg1 trees)                          -- cost
    nlP datum = sum (map (\t -> condNlPr t datum datum) trees)  -- neg log Pr
  in MnlPr msgLen nlP (\() -> "Net:" ++ (show trees))      -- a Model
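To make the recursion in search easier to follow, here is a minimal sketch that traces only the flag vectors it builds, without estimating any trees. The definition of ints2flags shown here is a hypothetical stand-in (it assumes 0-based variable indices and may differ from the real function), and parentChildFlags is a name invented for this illustration.

```haskell
-- Hypothetical stand-in for ints2flags: set a flag for each listed
-- index among n positions (0-based indexing is an assumption).
ints2flags :: [Int] -> Int -> [Bool]
ints2flags is n = [ i `elem` is | i <- [0 .. n - 1] ]

-- Illustration only: list the (parent flags, child flags) pairs that
-- the search visits for a given permutation of the variables.
parentChildFlags :: [Int] -> [([Bool], [Bool])]
parentChildFlags perm = go [] perm
  where
    n = length perm
    go _  []     = []
    go ps (c:cs) = (ints2flags ps n, ints2flags [c] n) : go (c:ps) cs
```

For the permutation [2,0,1], variable 2 has no parents, variable 0 may depend on variable 2, and variable 1 may depend on variables 0 and 2. Each variable can only be predicted from those earlier in the permutation, so the resulting network is acyclic.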
Each node of a Bayesian network contains a classification-tree. A classification-tree can split (test) on both discrete and continuous attributes, and it can model discrete or continuous variables (or other types) in its leaves.
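As a rough illustration of splitting on mixed attributes, the sketch below uses a deliberately simplified tree type. Tree, leafFor and example are hypothetical names for this illustration and are not the classification-tree type used by the estimator above; the leaves hold plain strings where a real tree would hold a model of the output variable.

```haskell
-- Hypothetical, simplified tree; not the library's own type.
data Tree inp leaf
  = Leaf leaf                          -- stands in for a model of the output
  | Fork (inp -> Int) [Tree inp leaf]  -- a test on the inputs picks a subtree

-- Follow the tests down to the leaf that applies to the given inputs.
leafFor :: Tree inp leaf -> inp -> leaf
leafFor (Leaf m)         _ = m
leafFor (Fork test subs) x = leafFor (subs !! test x) x

-- Inputs are a discrete (Bool) and a continuous (Double) attribute:
-- first split on the discrete one, then threshold the continuous one.
example :: Tree (Bool, Double) String
example =
  Fork (\(b, _) -> if b then 0 else 1)
    [ Fork (\(_, x) -> if x < 0.5 then 0 else 1)
        [ Leaf "model A", Leaf "model B" ]
    , Leaf "model C"
    ]
```

Here leafFor example (True, 0.3) reaches "model A", while any input whose first component is False reaches "model C" without testing the continuous attribute.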
The Bayesian network estimator uses
the estimator for ModelMV,
a new type of model for multivariate data-spaces.
data ModelMV dataSpace =
  MnlPrMV ([Bool] -> MessageLength)               -- msg1
          ([Bool] -> dataSpace -> MessageLength)  -- msg2
          ([Bool] -> String)                      -- Show
          Int                                     -- width
Certain dimensions can be 'select'-ed by Boolean flags...
Class Project, as in 'projection', allows the Bayes Net estimator to select input- and output-subspaces of the data-space to be 'parents' and 'children'.
class Project t where
  select :: [Bool] -> t -> t  -- non-selected parts become "identities"
  selAll :: t -> [Bool]       -- i.e. all "on" flags
An instance of class Project is some 'multi-dimensional' value where certain dimensions can be 'select'-ed.
(In addition, a class Splits2, a variation on the standard class Splits, enables independent and dependent dimensions of the data-space to be selected for the estimator of classification trees, as used in the nodes of a Bayesian network.)
instance Project (ModelMV ds) where
  select bs (MnlPrMV m n s w) =
    MnlPrMV (\bs' -> m (zipWith (&&) bs bs'))
            (\bs' -> \datum -> n (zipWith (&&) bs bs') datum)
            (\bs' -> s (zipWith (&&) bs bs'))
            w
  selAll (MnlPrMV _ _ _ w) = replicate w True
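The effect of select can be seen on a toy model. The sketch below repeats the declarations above so that it stands alone, and adds some assumptions of its own: MessageLength is taken to be Double, and toyMV and msg2All are hypothetical names invented for this illustration. The toy model charges |x| for each selected component x of a three-dimensional datum.

```haskell
type MessageLength = Double  -- assumption: a plain Double stands in here

data ModelMV dataSpace =
  MnlPrMV ([Bool] -> MessageLength)               -- msg1
          ([Bool] -> dataSpace -> MessageLength)  -- msg2
          ([Bool] -> String)                      -- Show
          Int                                     -- width

class Project t where
  select :: [Bool] -> t -> t
  selAll :: t -> [Bool]

instance Project (ModelMV ds) where
  select bs (MnlPrMV m n s w) =
    MnlPrMV (\bs' -> m (zipWith (&&) bs bs'))
            (\bs' -> \datum -> n (zipWith (&&) bs bs') datum)
            (\bs' -> s (zipWith (&&) bs bs'))
            w
  selAll (MnlPrMV _ _ _ w) = replicate w True

-- Hypothetical toy model of width 3: one unit of cost per selected
-- dimension for the model, and |x| per selected component of a datum.
toyMV :: ModelMV [Double]
toyMV = MnlPrMV
  (\bs -> fromIntegral (length (filter id bs)))
  (\bs xs -> sum [ abs x | (True, x) <- zip bs xs ])
  (\bs -> "toy" ++ show bs)
  3

-- Hypothetical helper: cost of a datum with every flag "on".
msg2All :: ModelMV ds -> ds -> MessageLength
msg2All m@(MnlPrMV _ n _ _) d = n (selAll m) d
```

With all dimensions selected, msg2All toyMV [1,2,3] adds up all three components. After select [True,False,True] toyMV the middle component no longer contributes, even when the caller later asks for all flags, because the masks are combined with (&&).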
It took one day to implement the Bayesian networks case-study in
As an application,
a Bayesian network was learned for search and rescue data