We investigate the role of sufficient statistics in generalized probabilistic
data mining and machine learning software frameworks. We discuss some issues
involved in specifying a statistical model type and show that
it is beneficial to explicitly include a sufficient statistic, and functions
for its manipulation, in the model type's specification.
Instances of such types can then be used by generalized learning
algorithms while maintaining optimal learning time complexity.
Examples are given for problems such as incremental learning and
data partitioning (e.g. change-point problems, decision trees and