How much Information can one learn about a Binomial Distribution?

The binomial provides a simple example about information,
is important, e.g., consider: {+ve, - ve}, {true, false}, {H, T}, {purine, pyrimidine}, {hydrophilic, hydrophobic}, . . . , and
is a component of many structured models, e.g.,: Markov model, classification tree (etc.), Bayesian net, segmentation (of binary seq. or image), . . .
2-state:: Pr(H) = θ, 0 ≤ θ ≤ 1,; Pr(T) = 1 - θ
Binomial, n trials:: Pr(#H = k) = ⁿC_k θ^k (1 - θ)^n-k, 0 ≤ k ≤ n
How much information can one learn about θ? . . .

( BTW the question is important, e.g.,; the win/loss record of an AFL team:; L W L L W L W W L W L W; The team changed coach after round 6. Has it paid off?; (i) 6 from 12 (null, any difference is due to chance) v.; (ii) 2 from 6 ++ 4 from 6; Including the information of two distributions, (ii), v. just one distribution, (i), guards against overfitting. )
. . . How much information can one learn about θ? . . .

. . . but they do keep in contact.
Jill sends Jack (i) n, (ii) C.B. entry, & (iii) data | entry.
In (ii) Jill chooses an entry (theory, estimate) for θ out of the code book.

How would Jack and Jill design an optimal code book?

2ⁿ possible sequences of length n, but

wrt θ, only #H matters.

#H ∈ [0, n].

The book's page for a given 'n' corresponds to a partition of [0, n]

Generally the extreme possibilities are not optimal.

And they chose a prior for θ (uniform is simplest option).

There is a polynomial-time algorithm to design the page for 'n':

Complete DAG (!-)
The weight of an edge is the contribution to the expected message-length of the corresponding part of the partition of [0, n].
Seek the shortest path from far left to far right.

Demo:: n=

needs JavaScript on
Observing #H, Jill selects a code book entry (interval [lo, hi], estimate θ) and transmits the entry and the data given that entry.
The amount of information transmitted about θ is - log( (hi - lo + 1) / (n + 1) ).

Bad news:: There is no polynomial-time algorithm to design the code book for the multinomial if k = |{Values}| ≥ 4, and; the case for the trinomial is open [1].
Good news:: There are fast algorithms for accurate estimates of the message-length for the multinomial, and many other useful distributions and models.
[1] G. E. Farr & C. S. Wallace, The Complexity of Strict Minimum Message Length Inference, Computer J., 45(3), pp.285-292, 2002.

Created with "vi (Linux & Solaris)", charset=iso-8859-1