The Message Paradigm


Tracy T. Transmitter and Richard R. Receiver get together and select a set of hypotheses, {H0, H1, ...}, to describe data, and design a code book for transmitting two-part messages, where each message consists of (i) an hypothesis and (ii) a data-set given the hypothesis. This allows T and R to write encoder and decoder programs, P and P^-1. Naturally T and R want to use short code words in a message but, at this stage, any data are purely hypothetical, so they must design the code book based on expected data.

Then T and R move apart and the following happens . . .


 
T gets an actual data-set, D.
T chooses an H from the set.
T runs the encoder P on some UTM_T and transmits the message H;D to R.

    |msgLen| = |part1| + |part2|,  part1: code(H),  part2: code(D|H)

R runs the decoder P^-1 on some UTM_R on the received H;D.
R now knows the data-set, D, and also T's opinion, H, of D.

UTM: a universal Turing machine.
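
The exchange can be made concrete with a small sketch. The Python below is illustrative only: the hypotheses, data values, and prefix-free code words are made-up assumptions, not part of the paradigm itself.

# A minimal sketch of the two-part message protocol, assuming a tiny
# hand-made code book shared by T and R. All names and code words here
# are hypothetical.
HYP_CODE = {"H0": "0", "H1": "1"}              # part-1 code words
DATA_CODE = {                                  # part-2 code words, per H
    "H0": {"a": "0", "b": "10", "c": "11"},    # H0: 'a' expected common
    "H1": {"a": "11", "b": "10", "c": "0"},    # H1: 'c' expected common
}

def encode(h, data):
    """T's encoder P: part 1 is code(H), part 2 is code(D|H)."""
    return HYP_CODE[h] + "".join(DATA_CODE[h][d] for d in data)

def decode(msg):
    """R's decoder P^-1: recover H from part 1, then D given H."""
    h = next(k for k, v in HYP_CODE.items() if msg.startswith(v))
    rest, data = msg[len(HYP_CODE[h]):], []
    while rest:                                # prefix-free: unique match
        d, w = next((d, w) for d, w in DATA_CODE[h].items()
                    if rest.startswith(w))
        data.append(d)
        rest = rest[len(w):]
    return h, data

msg = encode("H0", ["a", "a", "b"])            # T transmits H;D ...
assert decode(msg) == ("H0", ["a", "a", "b"])  # ... R recovers H and D

Because both parts use prefix-free codes, R can parse the message unambiguously: part 1 identifies H, and only then does R know which code to use for part 2.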
 
Shannon, |code(X)| = -log(pr(X)), and
Bayes, |code(H&D)| = |code(H)| + |code(D|H)| = |code(D)| + |code(H|D)|,
give  -log(pr(H|D)) = |code(H)| + |code(D|H)| - |code(D)| ~ |code(H)| + |code(D|H)|,
since |code(D)| is the same whichever hypothesis is chosen.
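A toy numerical check may help; the joint distribution pr(H, D) below is an arbitrary assumption, used only to exercise the identities:

from math import log2

pr = {("H0", "d0"): 0.4, ("H0", "d1"): 0.1,    # assumed joint pr(H&D)
      ("H1", "d0"): 0.2, ("H1", "d1"): 0.3}

def length(p): return -log2(p)                  # Shannon, in bits
def pr_H(h): return sum(p for (hh, _), p in pr.items() if hh == h)
def pr_D(d): return sum(p for (_, dd), p in pr.items() if dd == d)

h, d = "H0", "d0"
lhs = length(pr_H(h)) + length(pr[h, d] / pr_H(h))  # |code(H)|+|code(D|H)|
rhs = length(pr_D(d)) + length(pr[h, d] / pr_D(d))  # |code(D)|+|code(H|D)|
assert abs(lhs - rhs) < 1e-9                        # both = |code(H&D)|

# -log pr(H|D) is the two-part length minus the constant |code(D)|:
assert abs(length(pr[h, d] / pr_D(d)) - (lhs - length(pr_D(d)))) < 1e-9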
 
The selection of {H0, H1, ... }, and the issue of what data each Hi best covers, must be considered together in the design of the code book.
 
Being very sensible, T will select an H that is a good model of D. A less sensible individual might not, yet R could still recover D; the message would just be longer:
- log(pr(Hi|D) / pr(Hj|D)) = |code(Hi)|+|code(D|Hi)| - (|code(Hj)|+|code(D|Hj)|),   -- negative log posterior-odds ratio.
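In code, the extra cost of a less sensible choice is just the difference of the two two-part message lengths; the joint distribution below is again an assumption for illustration:

from math import log2

pr = {("H0", "d0"): 0.4, ("H0", "d1"): 0.1,     # assumed joint pr(H&D)
      ("H1", "d0"): 0.2, ("H1", "d1"): 0.3}
def pr_H(h): return sum(p for (hh, _), p in pr.items() if hh == h)
def pr_D(d): return sum(p for (_, dd), p in pr.items() if dd == d)

def two_part(h, d):                              # |code(H)| + |code(D|H)|
    return -log2(pr_H(h)) - log2(pr[h, d] / pr_H(h))

d = "d0"                                         # H0 is the better model here
extra = two_part("H1", d) - two_part("H0", d)    # cost of choosing H1
log_odds = log2((pr["H0", d] / pr_D(d)) / (pr["H1", d] / pr_D(d)))
assert abs(extra - log_odds) < 1e-9              # = -log(pr(H1|D)/pr(H0|D))

So choosing the worse hypothesis costs exactly the negative log posterior-odds ratio in extra message length (here one bit).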
 
Note that, depending on the application area, a data-set could be a single thing, e.g., a genome.