- [University Rankings]
- 1932 'Brave New World',
1949 'Nineteen Eighty Four',
1990s Great Firewall of China,
2010+ Wikileaks,
2013 Prism etc.
(NSA, awareness of).
- A case of 2 × 2 ≠ 2 ⊕ 2 :
A data-set, D, consists of N pairs. A pair contains two binary (boolean) variables, (X, Y). D's cases:

           Y=T         Y=F
  X=T   p=#(T,T)    q=#(T,F)    v=p+q
  X=F   r=#(F,T)    s=#(F,F)    w=r+s
                                N=v+w

- Encode the data-set using two different methods. Note that
each method can take advantage of positive or negative correlation
between X and Y (dare one say of causality, X→Y?).
- (i) Encode the data under a 4-state distribution†, {1:(T,T), 2:(T,F), 3:(F,T), 4:(F,F)}:
- pr1(D) = p! q! r! s! 3! / (N+3)!
(code_length1 = -log2(pr1) bits.)
- (ii) Encode X as a 2-state distribution, and Y as one of two 2-states, one for each case of X:
- pr2(D)
= {v! w! / (N+1)!}
{p! q! / (v+1)!} {r! s! / (w+1)!}
= p! q! r! s! / {(N+1)! (v+1) (w+1)}
- So,
pr1 / pr2
= 3! (v+1) (w+1) / ((N+2) (N+3)) < 6 / 4, and pr2 / pr1 ≤ (N+2) (N+3) / (6 (N+1)), roughly (N+4) / 6 for large N.
- If v ≈ w, method (i) has the shorter code length, about log2(1.5) bits less than that of method (ii)
-- at most just a fraction of a bit less for the entire data-set. But if, say, v/N→1 and w/N→0, then pr2 > pr1, and method (ii)'s code length is up to roughly log2(N) bits shorter for the data-set (unbounded per data-set, but < log2(N)/N bits per datum). (Of course, similar considerations also apply to (ii') Y; (X|Y).)
- So why are the probabilities in (i) and (ii) different? Method (i) assumes a uniform prior over the 4-state's three parameters, 〈pr((T,T)), pr((T,F)), pr((F,T))〉. Method (ii) assumes a uniform prior over the parameter, pr(X=T), of X's 2-state, and a uniform prior on the parameter of each of Y's 2-states, pr(Y=T|X=T) and pr(Y=T|X=F). These are subtly different assumptions.
- †(Recall that the adaptive code
(Boulton & Wallace
[1969],
[MML])
transmits a data-set of k-state values,
[1..k]^N, in
-log2((#1! ... #k! (k-1)!) / (N+k-1)!) bits. It is optimal for a uniform prior; for a non-uniform prior initialise the "counters" to values other than one.)
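- The two code lengths above can be checked numerically. The following is a minimal sketch, not the author's code: the function names are made up for illustration, and log2(n!) is computed through `math.lgamma` to avoid very large integers. It applies the adaptive code's formula once to the four counts for method (i), and three times (to X, to Y given X=T, and to Y given X=F) for method (ii).

```python
from math import lgamma, log

def log2_factorial(n):
    # log2(n!) computed via the log-gamma function
    return lgamma(n + 1) / log(2)

def adaptive_code_length(counts):
    """Bits needed by the adaptive code for a data-set with these state counts:
    -log2( (#1! ... #k! (k-1)!) / (N+k-1)! )."""
    k = len(counts)
    n = sum(counts)
    log2_pr = (sum(log2_factorial(c) for c in counts)
               + log2_factorial(k - 1)
               - log2_factorial(n + k - 1))
    return -log2_pr

def code_lengths(p, q, r, s):
    """Return (code_length1, code_length2) in bits for the 2 x 2 table of counts."""
    v, w = p + q, r + s
    len1 = adaptive_code_length([p, q, r, s])        # method (i): one 4-state
    len2 = (adaptive_code_length([v, w])             # method (ii): X ...
            + adaptive_code_length([p, q])           # ... then Y given X=T
            + adaptive_code_length([r, s]))          # ... then Y given X=F
    return len1, len2

# Balanced case, v = w: method (i) should be slightly shorter.
print(code_lengths(25, 25, 25, 25))
# Skewed case, w = 0: method (ii) should be shorter.
print(code_lengths(50, 50, 0, 0))
```

The difference len2 - len1 equals log2(pr1 / pr2), so it can be compared directly with 3! (v+1) (w+1) / ((N+2) (N+3)).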
- From observation at the local lake, about half of the birds are coots, 30% are ducks, and 20% are swans.
- Most ducks and swans, say 90%, have been seen to waddle.
No coot has been seen to waddle (but maybe one could),
pr(B waddles|B is a coot) = 0.1, say.
- Most ducks, say 90%, have been heard to quack. No coot has been heard quacking,
pr(B quacks|B is a coot) = 0.1, say. Similarly for swans.
- Someone reports that a certain bird, X, was observed to waddle and to quack. What species, S, is X?
- pr(B is a S|B waddles & B quacks) ∝ pr(B is a S) . pr(B waddles|B is a S) . pr(B quacks|B is a S),†
- pr(X is a coot) ∝ 0.5 × 0.1 × 0.1 = 0.005,
pr(X is a duck) ∝ 0.3 × 0.9 × 0.9 = 0.243,
pr(X is a swan) ∝ 0.2 × 0.9 × 0.1 = 0.018,
total 0.005 + 0.243 + 0.018 = 0.266.
- pr(X is a duck | X waddles, X quacks) = 0.243 / 0.266 ≈ 0.91
-- if it walks like a duck and talks like a duck it is
(probably) a duck, according to naive Bayes.
(Bayes because of the use of
Bayes's
theorem†, and
naive because waddling and quacking are assumed to be independent, given the species.)
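- The arithmetic above can be reproduced in a few lines. This is a minimal sketch using the probabilities stated for the lake. The dictionary and function names are invented for illustration.

```python
# Prior over species, and per-species probabilities of each observation,
# as given in the text.
prior = {"coot": 0.5, "duck": 0.3, "swan": 0.2}
pr_waddles = {"coot": 0.1, "duck": 0.9, "swan": 0.9}
pr_quacks = {"coot": 0.1, "duck": 0.9, "swan": 0.1}

def posterior(prior, *likelihoods):
    """Naive Bayes: multiply the prior by each likelihood, then normalise."""
    joint = dict(prior)
    for likelihood in likelihoods:
        for species in joint:
            joint[species] *= likelihood[species]
    total = sum(joint.values())
    return {species: value / total for species, value in joint.items()}

post = posterior(prior, pr_waddles, pr_quacks)
for species, pr in post.items():
    print(f"pr(X is a {species} | X waddles, X quacks) = {pr:.3f}")
```

Passing only `pr_waddles` gives the posterior after the first observation alone, which shows how much the quack adds.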
- The Federal Court
of .au ruled [FCA 65]
against 'Cancer Voices [.au],' and
for "US-based company Myriad Genetics and
Melbourne-based Genetic Technologies, over the
patent on a breast and ovarian cancer gene known as BRCA1 ...
Justice John Nicholas ruled that the gene could be patented, as it
had been isolated completely separately from the human body.",
-- [abc][15/2/2013]. Also see FCA65@austlii [www][2/2013]. A pity, I think.
- 13 June 2013:
Good news and worse news?
"... we hold that a naturally occurring DNA
segment is a product of nature and not patent eligible
merely because it has been isolated, but that cDNA [complementary DNA]
is patent eligible because it is not naturally occurring. ..."
-- Justice Clarence Thomas,
[supremecourt.gov][13/6/2013] (No. 12-398). (Also see [bbc], [the G.].)
- 7 October 2015, not patentable: "The [High] court [of Australia] found that while the discovery of the [BRCA-1] gene was a product of human action, to consider it an invention would stretch the law too far."
-- [abc][7/10/2015].
- The International Table Soccer Federation
[ITSF]
(i.e., foosball) has
[rules]
and videos of past championships
[www] online.
- Have finally disentangled the mathematics in the
various meandering explanations of the
von Mises-Fisher probability distribution on directions in R^D and of MML-ing it.
- In some cultures sons are valued more than daughters and the male:female sex ratio at birth is much higher than one (ultra-sound, abortion, ...); the ratio is reported to be as high as 1.19:1 in China (WDB). Fisher (1930) showed that natural selection drives the ratio to 1:1 : Every child has one mother and one father. If there is an excess of males, a male has a lower chance of having children than a female. (And v.v. if there is an excess of females.) So, someone having a daughter in such a culture is more likely to have grandchildren than someone having a son. A tendency to have daughters is being selected for. Just give nature time.
- (Note,
selection drives the ratio at reproductive age to 1:1.
The argument does not hold for all species, e.g.,
where females have multiple young, over time, after a single mating, say.
Search for
[sex ratio biology] in the [Bib].)
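- Fisher's argument comes down to one division: every child has exactly one father and one mother, so the same number of children is shared among the males and among the females. A minimal sketch with hypothetical head-counts follows. It assumes, for illustration only, that the 1.19:1 ratio persists to reproductive age, and the number of children is arbitrary.

```python
def expected_children(males, females, children):
    """Mean number of children per male and per female,
    given that each child has one father and one mother."""
    return children / males, children / females

# Hypothetical generation: 119 males, 100 females, 219 children in total.
per_male, per_female = expected_children(males=119, females=100, children=219)
print(f"per male: {per_male:.2f}, per female: {per_female:.2f}")
```

Whatever the total number of children, the per-female mean exceeds the per-male mean by the factor males/females, so in this setting a daughter is expected to yield more grandchildren than a son.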
- The alien computer design in
A for Andromeda
(1961) still looks more than a match for a human
in terms of neuron numbers, but not in synapses;
there again, there's the matter of speed.
- The
stable marriage problem
featured in the 2012 Nobel prize for Economics.
- Dilbert
is a documentary.
- The good old
Iterated Prisoners' Dilemma (IPD).
- Enumerating all sequences of n pairs of
matched brackets
is equivalent to generating rooted, ordered, k-ary trees.
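- A minimal sketch of the enumeration, together with one well-known correspondence: each sequence of n bracket pairs maps to a distinct rooted, ordered tree with n+1 nodes. Trees of unrestricted arity are used here, which is an assumption about the intended reading. The function names are invented for illustration.

```python
def brackets(n):
    """Yield every sequence of n pairs of matched brackets, as a string."""
    def extend(prefix, opened, closed):
        if closed == n:
            yield prefix
            return
        if opened < n:                      # may still open a bracket
            yield from extend(prefix + "(", opened + 1, closed)
        if closed < opened:                 # may close an unmatched bracket
            yield from extend(prefix + ")", opened, closed + 1)
    yield from extend("", 0, 0)

def to_tree(seq):
    """Map a matched-bracket string to a rooted, ordered tree.
    A node is the tuple of its children; each "()" pair is one non-root node."""
    stack = [[]]                            # stack[0] collects the root's children
    for ch in seq:
        if ch == "(":
            stack.append([])                # start a new node
        else:
            children = stack.pop()          # finish it, attach to its parent
            stack[-1].append(tuple(children))
    return tuple(stack[0])

for seq in brackets(3):
    print(seq, to_tree(seq))
```

The number of sequences for n = 1, 2, 3, 4 is 1, 2, 5, 14, the Catalan numbers.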
- The
Jacobi
algorithm finds the eigenvalues and eigenvectors of a real, symmetric matrix.
- I really wish I had invented the
Burrows Wheeler
transform, in which case it would not be known as the BWT.