
[University Rankings]
 
 
1932 'Brave New World', 1949 'Nineteen Eighty Four', 1990s Great Firewall of China, 2010+ Wikileaks, 2013 Prism etc. (NSA, awareness of).

 
A case of 2 × 222 :  A data-set, D, consists of N pairs. A pair contains two binary (boolean) variables, (X, Y).
D's cases             Y
                 T           F
X     T      p=#(T,T)    q=#(T,F)    v=p+q    N=v+w
      F      r=#(F,T)    s=#(F,F)    w=r+s
Encode the data-set using two different methods. Note that each method can take advantage of positive or negative correlation between X and Y (dare one say of causality, X→Y?).
(i) Encode the data under a 4-state distribution, {1:(T,T), 2:(T,F), 3:(F,T), 4:(F,F)}:
pr1(D) = p! q! r! s! 3! / (N+3)!     (code_length1 = -log2 pr1 bits.)
(ii) Encode X as a 2-state distribution, and Y as one of two 2-states, one for each case of X:
pr2(D) = {v! w! / (N+1)!} {p! q! / (v+1)!} {r! s! / (w+1)!} = p! q! r! s! / {(N+1)! (v+1) (w+1)}
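So,  pr1 / pr2 = 3! (v+1) (w+1) / ((N+2) (N+3)) < 6 / 4, because (v+1) (w+1) is largest when v = w = N/2, and  pr2 / pr1 ≤ (N+2) (N+3) / (6 (N+1)), because (v+1) (w+1) is smallest, N+1, when v = N or w = N. The latter bound is roughly (N+3) / 6 for large N.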
If v ≈ w, method (i) has the shorter code length, about log2(1.5) ≈ 0.585 bits less than that of method (ii) -- at most just a fraction of a bit less for the entire data-set. But if, say, v/N→1 and w/N→0, then pr2 > pr1, and method (ii)'s code length is up to roughly log2(N) bits shorter for the data-set (unbounded per data-set, but < log2(N)/N per datum). (Of course, similar considerations also apply to (ii') Y; (X|Y).)
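The two code lengths can be checked numerically. The following is a minimal sketch, assuming only the formulae for pr1 and pr2 given above; the function names (log2_factorial, code_length1, code_length2) are made up for this illustration.

```python
from math import lgamma, log, log2

def log2_factorial(n):
    """log2(n!) computed via the log-gamma function, so large N is fine."""
    return lgamma(n + 1) / log(2)

def code_length1(p, q, r, s):
    """Method (i): -log2( p! q! r! s! 3! / (N+3)! ) bits."""
    n = p + q + r + s
    counts = sum(log2_factorial(c) for c in (p, q, r, s))
    return log2_factorial(n + 3) - log2_factorial(3) - counts

def code_length2(p, q, r, s):
    """Method (ii): -log2( p! q! r! s! / ((N+1)! (v+1) (w+1)) ) bits."""
    v, w = p + q, r + s
    n = v + w
    counts = sum(log2_factorial(c) for c in (p, q, r, s))
    return log2_factorial(n + 1) + log2(v + 1) + log2(w + 1) - counts

if __name__ == "__main__":
    # Balanced X (v = w) versus very skewed X (w = 0), both with N = 100.
    for table in [(25, 25, 25, 25), (50, 50, 0, 0)]:
        l1, l2 = code_length1(*table), code_length2(*table)
        print(table, "method (i):", l1, "bits; method (ii):", l2, "bits")
```

With v = w the difference, method (ii) minus method (i), stays below log2(1.5) bits; with w = 0 method (ii) is the shorter of the two.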
So why are the probabilities in (i) and (ii) different? Method (i) assumes a uniform prior over the 4-state's three parameters, ⟨pr((T,T)), pr((T,F)), pr((F,T))⟩. Method (ii) assumes a uniform prior over the parameter, pr(X=T), of X's 2-state, and a uniform prior on the parameter of each of Y's 2-states, pr(Y=T|X=T) and pr(Y=T|X=F). These are subtly different assumptions.
(Recall that the adaptive code (Boulton & Wallace [1969], [MML]) transmits a data-set of k-state values, [1..k]^N, in -log2((#1! ... #k! (k-1)!) / (N+k-1)!) bits. It is optimal for a uniform prior; for a non-uniform prior initialise the "counters" to values other than one.)
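The adaptive code can be sketched directly: keep one counter per state, start every counter at one, code each value with probability counter/total, then increment that counter. The sketch below assumes exactly that scheme and compares the running total with the closed form above; the names adaptive_code_length and closed_form_length are hypothetical.

```python
from math import factorial, log2

def adaptive_code_length(data, k):
    """Bits needed to send `data` (values in 1..k), counters initialised to one."""
    counts = [1] * k
    total = k
    bits = 0.0
    for x in data:
        bits -= log2(counts[x - 1] / total)  # cost of this value under current counts
        counts[x - 1] += 1                   # then update the counter
        total += 1
    return bits

def closed_form_length(data, k):
    """-log2( (#1! ... #k! (k-1)!) / (N+k-1)! ) bits."""
    n = len(data)
    numerator = factorial(k - 1)
    for state in range(1, k + 1):
        numerator *= factorial(data.count(state))
    return log2(factorial(n + k - 1)) - log2(numerator)

if __name__ == "__main__":
    sample = [1, 1, 2, 4, 1, 3, 1, 1]  # made-up 4-state data
    print(adaptive_code_length(sample, 4), closed_form_length(sample, 4))
```

The two numbers agree whatever the order of the data, because the product of the per-value probabilities depends only on the final counts.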

 
From observation at the local lake, about half of the birds are coots, 30% are ducks, and 20% are swans.
Most ducks and swans, say 90%, have been seen to waddle. No coot has been seen to waddle (but maybe one could), pr(B waddles|B is a coot) = 0.1, say.
Most ducks, say 90%, have been heard to quack. No coot has been heard quacking, pr(B quacks|B is a coot) = 0.1, say. Similarly for swans, pr(B quacks|B is a swan) = 0.1.
Someone reports that a certain bird, X, was observed to waddle and to quack. What species, S, is X?
pr(B is a S|B waddles & B quacks) ∝ pr(B is a S) . pr(B waddles|B is a S) . pr(B quacks|B is a S),
pr(X is a coot) ∝ 0.5 × 0.1 × 0.1 = 0.005,
pr(X is a duck) ∝ 0.3 × 0.9 × 0.9 = 0.243,
pr(X is a swan) ∝ 0.2 × 0.9 × 0.1 = 0.018,
total  0.005 + 0.243 + 0.018 = 0.266.
pr(X is a duck | X waddles, X quacks) = 0.243 / 0.266 ≈ 0.91 -- if it walks like a duck and talks like a duck it is (probably) a duck, according to naive Bayes. (Bayes because of the use of Bayes's theorem, and naive because waddling and quacking are assumed to be independent of each other given the species.)
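The arithmetic above can be reproduced in a few lines. This is only a sketch of the naive Bayes calculation using the numbers quoted in the text; the dictionary and function names are invented for the example.

```python
# Priors and per-species likelihoods, as stated in the text.
prior      = {"coot": 0.5, "duck": 0.3, "swan": 0.2}
pr_waddles = {"coot": 0.1, "duck": 0.9, "swan": 0.9}
pr_quacks  = {"coot": 0.1, "duck": 0.9, "swan": 0.1}

def posterior(prior, *likelihoods):
    """Multiply the prior by each observed feature's likelihood, then normalise."""
    unnormalised = {}
    for species, p in prior.items():
        for likelihood in likelihoods:
            p *= likelihood[species]
        unnormalised[species] = p
    total = sum(unnormalised.values())
    return {species: u / total for species, u in unnormalised.items()}

if __name__ == "__main__":
    for species, p in posterior(prior, pr_waddles, pr_quacks).items():
        print(species, round(p, 3))
```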

 
The Federal Court of .au ruled [FCA 65] against 'Cancer Voices [.au],' and for "US-based company Myriad Genetics and Melbourne-based Genetic Technologies, over the patent on a breast and ovarian cancer gene known as BRCA1 ... Justice John Nicholas ruled that the gene could be patented, as it had been isolated completely separately from the human body.", -- [abc][15/2/2013]. Also see FCA65@austlii [www][2/2013].  A pity, I think.
13 June 2013: Good news and worse news? "... we hold that a naturally occurring DNA segment is a product of nature and not patent eligible merely because it has been isolated, but that cDNA [complementary DNA] is patent eligible because it is not naturally occurring. ..." -- Justice Clarence Thomas, [supremecourt.gov][13/6/2013] (No. 12-398). (Also see [bbc], [the G.].)
7 October 2015, not patentable: "The [High] court [of Australia] found that while the discovery of the [BRCA-1] gene was a product of human action, to consider it an invention would stretch the law too far." -- [abc][7/10/2015].

 
The International Table Soccer Federation [ITSF] (i.e., foosball) has [rules] and videos of past championships [www] online.

 
Have finally disentangled the mathematics in the various meandering explanations of the von Mises-Fisher probability distribution on directions in R^D and of MML-ing it.

 
In some cultures sons are valued more than daughters and the male:female sex ratio at birth is much higher than one (ultra-sound, abortion, ...); the ratio is reported to be as high as 1.19:1 in China (WDB). Fisher (1930) showed that natural selection drives the ratio to 1:1 : Every child has one mother and one father. If there is an excess of males, a male has a lower chance of having children than a female. (And v.v. if there is an excess of females.) So, someone having a daughter in such a culture is more likely to have grandchildren than someone having a son. A tendency to have daughters is being selected for. Just give nature time.
(Note, selection drives the ratio at reproductive age to 1:1. The argument does not hold for all species, e.g., where females have multiple young, over time, after a single mating, say. Search for [sex ratio biology] in the [Bib].)

 
The alien computer design in A for Andromeda (1961) still looks more than a match for a human in terms of neuron numbers, but not in synapses; there again, there's the matter of speed.

 
The stable marriage problem featured in the 2012 Nobel prize for Economics.

 
Dilbert is a documentary.

 
The good old Iterated Prisoners' Dilemma (IPD).

 
Enumerating all sequences of n pairs of matched brackets is equivalent to generating rooted, ordered, k-ary trees.
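One way to enumerate the bracket sequences is a simple recursive generator. It is a sketch, not necessarily the method on the linked page, and the function name brackets is made up here. The number of sequences for n pairs is the n-th Catalan number (1, 1, 2, 5, 14, ...).

```python
def brackets(n):
    """Yield every sequence of n pairs of matched brackets, in lexicographic order."""
    def extend(prefix, opened, closed):
        if closed == n:            # all n pairs are complete
            yield prefix
            return
        if opened < n:             # may still open another bracket
            yield from extend(prefix + "(", opened + 1, closed)
        if closed < opened:        # may close a bracket that is currently open
            yield from extend(prefix + ")", opened, closed + 1)
    yield from extend("", 0, 0)

if __name__ == "__main__":
    for sequence in brackets(3):
        print(sequence)
```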

 
The Jacobi algorithm finds Eigen things of a real, symmetric matrix.
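A minimal sketch of the cyclic Jacobi method in plain Python follows. It assumes a small, real, symmetric matrix given as a list of lists; the function name jacobi_eigen and its parameters sweeps and tol are hypothetical, and the linked page may organise the computation differently. Each plane rotation zeroes one off-diagonal element, and repeated sweeps drive the matrix towards diagonal form.

```python
from math import atan, cos, sin, pi

def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Return (eigenvalues, v) where the columns of v are the eigenvectors."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]               # working copy
    v = [[float(i == j) for j in range(n)] for i in range(n)]   # accumulated rotations
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                       # off-diagonal part is negligible
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle with |theta| <= pi/4 that zeroes a[p][q].
                theta = pi / 4 if diff == 0.0 else 0.5 * atan(2.0 * a[p][q] / diff)
                c, s = cos(theta), sin(theta)
                for k in range(n):          # a <- a J  (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):          # a <- J^T a  (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):          # v <- v J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

if __name__ == "__main__":
    values, vectors = jacobi_eigen([[2, 0, 0], [0, 3, 4], [0, 4, 9]])
    print(sorted(values))
```

The eigenvalues are returned in no particular order, so sort them if an order is needed.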

 
I really wish I had invented the Burrows Wheeler transform, in which case it would not be known as the BWT.
