The “20 Questions Problem”

Let be a random variable.

  • What is the minimum amount of boolean questions needed to identify ?
  • Which questions should be asked
  • Consider a binary code for
  • Once is fixed, we know > we know the codeword
  • We ask the th question in order to obtain the th bit
  • The average number of questions is , which is minimized when is a Huffman code

The sequence of Yes/No answers is a binary codeword associated to

Entropy and Sorting

Sorting with pairwise comparisons of two elements, with binary output Either or

But how many binary comparisons are needed ?

Billiard Balls

We have 14 balls, and we have a ball as a reference (we know it’s normal). At most one ball can be heavier/lighter than the others.

How can find which ball is different ?

We can identify the solution with a random variable

We have Then the we have the size of the alphabet Therefore the amount of ternary symbols needed to represent the source is at least

Which we have the the amount of comparisons as

Hence we need to do at least 3 comparisons (weighings)

Most algorithms can be deduced as a tree of decisions

01.6 Prediction, Learning and Cross-Entropy Loss