How do we communicate? We reveal the values of a sequence of random variables that we call symbols.
Hartley
Information is measured in bits: a symbol with $|\mathcal A|$ possible values carries $\log_2 |\mathcal A|$ bits, because then the sum of the information of independent symbols equals the logarithm of the number of joint possibilities.
Example
If there are $m$ weather states, then the sum of the information for the weather in two places is $\log_2 m + \log_2 m = \log_2 m^2$, because there are $m \cdot m = m^2$ possibilities of weather configurations.
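A quick numerical sanity check of this additivity (a sketch: the value $m = 4$ and the variable names are illustrative choices, not from the notes):

```python
import math

m = 4  # hypothetical number of weather states; any m > 1 works

info_one = math.log2(m)        # information of the weather in one place
info_joint = math.log2(m * m)  # information of the joint configuration of two places

# Additivity: summing the two individual informations gives the joint information
assert math.isclose(info_one + info_one, info_joint)
print(info_one, info_joint)  # 2.0 4.0
```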
But something is not quite right about this way of measuring information.
Example
Something really unlikely is just “as informative” as a really likely event, since this measure counts only the number of possibilities, not their probabilities.
Entropy
Quantifies “randomness”
Definition
$$H(S) = -\sum_{s \in \mathcal A} p(s) \log p(s)$$
which can also be written as
$$H(S) = \sum_{s \in \mathcal A} p(s) \log \frac{1}{p(s)} = E\left[ \log \frac{1}{p(S)} \right]$$
We also have the convention $0 \log 0 = 0$.
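A minimal sketch of this definition in Python (the function name `entropy` and the choice of base-2 logarithms are mine):

```python
import math

def entropy(p):
    """H(S) = sum_s p(s) log2(1/p(s)), using the convention 0 log 0 = 0."""
    return sum(ps * math.log2(1.0 / ps) for ps in p if ps > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin
print(entropy([1.0, 0.0]))   # 0.0 bits: deterministic, no randomness
print(entropy([0.25] * 4))   # 2.0 bits: uniform over 4 symbols
```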
Binary Entropy
When $|\mathcal A| = 2$, we have two possible values with probabilities $p$ and $1 - p$. The entropy is given by the binary entropy function:
$$h(p) = -p \log p - (1 - p) \log (1 - p)$$
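A sketch of the binary entropy function, again assuming base-2 logs:

```python
import math

def h(p):
    """Binary entropy h(p) = -p log2 p - (1 - p) log2 (1 - p)."""
    if p in (0.0, 1.0):  # convention 0 log 0 = 0
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(h(0.5))   # 1.0: one full bit, maximal uncertainty
print(h(0.11))  # ~0.5: a biased coin is less random
```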
Lemma
For a positive real $x$, we have
$$\log x \le (x - 1) \log e$$
with equality if and only if $x = 1$.
Proof
Since all logarithms are equal up to a constant factor, it suffices to prove the statement for the natural log: $\ln x \le x - 1$. Let $f(x) = x - 1 - \ln x$; then $f'(x) = 1 - \frac{1}{x}$ vanishes only at $x = 1$, where $f$ attains its minimum $f(1) = 0$. Hence $f(x) \ge 0$ for all $x > 0$, with equality only at $x = 1$.
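A quick numerical check of the lemma (the grid of test points is arbitrary):

```python
import math

# Check log2(x) <= (x - 1) * log2(e) on a grid, with equality only at x = 1
for x in (0.1, 0.5, 0.9, 1.0, 1.1, 2.0, 10.0):
    lhs = math.log2(x)
    rhs = (x - 1) * math.log2(math.e)
    assert lhs <= rhs + 1e-12
    print(f"x = {x:>4}: {lhs:+.4f} <= {rhs:+.4f}")
```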
Theorem
The entropy of a discrete random variable $S$ over alphabet $\mathcal A$ satisfies
$$0 \le H(S) \le \log |\mathcal A|$$
with equality on the left for a singular $S$ (all the probability on a single symbol), and equality on the right for a uniform $S$.
Proof
Inequality on the left: since $0 \le p(s) \le 1$, every term $p(s) \log \frac{1}{p(s)}$ is non-negative, hence $H(S) \ge 0$. Equality requires each term to vanish, i.e. each $p(s) \in \{0, 1\}$; note that $p(s) \log \frac{1}{p(s)} \to 0$ as $p(s) \to 0$, so really rare symbols contribute almost nothing.
Proof
Inequality on the right: we prove that $H(S) - \log|\mathcal A| \le 0$. Using $\sum_s p(s) = 1$,
$$H(S) - \log|\mathcal A| = -\sum_s p(s) \log p(s) - \log|\mathcal A| = \sum_s p(s) \left\{ -\log p(s) - \log |\mathcal A| \right\}$$
$$= \sum_s p(s) \left( -\log (p(s) |\mathcal A|) \right) = \sum_s p(s) \log \frac{1}{p(s)|\mathcal A|}$$
$$\le \sum_s p(s) \left[ \frac{1}{p(s)|\mathcal A|} - 1 \right] \log e = \left\{ \sum_s \frac{1}{|\mathcal A|} - \sum_s p(s) \right\} \log e \le 0$$
where the first inequality is the lemma applied to $x = \frac{1}{p(s)|\mathcal A|}$, and the last holds because the sums run over the at most $|\mathcal A|$ symbols with $p(s) > 0$. Equality throughout requires $p(s)|\mathcal A| = 1$ for every $s$, i.e. $S$ uniform.
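A randomized sanity check of both bounds (a sketch: the alphabet size $|\mathcal A| = 8$ and the sampling scheme are illustrative choices):

```python
import math
import random

def entropy(p):
    """H(S) with the convention 0 log 0 = 0, in bits."""
    return sum(ps * math.log2(1.0 / ps) for ps in p if ps > 0)

A = 8  # illustrative alphabet size |A|

for _ in range(1000):
    # Draw a random distribution over A symbols by normalizing random weights
    w = [random.random() for _ in range(A)]
    total = sum(w)
    p = [wi / total for wi in w]
    assert -1e-9 <= entropy(p) <= math.log2(A) + 1e-9

# The two extremes attain the bounds
print(entropy([1.0] + [0.0] * (A - 1)))  # 0.0: lower bound, singular S
print(entropy([1.0 / A] * A))            # 3.0 = log2(8): upper bound, uniform S
```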