Thursday, December 29, 2022

[Book Summary - Information Theory] Information Theory: A Tutorial Introduction, Ch 5 by James V. Stone

1. Entropy: Average surprisal 

2. Discrete vs. continuous variables  

  • Continuous variables: divide the continuum into bins  
  • Entropy changes with the number of bins
    • If the number of bins increases (so each bin is narrower), entropy increases because there are more distinguishable outcomes (illustrated in the first sketch after this list).
  • Differential entropy (see the formula after this list)
  • Transforming continuous variables
    • Changing the range of a discrete set of variables doesn't change the entropy
      • Example: a binary die with faces 0 and 1 has only two possible outputs, even though infinitely many values lie between 0 and 1, so rescaling those outputs doesn't change the entropy
    • Changing the range of a continuous variable does, because the entropy is based on bin-width
      • Doubling the range -> doubles the number of fixed-width bins -> adds one bit, even if half of the bins aren't used (see the scaling sketch after this list)
  • What about adding a constant? 
    • As the term "constant" implies, it doesn't change the entropy: the distribution is shifted, but its range (and variance) stays the same.
  • Maximum entropy distributions
    • What distribution can we engineer so that entropy is highest? 
      • Fixed upper/lower bounds => uniform 
      • Fixed mean, with all values >= 0 => exponential 
      • Fixed variance (e.g., fixed power) => Gaussian (the three cases are summarized after this list)
  • Back to differential entropy
    • Measured infinitely accurately, a continuous variable would carry infinite information (infinitely many bins)
    • What in practice limits 'bin sizes'?
      • Noise!
      • As the amount of noise increases, it becomes harder to know what the actual signal value was
    • How does this noise-limited bin size connect to the transmission of information in language?
      • More noise => less precision, because noise widens the usable bins and so reduces their number (i.e., log(1/Δx) decreases as the noise-set bin width Δx grows); see the last sketch after this list
      • Zipf's law: Infrequent words => longer, lexicon is limited        
  • Bin size => how much information the signal can carry! 
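
A minimal sketch of the binning idea above, assuming NumPy and purely illustrative numbers (the signal, sample size, and bin counts are my own choices, not the book's):

```python
import numpy as np

def binned_entropy(samples, n_bins):
    """Entropy (in bits) of a continuous sample divided into n_bins equal-width bins."""
    counts, _ = np.histogram(samples, bins=n_bins)
    p = counts / counts.sum()
    p = p[p > 0]                       # empty bins contribute nothing (0 log 0 = 0)
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)           # an arbitrary continuous signal

for n_bins in (4, 8, 16, 32, 64):
    print(n_bins, round(binned_entropy(x, n_bins), 3))
# Entropy grows as the bins get narrower; in the limit of
# infinitely small bins it would grow without bound.
```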
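The "differential entropy" bullet can be made concrete with the standard definition and its relation to binned entropy (standard notation with bin width Δx; not quoted from the chapter):

```latex
h(X) = -\int p(x)\,\log_2 p(x)\,dx,
\qquad
H_{\text{binned}}(X) \;\approx\; h(X) + \log_2 \frac{1}{\Delta x}
```

Halving the bin width adds one bit, and as Δx → 0 the binned entropy diverges, which is why differential entropy is defined without the bin term.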
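A second sketch (same hedges: NumPy, made-up numbers, and a fixed bin width rather than a fixed bin count) showing that doubling the range adds about one bit while adding a constant changes nothing:

```python
import numpy as np

def fixed_width_entropy(samples, bin_width):
    """Entropy (in bits) using bins of fixed width, so the bin count grows with the range."""
    edges = np.arange(samples.min(), samples.max() + bin_width, bin_width)
    counts, _ = np.histogram(samples, bins=edges)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200_000)              # arbitrary continuous signal

print(round(fixed_width_entropy(x,       0.01), 2))  # ~6.64 bits (about 100 bins)
print(round(fixed_width_entropy(2 * x,   0.01), 2))  # ~7.64 bits: doubled range, one extra bit
print(round(fixed_width_entropy(x + 5.0, 0.01), 2))  # ~6.64 bits: shifting adds nothing
```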
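For reference, the three maximum-entropy cases in the list above, with their standard densities and differential entropies in bits (a standard summary, not copied from the book):

```latex
\begin{aligned}
&\text{Fixed bounds } [a,b]: && p(x)=\tfrac{1}{b-a}\ \ \text{(uniform)}, && h(X)=\log_2(b-a)\\
&\text{Fixed mean } \mu,\; x \ge 0: && p(x)=\tfrac{1}{\mu}\,e^{-x/\mu}\ \ \text{(exponential)}, && h(X)=\log_2(e\mu)\\
&\text{Fixed variance } \sigma^2: && p(x)=\tfrac{1}{\sqrt{2\pi\sigma^2}}\,e^{-x^2/(2\sigma^2)}\ \ \text{(Gaussian)}, && h(X)=\tfrac{1}{2}\log_2(2\pi e\sigma^2)
\end{aligned}
```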
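Finally, a toy calculation (illustrative numbers only) of how noise sets the smallest usable bin and therefore the information per measurement:

```python
import math

# A signal confined to a fixed range, measured with additive noise whose
# typical size is `noise`. The noise sets the smallest usable bin width,
# so roughly range/noise levels can be told apart, i.e. log2(range/noise) bits.
signal_range = 1.0
for noise in (0.1, 0.05, 0.01):
    levels = signal_range / noise
    bits = math.log2(levels)
    print(f"noise={noise:<5} -> ~{levels:.0f} distinguishable levels, ~{bits:.2f} bits")
```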
