Classical Information Theory #1
Why suddenly start learning classical information theory? Mainly to lay a bit of groundwork for quantum information. I probably won't go very deep.
Entropy
First of all, what is entropy? It is a measure of the uncertainty of a random variable.
The entropy of a discrete random variable X with probability mass function p(x) is defined by
H(X)=-\sum\limits_{x\in\mathcal{X}}p(x)\log p(x),
and it is always nonnegative: H(X)\geq 0.
For the uniform distribution: if X is uniform over \mathcal{X}, then H(X)=\log|\mathcal{X}|.
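As a quick sanity check of the definition and the uniform case, here is a minimal Python sketch; the `entropy` helper and the example distributions are made up for illustration.

```python
import math

def entropy(p, base=2):
    """Shannon entropy: -sum p(x) log p(x), skipping zero-probability outcomes."""
    return -sum(px * math.log(px, base) for px in p if px > 0)

# A biased distribution on three outcomes
print(entropy([0.5, 0.25, 0.25]))   # 1.5 bits

# Uniform over |X| = 8 outcomes: H(X) = log2(8) = 3 bits
print(entropy([1 / 8] * 8))         # 3.0
```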
If we need to specify the base of the logarithm, note that changing the base only rescales the entropy: H_b(X)=(\log_b a)\,H_a(X).
The logarithm is to the base 2 and the unit is bits. If the base of the logarithm is b, we denote the entropy by H_b(X). If b=e, the entropy is measured in nats.
Unless otherwise specified, the entropies will be measured in bits.
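A small numerical check of the change-of-base relation H_b(X)=(\log_b a)\,H_a(X), here converting nats to bits with a made-up distribution:

```python
import math

p = [0.7, 0.2, 0.1]                              # an arbitrary pmf
h_bits = -sum(px * math.log2(px) for px in p)    # base 2: bits
h_nats = -sum(px * math.log(px) for px in p)     # base e: nats

# H_2(X) = log_2(e) * H_e(X); both printed values agree
print(h_bits, math.log2(math.e) * h_nats)
```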
Example
X=1 with probability p
X=0 with probability 1-p
Then H(X)=-p\log p-(1-p)\log(1-p). This is the binary entropy function, often written H(p).
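This binary entropy function is easy to evaluate directly; a small sketch (the helper name `binary_entropy` is mine):

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))   # 1.0 bit: a fair coin is maximally uncertain
print(binary_entropy(0.1))   # ~0.47 bits: a heavily biased coin is more predictable
```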
We can also look at this from the viewpoint of expectation:
We denote expectation by E. If X\sim p(x), the expected value of the random variable g(X) is written E_{p}g(X)=\sum\limits_{x\in\mathcal{X}}g(x)p(x). For a discrete random variable X defined on \mathcal{X}, the entropy can then be written as H(X)=E_{p}\log\frac{1}{p(X)}.
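To see that the expectation form gives the same number as the defining sum, here is a sketch that evaluates both on a toy pmf (the distribution is made up):

```python
import math

p = {'a': 0.5, 'b': 0.3, 'c': 0.2}   # a toy pmf p(x)

# Direct definition: -sum_x p(x) log2 p(x)
h_direct = -sum(px * math.log2(px) for px in p.values())

# Expectation form: E_p[g(X)] with g(x) = log2(1 / p(x))
h_expect = sum(p[x] * math.log2(1 / p[x]) for x in p)

print(h_direct, h_expect)   # identical values
```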
Entropy is closely tied to probability theory; many of the deeper principles are really probability theory at heart.
Joint Entropy
The joint entropy H(X,Y) of a pair of discrete random variables (X,Y) with joint distribution p(x,y) is defined as
H(X,Y)=-\sum\limits_{x\in\mathcal{X}}\sum\limits_{y\in\mathcal{Y}}p(x,y)\log p(x,y).
Going further, we can extend the definition:
For a set of random variables X_1,\dots,X_n with joint distribution p(x_1,\dots,x_n), its joint entropy is defined as
H(X_1,\dots,X_n)=-\sum\limits_{x_1,\dots,x_n}p(x_1,\dots,x_n)\log p(x_1,\dots,x_n).
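A sketch of the two-variable case, with the joint pmf p(x,y) stored as a 2D table (the numbers are arbitrary but sum to 1):

```python
import math

# Joint pmf p(x, y): rows indexed by x, columns by y
p_xy = [
    [0.25, 0.25],
    [0.10, 0.40],
]

# H(X, Y) = -sum_{x,y} p(x, y) log2 p(x, y)
h_xy = -sum(p * math.log2(p) for row in p_xy for p in row if p > 0)
print(h_xy)   # about 1.86 bits
```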
Conditional Entropy
When X=x is known, p(Y|X=x) is still a probability distribution, so all of its probabilities naturally sum to 1. We can therefore define:
If (X,Y)\sim p(x,y), the conditional entropy H(Y|X) is defined as
H(Y|X)=\sum\limits_{x\in\mathcal{X}}p(x)H(Y|X=x)=-\sum\limits_{x\in\mathcal{X}}\sum\limits_{y\in\mathcal{Y}}p(x,y)\log p(y|x).
When X is known, H(Y|X) is the remaining uncertainty about Y; conditioning never increases entropy: H(Y|X)\leq H(Y).
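A minimal sketch, following the definition H(Y|X)=\sum_x p(x)H(Y|X=x) on the same kind of joint table as above, that also checks H(Y|X)\leq H(Y) numerically:

```python
import math

def h(dist):
    """Entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Joint pmf p(x, y): rows indexed by x, columns by y (arbitrary example)
p_xy = [
    [0.25, 0.25],
    [0.10, 0.40],
]

p_x = [sum(row) for row in p_xy]        # marginal p(x)
p_y = [sum(col) for col in zip(*p_xy)]  # marginal p(y)

# H(Y|X) = sum_x p(x) H(Y|X=x), where p(y|x) = p(x, y) / p(x)
h_y_given_x = sum(px * h([pxy / px for pxy in row]) for px, row in zip(p_x, p_xy))

print(h_y_given_x, h(p_y))   # H(Y|X) <= H(Y): conditioning cannot increase entropy
```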