Kullback's inequality

In information theory and statistics, Kullback's inequality is a lower bound on the Kullback–Leibler divergence expressed in terms of the large deviations rate function.[1] If $P$ and $Q$ are probability distributions on the real line such that $P$ is absolutely continuous with respect to $Q$, i.e. $P \ll Q$, and whose first moments exist, then

D_{KL} (P ‖ Q) \geq Ψ_{Q}^{*} (μ'_{1} (P)),

where $Ψ_{Q}^{*}$ is the rate function, i.e. the convex conjugate of the cumulant-generating function, of $Q$, and $μ'_{1} (P)$ is the first moment of $P$.

The Cramér–Rao bound is a corollary of this result.
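As an illustration of the inequality, consider the case where $Q$ is the standard normal distribution and $P$ is normal with mean $m$ and standard deviation $s$; both sides then have closed forms. The following is a minimal sketch under that assumption. The function names and the sample values of $m$ and $s$ are chosen here for illustration and do not come from the reference.

```python
import math

def kl_normal_vs_standard(m, s):
    """D_KL( N(m, s^2) || N(0, 1) ) in nats, from the closed form for two Gaussians."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

def rate_standard_normal(mu):
    """Rate function of Q = N(0, 1): the convex conjugate of Psi_Q(t) = t^2 / 2."""
    return 0.5 * mu * mu

# Illustrative (assumed) parameter values for P = N(m, s^2).
for m, s in [(0.0, 1.0), (1.0, 1.0), (1.0, 0.5), (-2.0, 3.0)]:
    kl = kl_normal_vs_standard(m, s)
    bound = rate_standard_normal(m)
    print(f"m={m:+.1f} s={s:.1f}  KL={kl:.4f}  bound={bound:.4f}  holds={kl >= bound - 1e-12}")
```

In this example the gap between the two sides is $(s^{2} - 1)/2 - \log s$, which is non-negative and vanishes exactly when $s = 1$, i.e. when $P$ belongs to the natural exponential family of $Q$.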

Proof

Let $P$ and $Q$ be probability distributions (measures) on the real line whose first moments exist and such that $P \ll Q$. Consider the natural exponential family of $Q$ given by

Q_{θ} (A) = \frac{\int_{A} e^{θ x} Q (dx)}{\int_{- \infty}^{\infty} e^{θ x} Q (dx)} = \frac{1}{M_{Q} (θ)} \int_{A} e^{θ x} Q (dx)

for every measurable set $A$, where $M_{Q}$ is the moment-generating function of $Q$. (Note that $Q_{0} = Q$.) Then

D_{KL} (P ‖ Q) = D_{KL} (P ‖ Q_{θ}) + \int_{\operatorname{supp} P} \left( \log \frac{d Q_{θ}}{d Q} \right) d P .

By Gibbs' inequality we have $D_{KL} (P ‖ Q_{θ}) \geq 0$, so that

D_{KL} (P ‖ Q) \geq \int_{\operatorname{supp} P} \left( \log \frac{d Q_{θ}}{d Q} \right) d P = \int_{\operatorname{supp} P} \left( \log \frac{e^{θ x}}{M_{Q} (θ)} \right) P (dx) .

Simplifying the right side, we have, for every real $θ$ where $M_{Q} (θ) < \infty$:

D_{KL} (P ‖ Q) \geq μ'_{1} (P) θ - Ψ_{Q} (θ),

where $μ'_{1} (P)$ is the first moment, or mean, of $P$, and $Ψ_{Q} = \log M_{Q}$ is called the cumulant-generating function. Taking the supremum over $θ$ completes the process of convex conjugation and yields the rate function:

D_{KL} (P ‖ Q) \geq \sup_{θ} \{ μ'_{1} (P) θ - Ψ_{Q} (θ) \} = Ψ_{Q}^{*} (μ'_{1} (P)) .
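The steps of the proof can be checked numerically on a small discrete example. The sketch below assumes two distributions on the support {0, 1, 2}, with probabilities chosen arbitrarily for illustration. It verifies the decomposition of the divergence through a tilted distribution $Q_{θ}$ and then evaluates the supremum by a ternary search, which is adequate here because the objective is concave in $θ$. All names and values are illustrative assumptions, not part of the reference.

```python
import math

xs = [0.0, 1.0, 2.0]   # common support (assumed for illustration)
q = [0.5, 0.3, 0.2]    # reference distribution Q
p = [0.1, 0.6, 0.3]    # distribution P, absolutely continuous w.r.t. Q

def kl(a, b):
    """Kullback-Leibler divergence D_KL(a || b) in nats for discrete distributions."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

def cgf(theta):
    """Cumulant-generating function Psi_Q(theta) = log M_Q(theta)."""
    return math.log(sum(qi * math.exp(theta * x) for qi, x in zip(q, xs)))

def tilt(theta):
    """Member Q_theta of the natural exponential family of Q."""
    z = math.exp(cgf(theta))
    return [qi * math.exp(theta * x) / z for qi, x in zip(q, xs)]

def rate(mu, lo=-50.0, hi=50.0, iters=200):
    """sup_theta { mu*theta - Psi_Q(theta) } by ternary search on a concave objective."""
    def objective(t):
        return mu * t - cgf(t)
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective(0.5 * (lo + hi))

mean_p = sum(pi * x for pi, x in zip(p, xs))

# Decomposition: D(P||Q) = D(P||Q_theta) + theta * mean(P) - Psi_Q(theta), for any theta.
theta = 0.7
lhs = kl(p, q)
rhs = kl(p, tilt(theta)) + theta * mean_p - cgf(theta)
print("decomposition:", lhs, rhs)

# Kullback's inequality: D(P||Q) >= Psi_Q^*(mean(P)).
print("KL:", lhs, "rate function at mean:", rate(mean_p))
```

The search interval for $θ$ is a hypothetical choice that is wide enough for this example; a mean of $P$ at the boundary of the support of $Q$ would make the supremum unattained and would need separate handling.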

Corollary: the Cramér–Rao bound

Start with Kullback's inequality

Let $X_{θ}$ be a family of probability distributions on the real line indexed by the real parameter $θ$ and satisfying certain regularity conditions. Applying Kullback's inequality with $P = X_{θ + h}$ and $Q = X_{θ}$, dividing by $h^{2}$, and letting $h \to 0$ gives

\lim_{h \to 0} \frac{D_{KL} (X_{θ + h} ‖ X_{θ})}{h^{2}} \geq \lim_{h \to 0} \frac{Ψ_{θ}^{*} (μ_{θ + h})}{h^{2}},

where $Ψ_{θ}^{*}$ is the convex conjugate of the cumulant-generating function of $X_{θ}$ and $μ_{θ + h}$ is the first moment of $X_{θ + h}$.

Left side

The left side of this inequality can be simplified as follows:

\lim_{h \to 0} \frac{D_{KL} (X_{θ + h} ‖ X_{θ})}{h^{2}} = \lim_{h \to 0} \frac{1}{h^{2}} \int_{- \infty}^{\infty} \left( \log \frac{d X_{θ + h}}{d X_{θ}} \right) d X_{θ + h}

= \lim_{h \to 0} \frac{1}{h^{2}} \int_{- \infty}^{\infty} [(1 - \frac{d X_{θ}}{d X_{θ + h}}) + \frac{1}{2} {(1 - \frac{d X_{θ}}{d X_{θ + h}})}^{2} + o ({(1 - \frac{d X_{θ}}{d X_{θ + h}})}^{2})] d X_{θ + h},

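where we have expanded the logarithm $\log x$ in a Taylor series in $1 - 1 / x$. Under the regularity conditions the first-order term integrates to zero, because $\int_{- \infty}^{\infty} (1 - \frac{d X_{θ}}{d X_{θ + h}}) d X_{θ + h} = 1 - 1 = 0$, and the remainder term is negligible in the limit, so the expression continues as

= \lim_{h \to 0} \frac{1}{h^{2}} \int_{- \infty}^{\infty} [\frac{1}{2} {(1 - \frac{d X_{θ}}{d X_{θ + h}})}^{2}] d X_{θ + h}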

= \lim_{h \to 0} \frac{1}{h^{2}} \int_{- \infty}^{\infty} [\frac{1}{2} {(\frac{d X_{θ + h} - d X_{θ}}{d X_{θ + h}})}^{2}] d X_{θ + h} = \frac{1}{2} ℐ_{X} (θ),

which is half the Fisher information of the parameter θ.
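The limit on the left side can be illustrated with a family for which the divergence has a closed form. The sketch below assumes the Gaussian family $X_{θ} = N (θ, θ^{2})$ with $θ > 0$, a choice made here purely for illustration. Its Fisher information is $3 / θ^{2}$, and the closed form for the divergence between two Gaussians reduces to $- \log (1 + r) + r + r^{2}$ with $r = h / θ$.

```python
import math

def kl_family(theta, h):
    """D_KL( N(theta+h, (theta+h)^2) || N(theta, theta^2) ) in nats (requires theta > 0, theta + h > 0)."""
    r = h / theta
    return -math.log1p(r) + r + r * r

def fisher_information(theta):
    """Fisher information of theta in the assumed family N(theta, theta^2)."""
    return 3.0 / theta ** 2

theta = 2.0  # illustrative parameter value
for h in (1e-1, 1e-2, 1e-3):
    ratio = kl_family(theta, h) / h ** 2
    print(f"h={h:g}  KL/h^2={ratio:.6f}  I/2={0.5 * fisher_information(theta):.6f}")
```

As $h$ shrinks, the ratio approaches half the Fisher information, in agreement with the derivation above.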

Right side

The right side of the inequality can be developed as follows:

\lim_{h \to 0} \frac{Ψ_{θ}^{*} (μ_{θ + h})}{h^{2}} = \lim_{h \to 0} \frac{1}{h^{2}} \sup_{t} \{ μ_{θ + h} t - Ψ_{θ} (t) \} .

This supremum is attained at the value $t = τ$ at which the first derivative of the cumulant-generating function satisfies $Ψ'_{θ} (τ) = μ_{θ + h}$. Since $Ψ'_{θ} (0) = μ_{θ}$, we have

Ψ''_{θ} (0) = \frac{d μ_{θ}}{d θ} \lim_{h \to 0} \frac{h}{τ} .

Moreover, since $Ψ_{θ}^{*}$ and its first derivative vanish at $μ_{θ}$ and its second derivative there is $1 / Ψ''_{θ} (0)$, a second-order expansion gives

\lim_{h \to 0} \frac{Ψ_{θ}^{*} (μ_{θ + h})}{h^{2}} = \frac{1}{2 Ψ''_{θ} (0)} {(\frac{d μ_{θ}}{d θ})}^{2} = \frac{1}{2 \operatorname{Var} (X_{θ})} {(\frac{d μ_{θ}}{d θ})}^{2} .
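The limit on the right side can be checked numerically in the same spirit. The sketch below assumes the Gaussian family $X_{θ} = N (θ, θ^{2})$, chosen here for illustration, whose cumulant-generating function is $Ψ_{θ} (t) = θ t + θ^{2} t^{2} / 2$. The supremum is computed by a ternary search, which relies on the objective being concave in $t$; the search interval and function names are illustrative assumptions.

```python
def cgf(theta, t):
    """Psi_theta(t) = theta*t + theta^2 t^2 / 2, cumulant-generating function of N(theta, theta^2)."""
    return theta * t + 0.5 * theta ** 2 * t ** 2

def rate(theta, mu, lo=-100.0, hi=100.0, iters=200):
    """sup_t { mu*t - Psi_theta(t) } by ternary search on a concave objective."""
    def objective(t):
        return mu * t - cgf(theta, t)
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective(0.5 * (lo + hi))

theta = 2.0                # illustrative parameter value
variance = theta ** 2      # Var(X_theta) = Psi''_theta(0)
dmu_dtheta = 1.0           # the mean of N(theta, theta^2) is theta
limit = dmu_dtheta ** 2 / (2.0 * variance)

for h in (1e-1, 1e-2, 1e-3):
    print(f"h={h:g}  rate/h^2={rate(theta, theta + h) / h ** 2:.6f}  limit={limit:.6f}")
```

For this family the rate function is exactly quadratic, $Ψ_{θ}^{*} (μ) = (μ - θ)^{2} / (2 θ^{2})$, so the ratio matches the limit for every $h$ up to rounding; for other families the agreement would hold only as $h \to 0$.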

Putting both sides back together

We have:

\frac{1}{2} ℐ_{X} (θ) \geq \frac{1}{2 \operatorname{Var} (X_{θ})} {(\frac{d μ_{θ}}{d θ})}^{2},

which can be rearranged as:

\operatorname{Var} (X_{θ}) \geq \frac{(d μ_{θ} / d θ)^{2}}{ℐ_{X} (θ)} .
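The final bound can be tight or slack depending on the family. The sketch below compares two standard cases using their well-known variances and Fisher informations: the Bernoulli family with success probability $θ$, where equality holds, and the Gaussian family $N (θ, θ^{2})$, where it does not. The helper name and the parameter values are illustrative assumptions.

```python
def cramer_rao_gap(variance, dmu_dtheta, fisher):
    """Return Var - (dmu/dtheta)^2 / I, which the bound says is non-negative."""
    return variance - dmu_dtheta ** 2 / fisher

# Bernoulli(theta): mean theta, variance theta(1 - theta), Fisher information 1 / (theta(1 - theta)).
theta = 0.3
bernoulli_gap = cramer_rao_gap(theta * (1 - theta), 1.0, 1.0 / (theta * (1 - theta)))

# N(theta, theta^2): mean theta, variance theta^2, Fisher information 3 / theta^2.
theta = 2.0
gaussian_gap = cramer_rao_gap(theta ** 2, 1.0, 3.0 / theta ** 2)

print("Bernoulli gap:", bernoulli_gap)
print("Gaussian gap:", gaussian_gap)
```

The Bernoulli gap is zero up to rounding, while the Gaussian gap equals $2 θ^{2} / 3$, so the bound holds with room to spare in the second case.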

Notes and references

[1] Aimé Fuchs and Giorgio Letta, L'inégalité de Kullback. Application à la théorie de l'estimation. Séminaire de probabilités (Strasbourg), vol. 4, pp. 108-131, 1970. http://www.numdam.org/item?id=SPS_1970__4__108_0
