Mason equation: Difference between revisions

en>Saehry
m Moved from "Thermodynamics" category to subcategory.
 
en>Colonies Chris
Equation: sp, date & link fixes; unlinking common words, replaced: Clausius-Clapeyron → Clausius–Clapeyron using AWB
 
{{Unreferenced|date=December 2006}}
'''[[Stein's example]]''' is an important result in [[decision theory]] which can be stated as
 
: ''The ordinary decision rule for estimating the mean of a multivariate Gaussian distribution is inadmissible under mean squared error risk in dimension at least 3''.
 
The following is an outline of its proof. The reader is referred to the [[Stein's example|main article]] for more information.
 
==Sketched proof==
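Let <math>\mathbf{X}</math> be an <math>n</math>-dimensional Gaussian random vector with unknown mean <math>\theta</math> and identity covariance matrix. The [[risk function]] of the decision rule <math>d(\mathbf{x}) = \mathbf{x}</math> is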
 
:<math>R(\theta,d) = \mathbb{E}_\theta[ |\mathbf{\theta - X}|^2]</math>
 
::<math>=\int (\mathbf{\theta - x})^T(\mathbf{\theta - x}) \left( \frac{1}{2\pi} \right)^{n/2} e^{(-1/2) (\mathbf{\theta - x})^T (\mathbf{\theta - x}) } m(dx)</math>
 
::<math> = n.\,</math>
 
Now consider the decision rule
 
:<math>d'(\mathbf{x}) = \mathbf{x} - \frac{\alpha}{|\mathbf{x}|^2}\mathbf{x}</math>
 
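where <math>\alpha</math> is a constant; the choice <math>\alpha = n-2</math> turns out to be optimal, as shown below. We will show that, with this choice, <math>d'</math> is a better decision rule than <math>d</math>. The risk function of <math>d'</math> is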
 
:<math>R(\theta,d') = \mathbb{E}_\theta\left[ \left|\mathbf{\theta - X} + \frac{\alpha}{|\mathbf{X}|^2}\mathbf{X}\right|^2\right]</math>
 
::<math> = \mathbb{E}_\theta\left[ |\mathbf{\theta - X}|^2 + 2(\mathbf{\theta - X})^T\frac{\alpha}{|\mathbf{X}|^2}\mathbf{X} + \frac{\alpha^2}{|\mathbf{X}|^4}|\mathbf{X}|^2 \right] </math>
 
::<math> = \mathbb{E}_\theta\left[ |\mathbf{\theta - X}|^2 \right] + 2\alpha\mathbb{E}_\theta\left[\frac{\mathbf{(\theta-X)^T X}}{|\mathbf{X}|^2}\right] + \alpha^2\mathbb{E}_\theta\left[\frac{1}{|\mathbf{X}|^2} \right]</math>
 
This is a quadratic in <math>\alpha</math>. We may simplify the middle term by considering a general "well-behaved" function <math>h:\mathbf{x} \mapsto h(\mathbf{x}) \in \mathbb{R} </math> and using [[integration by parts]]. For <math>1\leq i \leq n</math> and any continuously differentiable <math>h</math> growing sufficiently slowly for large <math>|x_i|</math>, we have:
 
::<math>\mathbb{E}_\theta [ (\theta_i - X_i) h(\mathbf{X}) | X_j=x_j (j\neq i) ]= \int (\theta_i - x_i) h(\mathbf{x}) \left( \frac{1}{2\pi} \right)^{1/2} e^{ -(1/2)(x_i-\theta_i)^2 } m(dx_i)</math>
 
::<math>= \left[ h(\mathbf{x}) \left( \frac{1}{2\pi} \right)^{1/2} e^{-(1/2) (x_i-\theta_i)^2 } \right]^\infty_{x_i=-\infty} 
- \int  \frac{\partial h}{\partial x_i}(\mathbf{x}) \left( \frac{1}{2\pi} \right)^{1/2} e^{-(1/2)(x_i-\theta_i)^2 } m(dx_i)</math>
 
::<math> = - \mathbb{E}_\theta \left[ \frac{\partial h}{\partial x_i}(\mathbf{X})  | X_j=x_j (j\neq i) \right].
</math>
 
Therefore, taking the expectation over the remaining coordinates <math>X_j</math> (<math>j \neq i</math>),
 
:<math>\mathbb{E}_\theta [ (\theta_i - X_i) h(\mathbf{X})]=  - \mathbb{E}_\theta \left[ \frac{\partial h}{\partial x_i}(\mathbf{X}) \right].</math>
 
(This result is known as [[Stein's lemma]].)
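
The identity can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the proof: the helper <code>stein_lemma_sides</code>, the test function <math>h(\mathbf{x}) = \sin(x_1)\cos(x_2)</math>, and the value of <math>\theta</math> are all illustrative assumptions, chosen only because this <math>h</math> is smooth and bounded.

```python
import math
import random

def stein_lemma_sides(theta, h, dh_dxi, i, n_samples=200_000, seed=0):
    """Monte Carlo estimates of E[(theta_i - X_i) h(X)] and -E[dh/dx_i(X)]."""
    rng = random.Random(seed)
    lhs_total = 0.0
    rhs_total = 0.0
    for _ in range(n_samples):
        # Draw X ~ N(theta, I_n): independent unit-variance coordinates.
        x = [rng.gauss(t, 1.0) for t in theta]
        lhs_total += (theta[i] - x[i]) * h(x)
        rhs_total -= dh_dxi(x)
    return lhs_total / n_samples, rhs_total / n_samples

# Hypothetical smooth, bounded test function (not the h used in the proof).
def h(x):
    return math.sin(x[0]) * math.cos(x[1])

# Its partial derivative with respect to the first coordinate.
def dh_dx0(x):
    return math.cos(x[0]) * math.cos(x[1])

lhs, rhs = stein_lemma_sides([1.0, -0.5, 2.0], h, dh_dx0, i=0)
print(f"E[(theta_0 - X_0) h(X)] ~ {lhs:.4f}")
print(f"-E[dh/dx_0 (X)]         ~ {rhs:.4f}")
```

The two printed estimates should agree up to Monte Carlo error, which shrinks like the inverse square root of <code>n_samples</code>.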
 
Now, we choose
 
:<math>
h(\mathbf{x})  =  \frac{x_i}{|\mathbf{x}|^2}.
</math>
 
If <math>h</math> met the "well-behaved" condition (it does not, but this can be remedied; see below), we could apply Stein's lemma with
 
:<math>\frac{\partial h}{\partial x_i} = \frac{1}{|\mathbf{x}|^2} - \frac{2 x_i^2}{|\mathbf{x}|^4} </math>
 
and so
 
::<math>
\mathbb{E}_\theta\left[\frac{\mathbf{(\theta-X)^T X}}{|\mathbf{X}|^2}\right] = \sum_{i=1}^n \mathbb{E}_\theta \left[ (\theta_i - X_i) \frac{X_i}{|\mathbf{X}|^2} \right]</math>
 
:<math> = - \sum_{i=1}^n \mathbb{E}_\theta \left[ \frac{1}{|\mathbf{X}|^2} - \frac{2 X_i^2}{|\mathbf{X}|^4} \right]</math>
 
:<math> = -(n-2)\mathbb{E}_\theta \left[\frac{1}{|\mathbf{X}|^2}\right].</math>
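
The pointwise identity behind the last step, <math>\textstyle\sum_{i} \partial_{x_i} \left( x_i / |\mathbf{x}|^2 \right) = (n-2)/|\mathbf{x}|^2</math>, can be checked with finite differences. This is a small sketch under stated assumptions: the function name <code>divergence_of_shrinkage_field</code> and the test point are hypothetical and serve only as an illustration.

```python
def divergence_of_shrinkage_field(x, step=1e-6):
    """Central-difference estimate of sum_i d/dx_i (x_i / |x|^2)."""
    total = 0.0
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += step
        minus[i] -= step
        # Evaluate h_i(x) = x_i / |x|^2 on either side of x along axis i.
        h_plus = plus[i] / sum(c * c for c in plus)
        h_minus = minus[i] / sum(c * c for c in minus)
        total += (h_plus - h_minus) / (2.0 * step)
    return total

x = [0.7, -1.2, 2.0, 0.3]  # arbitrary test point away from the origin
n = len(x)
closed_form = (n - 2) / sum(c * c for c in x)
numeric = divergence_of_shrinkage_field(x)
print(numeric, closed_form)
```

The check only makes sense away from <math>\mathbf{x} = 0</math>, where the function is singular.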
 
Then returning to the risk function of <math>d'</math> :
 
:<math>
R(\theta,d') =  n - 2\alpha(n-2)\mathbb{E}_\theta\left[\frac{1}{|\mathbf{X}|^2}\right] + \alpha^2\mathbb{E}_\theta\left[\frac{1}{|\mathbf{X}|^2} \right].
</math>
 
This quadratic in <math>\alpha</math> is minimized at
 
:<math>\alpha = n-2,\,</math>
 
giving
 
:<math>R(\theta,d') = R(\theta,d) - (n-2)^2\mathbb{E}_\theta\left[\frac{1}{|\mathbf{X}|^2} \right]</math>
 
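which, for <math>n \geq 3</math> (so that the expectation is finite and strictly positive), satisfies
 
:<math>
R(\theta,d') < R(\theta,d)
</math>
 
for every <math>\theta</math>, making <math>d</math> an inadmissible decision rule.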
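
The risk comparison can also be illustrated by simulation. The sketch below assumes identity covariance as in the proof; the function <code>estimate_risks</code> and the particular mean vector (with <math>n = 5</math>) are hypothetical choices made only for illustration.

```python
import random

def estimate_risks(theta, n_samples=100_000, seed=0):
    """Monte Carlo risks of d(x) = x and d'(x) = x - (n - 2) x / |x|^2."""
    rng = random.Random(seed)
    n = len(theta)
    alpha = n - 2
    risk_d = 0.0
    risk_d_prime = 0.0
    for _ in range(n_samples):
        # Draw X ~ N(theta, I_n).
        x = [rng.gauss(t, 1.0) for t in theta]
        # d'(x) rescales x by the factor 1 - alpha / |x|^2.
        shrink = 1.0 - alpha / sum(c * c for c in x)
        risk_d += sum((t - c) ** 2 for t, c in zip(theta, x))
        risk_d_prime += sum((t - shrink * c) ** 2 for t, c in zip(theta, x))
    return risk_d / n_samples, risk_d_prime / n_samples

theta = [1.0, 0.0, -1.0, 0.5, 2.0]  # arbitrary illustrative mean, n = 5
risk_d, risk_d_prime = estimate_risks(theta)
print(f"estimated risk of d : {risk_d:.3f}")
print(f"estimated risk of d': {risk_d_prime:.3f}")
```

The first estimate should be close to <math>n = 5</math>, and the second should be smaller, in line with the inequality above.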
 
It remains to justify the use of
:<math>
h(\mathbf{x})= \frac{x_i}{|\mathbf{x}|^2}.
</math>
 
This function is not continuously differentiable, since it is singular at <math>\mathbf{x}=0</math>. However, for any <math>\epsilon > 0</math> the function
 
:<math>
h(\mathbf{x}) = \frac{x_i}{\epsilon + |\mathbf{x}|^2}
</math>
 
is continuously differentiable, and after following the algebra through and letting <math>\epsilon \to 0</math> one obtains the same result.
 
{{DEFAULTSORT:Stein's Example, Proof Of}}
[[Category:Article proofs]]
[[Category:Decision theory]]
[[Category:Mathematical examples]]
[[Category:Statistical paradoxes]]

Latest revision as of 13:28, 9 March 2014
