{{Refimprove|date=May 2009}}
 
Beliefs depend on the available information. This idea is formalized in [[probability theory]] by '''conditioning'''. Conditional [[probability|probabilities]], conditional [[Expected value|expectations]] and conditional [[Probability distribution|distributions]] are treated on three levels: [[Discrete probability distribution|discrete probabilities]], [[probability density function]]s, and [[measure theory]]. Conditioning leads to a non-random result if the condition is completely specified; otherwise, if the condition is left random, the result of conditioning is also random.
 
This article concentrates on interrelations between various kinds of conditioning, shown mostly by examples. For systematic treatment (and corresponding literature) see more specialized articles mentioned below.
 
==Conditioning on the discrete level==
'''Example.''' A fair coin is tossed 10 times; the [[random variable]] ''X'' is the number of heads in these 10 tosses, and ''Y'' is the number of heads in the first 3 tosses. Although ''Y'' is determined before ''X'', it may happen that someone knows ''X'' but not ''Y''.
 
===Conditional probability===
{{Main|Conditional probability}}
Given that ''X'' = 1, the conditional probability of the event ''Y'' = 0 is {{nowrap begin}}P(''Y'' = 0 | ''X'' = 1) = P(''Y'' = 0, ''X'' = 1) / P(''X'' = 1) = 0.7.{{nowrap end}} More generally,
 
: <math> \mathbb{P} (Y=0|X=x) = \frac{ \binom 7 x }{ \binom{10} x } = \frac{ 7! (10-x)! }{ (7-x)! 10! } </math>
 
for ''x'' = 0, 1, 2, 3, 4, 5, 6, 7; otherwise (for ''x'' = 8, 9, 10), {{nowrap begin}}P ( ''Y'' = 0 | ''X'' = ''x'' ) = 0.{{nowrap end}} One may also treat the conditional probability as a random variable, a function of the random variable ''X'', namely,
 
: <math> \mathbb{P} (Y=0|X) = \begin{cases}
\binom 7 X / \binom{10}X &\text{for } X \le 7,\\
0 &\text{for } X > 7.
\end{cases} </math>
 
The [[expected value|expectation]] of this random variable is equal to the (unconditional) probability,
 
: <math> \mathbb{E} ( \mathbb{P} (Y=0|X) ) = \sum_x \mathbb{P} (Y=0|X=x) \mathbb{P} (X=x) = \mathbb{P} (Y=0), </math>
 
namely,
 
: <math>\sum_{x=0}^7 \frac{ \binom 7 x }{ \binom{10}x } \cdot \frac1{2^{10}} \binom{10}x = \frac 1 8 , </math>
 
which is an instance of the [[law of total probability]] {{nowrap begin}}E ( P ( ''A'' | ''X'' ) ) = P ( ''A'' ).{{nowrap end}}
 
Thus, {{nowrap begin}}P ( ''Y'' = 0 | ''X'' = 1 ){{nowrap end}} may be treated as the value of the random variable {{nowrap begin}}P ( ''Y'' = 0 | ''X'' ){{nowrap end}} corresponding to ''X'' = 1. <cite id="EP8">On the other hand, {{nowrap begin}}P(''Y'' = 0 | ''X'' = 1){{nowrap end}} is well-defined irrespective of other possible values of ''X''.</cite>
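
This can be checked by brute-force enumeration. The following minimal sketch (illustrative only; plain Python, no special libraries) enumerates all 2<sup>10</sup> equally likely outcomes and confirms both the formula for P ( ''Y'' = 0 | ''X'' = ''x'' ) and the law of total probability.

<syntaxhighlight lang="python">
from itertools import product
from math import comb

# All 2^10 equally likely sequences of 10 fair coin tosses (1 = heads).
outcomes = list(product([0, 1], repeat=10))

def p_y0_given_x(x):
    """P(Y = 0 | X = x) by counting; Y = number of heads in the first 3 tosses."""
    matching = [w for w in outcomes if sum(w) == x]
    return sum(1 for w in matching if sum(w[:3]) == 0) / len(matching)

# P(Y=0 | X=x) = C(7,x)/C(10,x) for x <= 7, and 0 otherwise.
for x in range(11):
    expected = comb(7, x) / comb(10, x) if x <= 7 else 0.0
    assert abs(p_y0_given_x(x) - expected) < 1e-12

# Law of total probability: E(P(Y=0 | X)) = P(Y=0) = 1/8.
total = sum(p_y0_given_x(x) * comb(10, x) / 2**10 for x in range(11))
assert abs(total - 1/8) < 1e-12
</syntaxhighlight>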
 
===Conditional expectation===
{{Main|Conditional expectation}}
Given that ''X'' = 1, the conditional expectation of the random variable ''Y'' is {{nowrap begin}}E ( ''Y'' | ''X'' = 1 ) = 0.3.{{nowrap end}} More generally,
 
: <math> \mathbb{E} (Y|X=x) = \frac3{10} x </math>
 
for ''x'' = 0, ..., 10. (In this example it appears to be a linear function, but in general it is nonlinear.) One may also treat the conditional expectation as a random variable, a function of the random variable ''X'', namely,
 
: <math> \mathbb{E} (Y|X) = \frac3{10} X. </math>
 
The expectation of this random variable is equal to the (unconditional) expectation of ''Y'',
 
: <math> \mathbb{E} ( \mathbb{E} (Y|X) ) = \sum_x \mathbb{E} (Y|X=x) \mathbb{P} (X=x) = \mathbb{E} (Y), </math>
 
namely,
 
: <math>\sum_{x=0}^{10} \tfrac{3}{10} x \cdot \tfrac1{2^{10}} \binom{10}x = \tfrac 3 2,</math>
 
or simply
 
:<math> \mathbb{E} \left( \tfrac{3}{10} X \right) = \tfrac{3}{10} \mathbb{E} (X) = \tfrac{3}{10} \cdot 5 = \tfrac32,</math>
 
which is an instance of the [[law of total expectation]] {{nowrap begin}}E ( E ( ''Y'' | ''X'' ) ) = E ( ''Y'' ).{{nowrap end}}
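
The same enumeration (again an illustrative sketch, not a derivation) confirms E ( ''Y'' | ''X'' = ''x'' ) = 0.3 ''x'' and the law of total expectation.

<syntaxhighlight lang="python">
from itertools import product

# All 2^10 equally likely sequences of 10 fair coin tosses (1 = heads).
outcomes = list(product([0, 1], repeat=10))

def e_y_given_x(x):
    """E(Y | X = x): average number of heads in the first 3 tosses, given x heads in all 10."""
    matching = [w for w in outcomes if sum(w) == x]
    return sum(sum(w[:3]) for w in matching) / len(matching)

for x in range(11):
    assert abs(e_y_given_x(x) - 0.3 * x) < 1e-9      # E(Y | X=x) = (3/10) x

# Law of total expectation: E(E(Y | X)) = E(Y) = 3/2.
e_y = sum(sum(w[:3]) for w in outcomes) / len(outcomes)
assert abs(e_y - 1.5) < 1e-9
</syntaxhighlight>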
 
The random variable {{nowrap begin}}E(''Y'' | ''X''){{nowrap end}} is the best predictor of ''Y'' given ''X''. That is, it minimizes the mean square error {{nowrap begin}}E ( (''Y'' - ''f''(''X''))<sup>2</sup> ){{nowrap end}} on the class of all random variables of the form ''f''(''X''). This class of random variables remains intact if ''X'' is replaced, say, with 2''X''. Thus, {{nowrap begin}}E ( ''Y'' | 2''X'' ) = E ( ''Y'' | ''X'' ).{{nowrap end}} This does not mean that {{nowrap begin}}E (''Y'' | 2''X'' ) = 0.3 × 2''X'';{{nowrap end}} rather, {{nowrap begin}}E ( ''Y'' | 2''X'' ) = 0.15 × 2''X'' = 0.3 ''X''.{{nowrap end}} In particular, {{nowrap begin}}E (''Y'' | 2''X''=2) = 0.3.{{nowrap end}} More generally, {{nowrap begin}}E (''Y'' | ''g''(''X'')) = E ( ''Y'' | ''X'' ){{nowrap end}} for every function ''g'' that is one-to-one on the set of all possible values of ''X''. The values of ''X'' are irrelevant; what matters is the partition (denote it α<sub>''X''</sub>)
: <math> \Omega = \{ X=x_1 \} \uplus \{ X=x_2 \} \uplus \dots </math>
of the sample space Ω into disjoint sets {''X'' = ''x<sub>n</sub>''}. (Here <math> x_1, x_2, \dots </math> are all possible values of ''X''.) Given an arbitrary partition α of Ω, one may define the random variable {{nowrap begin}}E ( ''Y'' | α ).{{nowrap end}} Still, {{nowrap begin}}E ( E ( ''Y'' | α)) = E ( ''Y'' ).{{nowrap end}}
 
Conditional probability may be treated as a special case of conditional expectation. Namely, {{nowrap begin}}P ( ''A'' | ''X'' ) = E ( ''Y'' | ''X'' ){{nowrap end}} if ''Y'' is the [[indicator function|indicator]] of ''A''. Therefore the conditional probability also depends on the partition α<sub>''X''</sub> generated by ''X'' rather than on ''X'' itself; {{nowrap begin}}P ( ''A'' | ''g''(''X'') ) = P (''A'' | ''X'') = P (''A'' | α),{{nowrap end}} {{nowrap begin}}α = α<sub>''X''</sub> = α<sub>''g''(''X'')</sub>.{{nowrap end}}
 
On the other hand, conditioning on an event ''B'' is well-defined, provided that {{nowrap begin}}P (''B'') ≠ 0,{{nowrap end}} irrespective of any partition that may contain ''B'' as one of several parts.
 
===Conditional distribution===
{{main|Conditional probability distribution}}
Given that ''X'' = ''x'', the conditional distribution of ''Y'' is
 
: <math> \mathbb{P} ( Y=y | X=x ) = \frac{ \binom 3 y \binom 7 {x-y} }{ \binom{10}x } = \frac{ \binom x y \binom{10-x}{3-y} }{ \binom{10}3 } </math>
 
for {{nowrap begin}}0 ≤ ''y'' ≤ min ( 3, ''x'' ).{{nowrap end}} It is the [[hypergeometric distribution]] {{nowrap begin}}H ( ''x''; 3, 7 ),{{nowrap end}} or equivalently, {{nowrap begin}}H ( 3; ''x'', 10-''x'' ).{{nowrap end}} The corresponding expectation 0.3 ''x'', obtained from the general formula
 
:<math> n \frac{R}{R+W} </math>
 
for {{nowrap begin}}H ( ''n''; ''R'', ''W'' ),{{nowrap end}} is nothing but the conditional expectation {{nowrap begin}}E (''Y'' | ''X'' = ''x'') = 0.3 ''x''.{{nowrap end}}
 
Treating {{nowrap begin}}H ( ''X''; 3, 7 ){{nowrap end}} as a random distribution (a random vector in the four-dimensional space of all measures on {0,1,2,3}), one may take its expectation, getting the unconditional distribution of ''Y'', the [[binomial distribution]] {{nowrap begin}}Bin ( 3, 0.5 ).{{nowrap end}} This fact amounts to the equality
 
: <math> \sum_{x=0}^{10} \mathbb{P} ( Y=y | X=x ) \mathbb{P} (X=x) = \mathbb{P} (Y=y) = \frac1{2^3} \binom 3 y </math>
 
for ''y'' = 0, 1, 2, 3, which is just the law of total probability.
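
This equality can be checked directly; the sketch below (illustrative) sums the hypergeometric conditional probabilities against the distribution of ''X'' and recovers the Bin ( 3, 0.5 ) probabilities.

<syntaxhighlight lang="python">
from math import comb

def p_y_given_x(y, x):
    """Hypergeometric conditional probability P(Y = y | X = x)."""
    if y < 0 or y > 3 or y > x or x - y > 7:
        return 0.0
    return comb(3, y) * comb(7, x - y) / comb(10, x)

for y in range(4):
    mixture = sum(p_y_given_x(y, x) * comb(10, x) / 2**10 for x in range(11))
    assert abs(mixture - comb(3, y) / 2**3) < 1e-12   # Bin(3, 0.5) probabilities
</syntaxhighlight>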
 
==Conditioning on the level of densities==
{{main|Probability density function|Conditional probability distribution}}
'''Example.''' A point of the sphere ''x''<sup>2</sup> + ''y''<sup>2</sup> + ''z''<sup>2</sup> = 1 is chosen at random according to the uniform distribution on the sphere.<ref>[[n-sphere#Generating points on the surface of the n-ball]]</ref><ref>[http://en.wikibooks.org/wiki/Mathematica/Uniform_Spherical_Distribution wikibooks: Uniform Spherical Distribution]</ref> The random variables ''X'', ''Y'', ''Z'' are the coordinates of the random point. The joint density of ''X'', ''Y'', ''Z'' does not exist (since the sphere is of zero volume), but the joint density ''f''<sub>''X'',''Y''</sub> of ''X'', ''Y'' exists,
 
: <math> f_{X,Y} (x,y) = \begin{cases}
  \frac1{2\pi\sqrt{1-x^2-y^2}} &\text{if } x^2+y^2<1,\\
  0 &\text{otherwise}.
\end{cases} </math>
 
(The density is non-constant because of a non-constant angle between the sphere and the plane.<ref>[[Area#General formula]]</ref>) The density of ''X'' may be calculated by integration,
 
: <math> f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x,y) \, \mathrm{d}y = \int_{-\sqrt{1-x^2}}^{+\sqrt{1-x^2}} \frac{ \mathrm{d}y }{ 2\pi\sqrt{1-x^2-y^2} } \, ; </math>
 
surprisingly, the result does not depend on ''x'' in (−1,1),
 
: <math> f_X(x) = \begin{cases}
0.5 &\text{for } -1<x<1,\\
0 &\text{otherwise},
\end{cases} </math>
 
which means that ''X'' is distributed uniformly on (−1,1). The same holds for ''Y'' and ''Z'' (and in fact, for {{nowrap begin}}''aX'' + ''bY'' + ''cZ''{{nowrap end}} whenever {{nowrap begin}}''a''<sup>2</sup> + ''b''<sup>2</sup> + ''c''<sup>2</sup> = 1).{{nowrap end}}
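
This somewhat surprising fact is easy to check by simulation. The sketch below (illustrative; it samples the sphere by normalizing standard Gaussian vectors, a standard method) estimates P ( ''X'' ≤ 0.5 ), which should be close to 0.75 if ''X'' is uniform on (−1, 1).

<syntaxhighlight lang="python">
import math
import random

random.seed(0)

def sphere_point():
    """A point uniformly distributed on the unit sphere (normalized Gaussian vector)."""
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:
            return tuple(c / r for c in v)

n = 200_000
count = sum(1 for _ in range(n) if sphere_point()[0] <= 0.5)
print(count / n)   # ≈ 0.75, consistent with X being uniform on (-1, 1)
</syntaxhighlight>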
 
===Conditional probability===
 
====Calculation====
Given that ''X'' = 0.5, the conditional probability of the event ''Y'' ≤ 0.75 is the integral of the conditional density,
: <math> f_{Y|X=0.5}(y) = \frac{ f_{X,Y}(0.5,y) }{ f_X(0.5) } = \begin{cases}
\frac{1}{ \pi \sqrt{0.75-y^2} } &\text{for } -\sqrt{0.75}<y<\sqrt{0.75},\\
0 &\text{otherwise}.
\end{cases} </math>
:<math>\mathbb{P} (Y \le 0.75|X=0.5) = \int_{-\infty}^{0.75} f_{Y|X=0.5}(y) \, \mathrm{d}y = \int_{-\sqrt{0.75}}^{0.75} \frac{\mathrm{d}y}{\pi \sqrt{0.75-y^2} } = \tfrac12 + \tfrac1{\pi} \arcsin \sqrt{0.75} = \tfrac56.</math>
More generally,
: <math> \mathbb{P} (Y \le y|X=x) = \tfrac12 + \tfrac1{\pi} \arcsin \frac{ y }{ \sqrt{1-x^2} } </math>
for all ''x'' and ''y'' such that −1 < ''x'' < 1 (otherwise the denominator ''f''<sub>''X''</sub>(''x'') vanishes) and <math>\textstyle -\sqrt{1-x^2} < y < \sqrt{1-x^2} </math> (otherwise the conditional probability degenerates to 0 or 1). One may also treat the conditional probability as a random variable, a function of the random variable ''X'', namely,
: <math> \mathbb{P} (Y \le y|X) = \begin{cases}
0 &\text{for } X^2 \ge 1-y^2 \text{ and } y<0,\\
\frac12 + \frac1{\pi} \arcsin \frac{ y }{ \sqrt{1-X^2} } &\text{for } X^2 < 1-y^2,\\
1 &\text{for } X^2 \ge 1-y^2 \text{ and } y>0.
\end{cases} </math>
The expectation of this random variable is equal to the (unconditional) probability,
: <cite id="DPC8"> <math> \mathbb{E} ( \mathbb{P} (Y\le y|X) ) = \int_{-\infty}^{+\infty} \mathbb{P} (Y\le y|X=x) f_X(x) \, \mathrm{d}x = \mathbb{P} (Y\le y), </math> </cite>
which is an instance of the [[law of total probability]] {{nowrap begin}}E ( P ( ''A'' | ''X'' ) ) = P ( ''A'' ).{{nowrap end}}
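
A crude numerical check of the computation above (illustrative sketch; the midpoint rule tolerates the integrable singularity at the left endpoint) reproduces the value P ( ''Y'' ≤ 0.75 | ''X'' = 0.5 ) = 5/6.

<syntaxhighlight lang="python">
import math

def f_cond(y):
    """Conditional density f_{Y|X=0.5}(y) = 1 / (pi * sqrt(0.75 - y^2))."""
    return 1.0 / (math.pi * math.sqrt(0.75 - y * y))

a, b, n = -math.sqrt(0.75), 0.75, 1_000_000
h = (b - a) / n
integral = sum(f_cond(a + (k + 0.5) * h) for k in range(n)) * h   # midpoint rule
print(integral, 5 / 6)   # both ≈ 0.8333
</syntaxhighlight>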
 
====Interpretation====
The conditional probability {{nowrap begin}}P ( ''Y'' ≤ 0.75 | ''X'' = 0.5 ){{nowrap end}} cannot be interpreted as {{nowrap begin}}P ( ''Y'' ≤ 0.75, ''X'' = 0.5 ) / P ( ''X'' = 0.5 ),{{nowrap end}} since the latter gives 0/0. Accordingly, {{nowrap begin}}P ( ''Y'' ≤ 0.75 | ''X'' = 0.5 ){{nowrap end}} cannot be interpreted via empirical frequencies, since the exact value ''X'' = 0.5 has no chance to appear at random, not even once during an infinite sequence of independent trials.
 
The conditional probability can be interpreted as a limit,
: <cite id="DPI5"> <math> \begin{align}
\mathbb{P} (Y\le0.75 | X=0.5) &= \lim_{\varepsilon\to0+} \mathbb{P} (Y\le0.75 | 0.5-\varepsilon<X<0.5+\varepsilon) \\
& = \lim_{\varepsilon\to0+} \frac{ \mathbb{P} (Y\le0.75, 0.5-\varepsilon<X<0.5+\varepsilon) }{ \mathbb{P} (0.5-\varepsilon<X<0.5+\varepsilon) } \\
& = \lim_{\varepsilon\to0+} \frac{ \int_{0.5-\varepsilon}^{0.5+\varepsilon} \mathrm{d}x \int_{-\infty}^{0.75} \mathrm{d}y \, f_{X,Y}(x,y) }{ \int_{0.5-\varepsilon}^{0.5+\varepsilon} \mathrm{d}x \, f_X(x)}.
\end{align} </math> </cite>
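
The limit can also be imitated by simulation: condition on a thin slab around ''X'' = 0.5 and count. In the sketch below (illustrative only) the window width and sample size are arbitrary choices.

<syntaxhighlight lang="python">
import math
import random

random.seed(1)

def sphere_point():
    """A point uniformly distributed on the unit sphere (normalized Gaussian vector)."""
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:
            return tuple(c / r for c in v)

eps, in_slab, in_slab_and_event = 0.05, 0, 0
for _ in range(1_000_000):
    x, y, _ = sphere_point()
    if 0.5 - eps < x < 0.5 + eps:
        in_slab += 1
        in_slab_and_event += (y <= 0.75)
print(in_slab_and_event / in_slab)   # ≈ 5/6 for small eps, up to Monte Carlo noise
</syntaxhighlight>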
 
===Conditional expectation===
The conditional expectation {{nowrap begin}}E ( ''Y'' | ''X'' = 0.5 ){{nowrap end}} is of little interest; it vanishes just by symmetry. It is more interesting to calculate {{nowrap begin}}E ( |''Z''| | ''X'' = 0.5 ){{nowrap end}} treating |''Z''| as a function of ''X'', ''Y'':
: <math> \begin{align}
|Z| &= h(X,Y) = \sqrt{1-X^2-Y^2}; \\
\mathbb{E} ( |Z| | X=0.5 ) &= \int_{-\infty}^{+\infty} h(0.5,y) f_{Y|X=0.5} (y) \, \mathrm{d} y \\
&= \int_{-\sqrt{0.75}}^{+\sqrt{0.75}} \sqrt{0.75-y^2} \cdot \frac{ \mathrm{d}y }{ \pi \sqrt{0.75-y^2} } \\
&= \frac2\pi \sqrt{0.75} .
\end{align} </math>
More generally,
: <math> \mathbb{E} ( |Z| | X=x ) = \frac2\pi \sqrt{1-x^2} </math>
for −1 < ''x'' < 1. One may also treat the conditional expectation as a random variable, a function of the random variable ''X'', namely,
: <math> \mathbb{E} ( |Z| | X ) = \frac2\pi \sqrt{1-X^2}. </math>
The expectation of this random variable is equal to the (unconditional) expectation of |''Z''|,
: <math> \mathbb{E} ( \mathbb{E} ( |Z| | X ) ) = \int_{-\infty}^{+\infty} \mathbb{E} ( |Z| | X=x ) f_X(x) \, \mathrm{d}x = \mathbb{E} (|Z|), </math>
namely,
: <math> \int_{-1}^{+1} \frac2\pi \sqrt{1-x^2} \cdot \frac{ \mathrm{d}x }2 = \tfrac{1}{2}, </math>
which is an instance of the [[law of total expectation]] {{nowrap begin}}E ( E ( ''Y'' | ''X'' ) ) = E ( ''Y'' ).{{nowrap end}}
 
The random variable {{nowrap begin}}E(|''Z''| | ''X''){{nowrap end}} is the best predictor of |''Z''| given ''X''. That is, it minimizes the mean square error {{nowrap begin}}E ( (|''Z''| - ''f''(''X''))<sup>2</sup> ){{nowrap end}} on the class of all random variables of the form ''f''(''X''). Similarly to the discrete case, {{nowrap begin}}E ( |''Z''| | ''g''(''X'') ) = E ( |''Z''| | ''X'' ){{nowrap end}} for every measurable function ''g'' that is one-to-one on (-1,1).
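
As a sanity check (again an illustrative sketch), the integral of E ( |''Z''| | ''X'' = ''x'' ) against ''f''<sub>''X''</sub> may be compared with a direct Monte Carlo estimate of E ( |''Z''| ); both should be close to 1/2.

<syntaxhighlight lang="python">
import math
import random

# Left side: integrate E(|Z| | X=x) * f_X(x) = (2/pi) * sqrt(1-x^2) * 0.5 over (-1, 1).
n = 200_000
h = 2.0 / n
lhs = sum((2 / math.pi) * math.sqrt(max(0.0, 1 - x * x)) * 0.5 * h
          for x in (-1 + (k + 0.5) * h for k in range(n)))

# Right side: Monte Carlo estimate of E(|Z|) for a uniform point on the sphere.
random.seed(2)

def sphere_point():
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:
            return tuple(c / r for c in v)

rhs = sum(abs(sphere_point()[2]) for _ in range(200_000)) / 200_000
print(lhs, rhs)   # both ≈ 0.5
</syntaxhighlight>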
 
===Conditional distribution===
Given that ''X'' = ''x'', the conditional distribution of ''Y'', given by the density ''f''<sub>''Y''|''X''=''x''</sub>(''y''), is the (rescaled) [[arcsine distribution]]; its cumulative distribution function is
: <math> F_{Y|X=x} (y) = \mathbb{P} ( Y \le y | X = x ) = \frac12 + \frac1\pi \arcsin \frac{y}{\sqrt{1-x^2}} </math>
for all ''x'' and ''y'' such that ''x''<sup>2</sup> + ''y''<sup>2</sup> < 1. The corresponding expectation of ''h''(''x'',''Y'') is nothing but the conditional expectation {{nowrap begin}}E ( ''h''(''X'',''Y'') | ''X''=''x'' ).{{nowrap end}} The [[Mixture density|mixture]] of these conditional distributions, taken for all ''x'' (according to the distribution of ''X'') is the unconditional distribution of ''Y''. This fact amounts to the equalities
: <math> \begin{align}
& \int_{-\infty}^{+\infty} f_{Y|X=x} (y) f_X(x) \, \mathrm{d}x = f_Y(y), \\
& \int_{-\infty}^{+\infty} F_{Y|X=x} (y) f_X(x) \, \mathrm{d}x = F_Y(y),
\end{align} </math>
the latter being the instance of the law of total probability [[#DPC8|mentioned above]].
 
==What conditioning is not==
{{main|Borel–Kolmogorov paradox}}
On the discrete level conditioning is possible only if the condition is of nonzero probability (one cannot divide by zero). On the level of densities, conditioning on ''X'' = ''x'' is possible even though {{nowrap begin}}P ( ''X'' = ''x'' ) = 0.{{nowrap end}} This success may create the illusion that conditioning is ''always'' possible. Unfortunately, it is not, for several reasons presented below.
 
===Geometric intuition: caution===
The result {{nowrap begin}}P ( ''Y'' ≤ 0.75 | ''X'' = 0.5 ) = 5/6,{{nowrap end}} mentioned above, is geometrically evident in the following sense. The points (''x'',''y'',''z'') of the sphere ''x''<sup>2</sup> + ''y''<sup>2</sup> + ''z''<sup>2</sup> = 1, satisfying the condition ''x'' = 0.5, are a circle ''y''<sup>2</sup> + ''z''<sup>2</sup> = 0.75 of radius <math> \sqrt{0.75} </math> on the plane ''x'' = 0.5. The inequality ''y'' ≤ 0.75 holds on an arc. The length of the arc is 5/6 of the length of the circle, which is why the conditional probability is equal to 5/6.
 
This successful geometric explanation may create the illusion that the following question is trivial.
 
: A point of a given sphere is chosen at random (uniformly). Given that the point lies on a given plane, what is its conditional distribution?
 
It may seem evident that the conditional distribution must be uniform on the given circle (the intersection of the given sphere and the given plane). Sometimes it really is, but in general it is not. Indeed, ''Z'' is distributed uniformly on (-1,+1) and independent of the ratio ''Y''/''X'', thus, {{nowrap begin}}P ( ''Z'' ≤ 0.5 | ''Y''/''X'' ) = 0.75.{{nowrap end}} On the other hand, the inequality ''z'' ≤ 0.5 holds on an arc of the circle {{nowrap begin}}''x''<sup>2</sup> + ''y''<sup>2</sup> + ''z''<sup>2</sup> = 1,{{nowrap end}} {{nowrap begin}}''y'' = ''cx''{{nowrap end}} (for any given ''c''). The length of the arc is 2/3 of the length of the circle. However, the conditional probability is 3/4, not 2/3. This is a manifestation of the classical Borel paradox.<ref>{{harvnb|Pollard|2002|loc=Sect. 5.5, Example 17 on page 122}}</ref><ref>{{harvnb|Durrett|1996|loc=Sect. 4.1(a), Example 1.6 on page 224}}</ref>
 
{{quote|Appeals to symmetry can be misleading if not formalized as invariance arguments.|Pollard<ref name="Pollard-5.5-122">{{harvnb|Pollard|2002|loc=Sect. 5.5, page 122}}</ref>}}
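
The independence of ''Z'' and the ratio ''Y''/''X'' can be seen numerically. The sketch below (illustrative) bins uniform points on the sphere by the angle of (''X'', ''Y'') and finds that the fraction with ''Z'' ≤ 0.5 stays close to 3/4 in every bin, rather than the arc-length value 2/3.

<syntaxhighlight lang="python">
import math
import random

random.seed(3)

def sphere_point():
    """A point uniformly distributed on the unit sphere (normalized Gaussian vector)."""
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(c * c for c in v))
        if r > 1e-12:
            return tuple(c / r for c in v)

bins = 36
counts, hits = [0] * bins, [0] * bins
for _ in range(500_000):
    x, y, z = sphere_point()
    k = int((math.atan2(y, x) % math.pi) / math.pi * bins) % bins   # bin determined by Y/X
    counts[k] += 1
    hits[k] += (z <= 0.5)
print([round(hits[k] / counts[k], 2) for k in range(0, bins, 6)])   # each entry ≈ 0.75
</syntaxhighlight>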
 
Another example. A [[Rotation matrix#Uniform random rotation matrices|random rotation]] of the three-dimensional space is a rotation by a random angle around a random axis. Geometric intuition suggests that the angle is independent of the axis and distributed uniformly. However, the latter is wrong; small values of the angle are less probable.
 
===The limiting procedure===
Given an event ''B'' of zero probability, the formula <math>\textstyle \mathbb{P} (A|B) = \mathbb{P} ( A \cap B ) / \mathbb{P} (B) </math> is useless; however, one can try <math>\textstyle \mathbb{P} (A|B) = \lim_{n\to\infty} \mathbb{P} ( A \cap B_n ) / \mathbb{P} (B_n) </math> for an appropriate sequence of events ''B''<sub>''n''</sub> of nonzero probability such that ''B''<sub>''n''</sub> ↓ ''B'' (that is, <math>\textstyle B_1 \supset B_2 \supset \dots </math> and <math>\textstyle B_1 \cap B_2 \cap \dots = B </math>). One example is given [[#DPI5|above]]. Two more examples are [[Wiener process#Related processes|Brownian bridge and Brownian excursion]].
 
In the latter two examples the law of total probability is irrelevant, since only a single event (the condition) is given. By contrast, in the example [[#DPI5|above]] the law of total probability [[#DPC8|applies]], since the event ''X'' = 0.5 is included in a family of events ''X'' = ''x'' where ''x'' runs over (−1,1), and these events are a partition of the probability space.
 
In order to avoid paradoxes (such as the [[Borel's paradox]]), the following important distinction should be taken into account. If a given event is of nonzero probability then conditioning on it is well-defined (irrespective of any other events), as was noted [[#EP8|above]]. By contrast, if the given event is of zero probability then conditioning on it is ill-defined unless some additional input is provided. Wrong choice of this additional input leads to wrong conditional probabilities (expectations, distributions). In this sense, "''the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible.''" ([[Andrey Kolmogorov|Kolmogorov]]; quoted in <ref name="Pollard-5.5-122"/>).
 
The additional input may be (a) a symmetry (invariance group); (b) a sequence of events ''B''<sub>''n''</sub> such that ''B''<sub>''n''</sub> ↓ ''B'', P ( ''B''<sub>''n''</sub> ) > 0; (c) a partition containing the given event. Measure-theoretic conditioning (below) investigates case (c) and discloses its relation to (b) in general, and to (a) when applicable.
 
Some events of zero probability are beyond the reach of conditioning. An example: let ''X''<sub>''n''</sub> be independent random variables distributed uniformly on (0,1), and ''B'' the event {{nowrap begin}}"''X''<sub>''n''</sub> → 0{{nowrap end}} as {{nowrap begin}}''n'' → ∞";{{nowrap end}} what about {{nowrap begin}}P ( ''X''<sub>''n''</sub> < 0.5 | ''B'' ) ?{{nowrap end}} Does it tend to 1, or not? Another example: let ''X'' be a random variable distributed uniformly on (0,1), and ''B'' the event "''X'' is a rational number"; what about {{nowrap begin}}P ( ''X'' = 1/''n'' | ''B'' ) ?{{nowrap end}} The only answer is that, once again, {{quote|the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible.|Kolmogorov, quoted in <ref name="Pollard-5.5-122"/>}}
 
==Conditioning on the level of measure theory==
{{main|Conditional expectation}}
'''Example.''' Let ''Y'' be a random variable distributed uniformly on (0,1), and ''X'' = ''f''(''Y'') where ''f'' is a given function. Two cases are treated below: ''f'' = ''f''<sub>1</sub> and ''f'' = ''f''<sub>2</sub>, where ''f''<sub>1</sub> is the continuous piecewise-linear function
: <math> f_1(y) = \begin{cases}
3y &\text{for } 0 \le y \le 1/3,\\
1.5(1-y) &\text{for } 1/3 \le y \le 2/3,\\
0.5 &\text{for } 2/3 \le y \le 1,
\end{cases} </math>
and ''f''<sub>2</sub> is the [[Weierstrass function]].
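
For concreteness, here is ''f''<sub>1</sub> in code, together with a quick Monte Carlo look at the distribution of ''X'' = ''f''<sub>1</sub>(''Y'') (an illustrative sketch; it only exhibits the atom of probability 1/3 at 0.5 that is used below).

<syntaxhighlight lang="python">
import random

def f1(y):
    """The piecewise-linear function f_1 defined above."""
    if y <= 1/3:
        return 3 * y
    if y <= 2/3:
        return 1.5 * (1 - y)
    return 0.5

random.seed(4)
xs = [f1(random.random()) for _ in range(200_000)]
print(sum(x < 0.5 for x in xs) / len(xs),    # ≈ 1/6
      sum(x == 0.5 for x in xs) / len(xs),   # ≈ 1/3  (the atom at 0.5)
      sum(x > 0.5 for x in xs) / len(xs))    # ≈ 1/2
</syntaxhighlight>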
 
===Geometric intuition: caution===
Given ''X'' = 0.75, two values of ''Y'' are possible, 0.25 and 0.5. It may seem evident that both values are of conditional probability 0.5 just because one point is [[Congruence (geometry)|congruent]] to another point. However, this is an illusion; see below.
 
===Conditional probability===
The conditional probability {{nowrap begin}}P ( ''Y'' ≤ 1/3 | ''X'' ){{nowrap end}} may be defined as the best predictor of the indicator
: <math> I = \begin{cases}
1 &\text{if } Y \le 1/3,\\
0 &\text{otherwise},
\end{cases} </math>
given ''X''. That is, it minimizes the mean square error {{nowrap begin}}E ( (''I'' - ''g''(''X''))<sup>2</sup> ){{nowrap end}} on the class of all random variables of the form ''g'' (''X'').
 
In the case ''f'' = ''f''<sub>1</sub> the corresponding function ''g'' = ''g''<sub>1</sub> may be calculated explicitly,<ref group="details">
Proof:
 
:<math> \begin{align}
\mathbb{E} \bigl( ( I - g(X) )^2 \bigr) & = \int_0^{1/3} (1-g(3y))^2 \, \mathrm{d}y + \int_{1/3}^{2/3} g^2 (1.5(1-y)) \, \mathrm{d}y + \int_{2/3}^1 g^2 (0.5) \, \mathrm{d}y \\
& = \int_0^1 (1-g(x))^2 \frac{ \mathrm{d}x }{ 3 } + \int_{0.5}^1 g^2(x) \frac{ \mathrm{d} x }{ 1.5 } + \frac13 g^2(0.5) \\
& = \frac13 \int_0^{0.5} (1-g(x))^2 \, \mathrm{d}x + \frac13 g^2(0.5) + \frac13 \int_{0.5}^1 ( (1-g(x))^2 + 2g^2(x) ) \, \mathrm{d}x \, ;
\end{align} </math>
it remains to note that {{nowrap begin}}(1−''a'' )<sup>2</sup> + 2''a''<sup>2</sup>{{nowrap end}} is minimal at ''a'' = 1/3.</ref>
 
:<math> g_1(x) = \begin{cases}
1 &\text{for } 0 < x < 0.5,\\
0 &\text{for } x = 0.5,\\
1/3 &\text{for } 0.5 < x < 1.
\end{cases} </math>
 
Alternatively, the limiting procedure may be used,
: <math> g_1(x) = \lim_{\varepsilon\to0+} \mathbb{P} ( Y \le 1/3 | x-\varepsilon \le X \le x+\varepsilon ) \, , </math>
giving the same result.
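
The limiting procedure can be imitated by simulation. In the sketch below (illustrative; the window width and sample size are arbitrary), the estimates agree with ''g''<sub>1</sub> at a few sample points.

<syntaxhighlight lang="python">
import random

def f1(y):
    if y <= 1/3:
        return 3 * y
    if y <= 2/3:
        return 1.5 * (1 - y)
    return 0.5

def g1_hat(x, eps=1e-3, n=1_000_000, seed=5):
    """Monte Carlo estimate of P(Y <= 1/3 | x-eps <= X <= x+eps), where X = f1(Y)."""
    random.seed(seed)
    in_window = in_window_and_event = 0
    for _ in range(n):
        y = random.random()
        if abs(f1(y) - x) <= eps:
            in_window += 1
            in_window_and_event += (y <= 1/3)
    return in_window_and_event / in_window

print(g1_hat(0.25), g1_hat(0.5), g1_hat(0.75))   # ≈ 1, ≈ 0, ≈ 1/3, matching g_1
</syntaxhighlight>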
 
Thus, {{nowrap begin}}P ( ''Y'' ≤ 1/3 | ''X'' ) = ''g''<sub>1</sub> (''X'').{{nowrap end}} The expectation of this random variable is equal to the (unconditional) probability, {{nowrap begin}}E ( P ( ''Y'' ≤ 1/3 | ''X'' ) ) = P ( ''Y'' ≤ 1/3 ),{{nowrap end}} namely,
: <math> 1 \cdot \mathbb{P} (X<0.5) + 0 \cdot \mathbb{P} (X=0.5) + \frac13 \cdot \mathbb{P} (X>0.5) = 1 \cdot \frac16 + 0 \cdot \frac13 + \frac13 \cdot \left( \frac16 + \frac13 \right) = \frac13, </math>
which is an instance of the [[law of total probability]] {{nowrap begin}}E ( P ( ''A'' | ''X'' ) ) = P ( ''A'' ).{{nowrap end}}
 
In the case ''f'' = ''f''<sub>2</sub> the corresponding function ''g'' = ''g''<sub>2</sub> probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically. Indeed, the [[Lp_space#Hilbert_spaces|space]] L<sub>2</sub> (Ω) of all square integrable random variables is a [[Hilbert space]]; the indicator ''I'' is a vector of this space; and random variables of the form ''g'' (''X'') are a (closed, linear) subspace. The [[Hilbert_space#Orthogonal_complements_and_projections|orthogonal projection]] of this vector to this subspace is well-defined. It can be computed numerically, using [[Galerkin method|finite-dimensional approximations]] to the infinite-dimensional Hilbert space.
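
To give a feel for such a computation, the sketch below projects the indicator onto piecewise-constant functions of ''X'' (a crude finite-dimensional subspace) using simulated data. It is illustrative only: the parameters of the Weierstrass-type function are an arbitrary choice, and the binning is the simplest possible approximation.

<syntaxhighlight lang="python">
import math
import random

def f2(y, a=0.5, b=13, terms=20):
    """A Weierstrass-type function; the parameters are an illustrative choice."""
    return sum(a**n * math.cos(b**n * math.pi * y) for n in range(terms))

random.seed(6)
data = [(f2(y), 1.0 if y <= 1/3 else 0.0)
        for y in (random.random() for _ in range(200_000))]

# Projection onto piecewise-constant functions of X: within each bin of X-values,
# the best constant (in mean square) is the empirical conditional frequency.
lo = min(x for x, _ in data)
hi = max(x for x, _ in data)
bins = 200
counts, hits = [0] * bins, [0.0] * bins
for x, indicator in data:
    k = min(int((x - lo) / (hi - lo) * bins), bins - 1)
    counts[k] += 1
    hits[k] += indicator
g2_hat = [hits[k] / counts[k] if counts[k] else float('nan') for k in range(bins)]

# Averaging g2_hat against the bin frequencies recovers P(Y <= 1/3) = 1/3.
print(sum(hits) / len(data))   # ≈ 1/3
</syntaxhighlight>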
 
Once again, the expectation of the random variable {{nowrap begin}}P ( ''Y'' ≤ 1/3 | ''X'' ) = ''g''<sub>2</sub> (''X''){{nowrap end}} is equal to the (unconditional) probability, {{nowrap begin}}E ( P ( ''Y'' ≤ 1/3 | ''X'' ) ) = P ( ''Y'' ≤ 1/3 ),{{nowrap end}} namely,
: <math> \int_0^1 g_2 (f_2(y)) \, \mathrm{d}y = \tfrac13. </math>
 
However, the Hilbert space approach treats ''g''<sub>2</sub> as an equivalence class of functions rather than an individual function. Measurability of ''g''<sub>2</sub> is ensured, but continuity (or even [[Riemann integrability]]) is not. The value ''g''<sub>2</sub> (0.5) is determined uniquely, since the point 0.5 is an atom of the distribution of ''X''. Other values ''x'' are not atoms, thus, corresponding values ''g''<sub>2</sub> (''x'') are not determined uniquely. Once again, "''the concept of a conditional probability with regard to an isolated hypothesis whose probability equals 0 is inadmissible.''" ([[Andrey Kolmogorov|Kolmogorov]]; quoted in <ref name="Pollard-5.5-122"/>).
 
Alternatively, the same function ''g'' (be it ''g''<sub>1</sub> or ''g''<sub>2</sub>) may be defined as the [[Radon–Nikodym derivative]]
: <math> g = \frac{ \mathrm{d}\nu }{ \mathrm{d}\mu }, </math>
where measures μ, ν are defined by
: <math> \begin{align}
\mu (B) &= \mathbb{P} ( X \in B ), \\
\nu (B) &= \mathbb{P} ( X \in B, \, Y \le \tfrac{1}{3})
\end{align} </math>
for all Borel sets <math> B \subset \mathbb R. </math> That is, μ is the (unconditional) distribution of ''X'', while ν is one third of the conditional distribution of ''X'' given that ''Y'' ≤ 1/3,
: <math> \nu (B) = \mathbb{P} ( X \in B | Y \le \tfrac{1}{3} ) \mathbb{P} ( Y \le \tfrac{1}{3} ) = \tfrac13 \mathbb{P} ( X \in B | Y \le \tfrac{1}{3} ). </math>
 
Both approaches (via the Hilbert space, and via the Radon–Nikodym derivative) treat ''g'' as an equivalence class of functions; two functions ''g'' and ''g′'' are treated as equivalent, if ''g'' (''X'') = ''g′'' (''X'') almost surely. Accordingly, the conditional probability {{nowrap begin}}P ( ''Y'' ≤ 1/3 | ''X'' ){{nowrap end}} is treated as an equivalence class of random variables; as usual, two random variables are treated as equivalent if they are equal almost surely.
 
===Conditional expectation===
The conditional expectation {{nowrap begin}}E ( ''Y'' | ''X'' ){{nowrap end}} may be defined as the best predictor of ''Y'' given ''X''. That is, it minimizes the mean square error {{nowrap begin}}E ( (''Y'' - ''h''(''X''))<sup>2</sup> ){{nowrap end}} on the class of all random variables of the form ''h''(''X'').
 
In the case ''f'' = ''f''<sub>1</sub> the corresponding function ''h'' = ''h''<sub>1</sub> may be calculated explicitly,<ref group="details">
Proof:
 
:<math>\begin{align}
\mathbb{E} \bigl( ( Y - h_1(X) )^2 \bigr) &= \int_0^1 \left ( y - h_1 ( f_1(y) ) \right )^2 \, \mathrm{d}y \\
&= \int_0^{\frac{1}{3}} (y-h_1(3y))^2 \, \mathrm{d}y + \int_{\frac{1}{3}}^{\frac{2}{3}} \left( y - h_1( 1.5(1-y) ) \right)^2  \, \mathrm{d}y + \int_{\frac{2}{3}}^1 \Big( y - h_1(\tfrac{1}{2}) \Big)^2 \, \mathrm{d}y \\
&= \int_0^1 \left( \frac x 3 - h_1(x) \right)^2 \frac{ \mathrm{d}x }{3} + \int_{\frac{1}{2}}^1 \left ( 1 - \frac{x}{1.5} - h_1(x) \right)^2 \frac{ \mathrm{d} x }{ 1.5 } + \frac13 h_1^2(\tfrac{1}{2}) - \frac 5 9 h_1(\tfrac{1}{2}) + \frac{19}{81} \\
&= \frac13 \int_0^{\frac{1}{2}} \left( h_1(x) - \frac x 3 \right)^2 \, \mathrm{d}x + \tfrac13 h_1^2(\tfrac{1}{2}) - \tfrac{5}{9} h_1(\tfrac{1}{2}) + \tfrac{19}{81} + \tfrac13 \int_{\frac{1}{2}}^1 \bigg( \Big( h_1(x) - \frac x 3 \Big)^2 + 2 \Big( h_1(x) - 1 + \frac{2x}{3} \Big)^2 \bigg) \, \mathrm{d}x;
\end{align} </math>
 
it remains to note that
:<math>\left (a-\frac x 3 \right )^2 + 2 \left (a-1+\frac{2x}3 \right )^2 </math>
is minimal at <math>a = \frac{2-x}3, </math> and <math>\frac13 a^2 - \frac{5}{9} a </math> is minimal at <math>a = \tfrac 5 6. </math></ref>
: <math> h_1(x) = \begin{cases}
x/3 &\text{for } 0 < x < 0.5,\\
5/6 &\text{for } x = 0.5,\\
(2-x)/3 &\text{for } 0.5 < x < 1,
\end{cases} </math>
 
Alternatively, the limiting procedure may be used,
: <math> h_1(x) = \lim_{\varepsilon\to0+} \mathbb{E} ( Y | x-\varepsilon \le X \le x+\varepsilon ),</math>
giving the same result.
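
As with ''g''<sub>1</sub>, the limit can be imitated by simulation (an illustrative sketch; window width and sample size are arbitrary).

<syntaxhighlight lang="python">
import random

def f1(y):
    if y <= 1/3:
        return 3 * y
    if y <= 2/3:
        return 1.5 * (1 - y)
    return 0.5

def h1_hat(x, eps=1e-3, n=1_000_000, seed=7):
    """Monte Carlo estimate of E(Y | x-eps <= X <= x+eps), where X = f1(Y)."""
    random.seed(seed)
    ys = [y for y in (random.random() for _ in range(n)) if abs(f1(y) - x) <= eps]
    return sum(ys) / len(ys)

print(h1_hat(0.25), 0.25 / 3)          # both ≈ 0.083
print(h1_hat(0.75), (2 - 0.75) / 3)    # both ≈ 0.417
print(h1_hat(0.5), 5 / 6)              # both ≈ 0.83
</syntaxhighlight>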
 
Thus, {{nowrap begin}}E ( ''Y'' | ''X'' ) = ''h''<sub>1</sub> (''X'').{{nowrap end}} The expectation of this random variable is equal to the (unconditional) expectation, {{nowrap begin}}E ( E ( ''Y'' | ''X'' ) ) = E ( ''Y'' ),{{nowrap end}} namely,
: <math> \begin{align}
& \int_0^1 h_1(f_1(y)) \, \mathrm{d}y = \int_0^{1/6} \frac{3y}3 \, \mathrm{d}y + \\
& \quad + \int_{1/6}^{1/3} \frac{2-3y}3 \, \mathrm{d}y + \int_{1/3}^{2/3} \frac{ 2 - 1.5(1-y) }{ 3 } \, \mathrm{d}y + \int_{2/3}^1 \frac56 \, \mathrm{d}y = \frac12 \, ,
\end{align} </math>
which is an instance of the [[law of total expectation]] {{nowrap begin}}E ( E ( ''Y'' | ''X'' ) ) = E ( ''Y'' ).{{nowrap end}}
 
In the case ''f'' = ''f''<sub>2</sub> the corresponding function ''h'' = ''h''<sub>2</sub> probably cannot be calculated explicitly. Nevertheless it exists, and can be computed numerically in the same way as ''g''<sub>2</sub> above, as the orthogonal projection in the Hilbert space. The law of total expectation holds, since the projection does not change the scalar product with the constant 1, which belongs to the subspace.
 
Alternatively, the same function ''h'' (be it ''h''<sub>1</sub> or ''h''<sub>2</sub>) may be defined as the [[Radon–Nikodym derivative]]
: <math> h = \frac{ \mathrm{d}\nu }{ \mathrm{d}\mu } \, , </math>
where measures μ, ν are defined by
: <math> \begin{align}
\mu (B) &= \mathbb{P} ( X \in B ) \, , \\
\nu (B) &= \mathbb{E} ( Y; \, X \in B )
\end{align} </math>
for all Borel sets <math> B \subset \mathbb R.</math> Here {{nowrap begin}}E ( ''Y''; ''A'' ){{nowrap end}}  is the restricted expectation, not to be confused with the conditional expectation {{nowrap begin}}E ( ''Y'' | ''A'' ) = E (''Y''; ''A'' ) / P ( ''A'' ).{{nowrap end}}
 
===Conditional distribution===
{{main|Disintegration theorem|Regular conditional probability}}
In the case ''f'' = ''f''<sub>1</sub> the conditional [[cumulative distribution function]] may be calculated explicitly, similarly to ''g''<sub>1</sub>. The limiting procedure gives
: <math>F_{Y|X=\frac{3}{4}} (y) = \mathbb{P} \left ( Y \le y | X =\tfrac{3}{4} \right ) = \lim_{\varepsilon\to0^+} \mathbb{P} \left ( Y \le y | \tfrac{3}{4}-\varepsilon \le X \le \tfrac{3}{4}+\varepsilon \right ) = \begin{cases}
0 &\text{for } -\infty < y < \tfrac{1}{4},\\
\tfrac{1}{6} &\text{for } y = \tfrac{1}{4},\\
\tfrac{1}{3} &\text{for } \tfrac{1}{4} < y < \tfrac{1}{2},\\
\tfrac{2}{3} &\text{for } y = \tfrac{1}{2},\\
1 &\text{for } \tfrac{1}{2} < y < \infty,
\end{cases}</math>
which cannot be correct, since a cumulative distribution function must be [[right-continuous]]!
 
This paradoxical result is explained by measure theory as follows. For a given ''y'' the corresponding {{nowrap begin}}''F''<sub>''Y''|''X''=''x''</sub>(''y'') = P ( ''Y'' ≤ ''y'' | ''X'' = ''x'' ){{nowrap end}} is well-defined (via the Hilbert space or the Radon–Nikodym derivative) as an equivalence class of functions (of ''x''). Treated as a function of ''y'' for a given ''x'' it is ill-defined unless some additional input is provided. Namely, a function (of ''x'') must be chosen within every (or at least almost every) equivalence class. Wrong choice leads to wrong conditional cumulative distribution functions.
 
A right choice can be made as follows. First, {{nowrap begin}}''F''<sub>''Y''|''X''=''x''</sub>(''y'') = P (''Y'' ≤ ''y'' | ''X'' = ''x''){{nowrap end}} is considered for rational numbers ''y'' only. (Any other dense countable set may be used equally well.) Thus, only a countable set of equivalence classes is used; all choices of functions within these classes are mutually equivalent, and the corresponding function of rational ''y'' is well-defined (for almost every ''x''). Second, the function is extended from rational numbers to real numbers by right continuity.
 
In general the conditional distribution is defined for almost all ''x'' (according to the distribution of ''X''), but sometimes the result is continuous in ''x'', in which case individual values are acceptable. In the considered example this is the case; the correct result for ''x'' = 0.75,
: <math>F_{Y|X=\frac{3}{4}} (y) = \mathbb{P} \left  ( Y \le y | X = \tfrac{3}{4} \right ) = \begin{cases}
0 &\text{for } -\infty < y < \tfrac{1}{4},\\
\tfrac{1}{3} &\text{for } \tfrac{1}{4}\le y < \tfrac{1}{2},\\
1 &\text{for } \tfrac{1}{2} \le y < \infty
\end{cases}</math>
shows that the conditional distribution of ''Y'' given ''X'' = 0.75 consists of two atoms, at 0.25 and 0.5, of probabilities 1/3 and 2/3 respectively.
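
A window-based simulation (illustrative sketch) exhibits the same two atoms: among the values of ''Y'' whose image under ''f''<sub>1</sub> lands near 0.75, about one third lie near 0.25 and about two thirds near 0.5.

<syntaxhighlight lang="python">
import random

def f1(y):
    if y <= 1/3:
        return 3 * y
    if y <= 2/3:
        return 1.5 * (1 - y)
    return 0.5

random.seed(8)
eps, near_quarter, near_half, in_window = 1e-3, 0, 0, 0
for _ in range(2_000_000):
    y = random.random()
    if abs(f1(y) - 0.75) <= eps:
        in_window += 1
        near_quarter += abs(y - 0.25) < 0.01
        near_half += abs(y - 0.5) < 0.01
print(near_quarter / in_window, near_half / in_window)   # ≈ 1/3 and ≈ 2/3
</syntaxhighlight>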
 
Similarly, the conditional distribution may be calculated for all ''x'' in (0, 0.5) or (0.5, 1).
 
The value ''x'' = 0.5 is an atom of the distribution of ''X'', thus, the corresponding conditional distribution is well-defined and may be calculated by elementary means (the denominator does not vanish); the conditional distribution of ''Y'' given ''X'' = 0.5 is uniform on (2/3, 1). Measure theory leads to the same result.
 
The mixture of all conditional distributions is the (unconditional) distribution of ''Y''.
 
The conditional expectation {{nowrap begin}}E ( ''Y'' | ''X'' = ''x'' ){{nowrap end}} is nothing but the expectation with respect to the conditional distribution.
 
In the case ''f'' = ''f''<sub>2</sub> the corresponding {{nowrap begin}}''F''<sub>''Y''|''X''=''x''</sub>(y) = P(''Y'' ≤ ''y'' | ''X'' = ''x''){{nowrap end}} probably cannot be calculated explicitly. For a given ''y'' it is well-defined (via the Hilbert space or the Radon–Nikodym derivative) as an equivalence class of functions (of ''x''). The right choice of functions within these equivalence classes may be made as above; it leads to correct conditional cumulative distribution functions, thus, conditional distributions. In general, conditional distributions need not be [[discrete probability distribution|atomic]] or [[Absolutely continuous random variable|absolutely continuous]] (nor mixtures of both types). Probably, in the considered example they are [[Singular distribution|singular]] (like the [[Cantor distribution]]).
 
Once again, the mixture of all conditional distributions is the (unconditional) distribution, and the conditional expectation is the expectation with respect to the conditional distribution.
 
==Technical details==
<references group="details" />
 
==See also==
* [[Conditional probability]]
* [[Conditional expectation]]
* [[Conditional probability distribution]]
* [[Joint probability distribution]]
* [[Borel's paradox]]
* [[Regular conditional probability]]
* [[Disintegration theorem]]
* [[Law of total variance]]
* [[Law of total cumulance]]
 
==Notes==
<references />
 
==References==
*{{citation|last=Durrett|first=Richard|author-link=Rick Durrett|title=Probability: theory and examples|edition=Second|year=1996}}
*{{citation|last=Pollard|first=David|title=A user's guide to measure theoretic probability|year=2002|publisher=Cambridge University Press}}
 
[[Category:Probability theory]]
