Perturbation theory: Difference between revisions

From formulasearchengine
Jump to navigation Jump to search
en>Mhilferink
mNo edit summary
Line 1: Line 1:
{{More footnotes|date=September 2011}}
My name is Dane and I am studying Environmental Management and Athletics and Physical Education at Juazeiro / Brazil.<br><br>Feel free to surf to my website :: [http://marbellaclassified.com/2014/08/21/fifa-15-coin-hack/ FIFA 15 Coin Hack]
 
In [[probability]] and [[statistics]], a '''Bernoulli process''' is a finite or infinite sequence of binary [[random variable]]s, so it is a [[discrete-time stochastic process]] that takes only two values, canonically 0 and&nbsp;1. The component '''Bernoulli variables''' ''X''<sub>''i''</sub> are identical and [[statistical independence|independent]].  Prosaically, a Bernoulli process is a repeated [[coin flipping]], possibly with an unfair coin (but with consistent unfairness).  Every variable ''X''<sub>''i''</sub> in the sequence is associated with a [[Bernoulli trial]] or experiment. They all have the same [[Bernoulli distribution]]. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes (such as the process for a six-sided die); this generalization is known as the [[Bernoulli scheme]].
 
The problem of determining the process, given only a limited sample of the Bernoulli trials, may be called the problem of [[checking whether a coin is fair]].
 
==Definition==
A '''Bernoulli process''' is a finite or infinite sequence of [[statistical independence|independent]] [[random variable]]s ''X''<sub>1</sub>,&nbsp;''X''<sub>2</sub>,&nbsp;''X''<sub>3</sub>,&nbsp;..., such that
 
* For each ''i'', the value of ''X''<sub>''i''</sub> is either 0 or&nbsp;1;
* For all values of ''i'', the probability that ''X''<sub>''i''</sub>&nbsp;=&nbsp;1 is the same number&nbsp;''p''.
 
In other words, a Bernoulli process is a sequence of [[independent identically distributed]] [[Bernoulli trial]]s.
 
Independence of the trials implies that the process is memoryless. Given that the probability ''p'' is known, past outcomes provide no information about future outcomes. (If ''p'' is unknown, however, the past informs about the future indirectly, through inferences about&nbsp;''p''.)
 
If the process is infinite, then from any point the future trials constitute a Bernoulli process identical to the whole process, the fresh-start property.
 
=== Interpretation ===
 
The two possible values of each ''X''<sub>''i''</sub> are often called "success" and "failure". Thus, when expressed as a number 0 or 1, the outcome may be called the number of successes on the ''i''th "trial".
 
Two other common interpretations of the values are true or false and yes or no. Under any interpretation of the two values, the individual variables ''X''<sub>''i''</sub> may be called [[Bernoulli trial]]s with parameter p.
 
In many applications time passes between trials, as the index i increases. In effect, the trials ''X''<sub>1</sub>,&nbsp;''X''<sub>2</sub>,&nbsp;...&nbsp;''X''<sub>i</sub>,&nbsp;... happen at "points in time" 1,&nbsp;2,&nbsp;...,&nbsp;''i'',&nbsp;.... That passage of time and the associated notions of "past" and "future" are not necessary, however. Most generally, any ''X''<sub>i</sub> and ''X''<sub>''j''</sub> in the process are simply two from a set of random variables indexed by {1,&nbsp;2,&nbsp;...,&nbsp;''n''} or by {1,&nbsp;2,&nbsp;3,&nbsp;...}, the finite and infinite cases.
 
Several random variables and probability distributions beside the Bernoullis may be derived from the Bernoulli process:
 
*The number of successes in the first ''n'' trials, which has a [[binomial distribution]] B(''n'',&nbsp;''p'')
*The number of trials needed to get ''r'' successes, which has a [[negative binomial distribution]] NB(''r'',&nbsp;''p'')
*The number of trials needed to get one success, which has a [[geometric distribution]] NB(1,&nbsp;''p''), a special case of the negative binomial distribution
 
The negative binomial variables may be interpreted as random [[Negative binomial distribution#Waiting time in a Bernoulli process|waiting times]].
 
==Formal definition==
The Bernoulli process can be formalized in the language of [[probability space]]s as a random sequence  of independent realisations of a random variable that can take values of heads or tails. The state space for an individual value is denoted by <math>2=\{H,T\} .</math>
 
Specifically, one considers the [[countably infinite]] [[direct product]] of copies of <math>2=\{H,T\}</math>.  It is common to examine either the one-sided set <math>\Omega=2^\mathbb{N}=\{H,T\}^\mathbb{N}</math> or the two-sided set <math>\Omega=2^\mathbb{Z}</math>. There is a natural [[topology]] on this space, called the [[product topology]]. The sets in this topology are finite sequences of coin flips, that is, finite-length [[string (computer science)|strings]] of ''H'' and ''T'', with the rest of (infinitely long) sequence taken as "don't care".  These sets of finite sequences are referred to as [[cylinder set]]s in the product topology.  The set of all such strings form a [[sigma algebra]], specifically, a [[Borel algebra]].  This algebra is then commonly written as <math>(\Omega, \mathcal{F})</math> where the elements of <math>\mathcal{F}</math> are the finite-length sequences of coin flips (the cylinder sets). 
 
If the chances of flipping heads or tails are given by the probabilities <math>\{p,1-p\}</math>, then one can define a natural [[measure (mathematics)|measure]] on the product space, given by <math>P=\{p, 1-p\}^\mathbb{N}</math> (or by <math>P=\{p, 1-p\}^\mathbb{Z}</math> for the two-sided process).  Given a cylinder set, that is, a specific sequence of coin flip results <math>[\omega_1, \omega_2,\cdots\omega_n]</math> at times <math>1,2,\cdots,n</math>, the probability of observing this particular sequence is given by
:<math>P([\omega_1, \omega_2,\cdots ,\omega_n])= p^k (1-p)^{n-k}</math>
where ''k'' is the number of times that ''H'' appears in the sequence, and ''n''-''k'' is the number of times that ''T'' appears in the sequence. There are several different kinds of notations for the above; a common one is to write
:<math>P(X_1=\omega_1, X_2=\omega_2,\cdots, X_n=\omega_n)= p^k (1-p)^{n-k}</math>
where each <math>X_i</math> is a binary-valued [[random variable]]. It is common to write <math>x_i</math> for <math>\omega_i</math>.  This probability ''P'' is commonly called the [[Bernoulli measure]].<ref name=klenke>Achim Klenke, ''Probability Theory'', (2006) Springer-Verlag ISBN 978-1-848000-047-6 doi:10.1007/978-1-848000-048-3</ref>
 
Note that the probability of any specific, infinitely long sequence of coin flips is exactly zero; this is because <math>\lim_{n\to\infty}p^n=0</math>, for any <math>0\le p<1</math>. One{{who|date=April 2013}} says that any given infinite sequence has [[measure zero]].  Nevertheless, one can still say that some classes of infinite sequences of coin flips are far more likely than others, this is given by the [[asymptotic equipartition property]].
 
To conclude the formal definition, a Bernoulli process is then given by the probability triple <math>(\Omega, \mathcal{F}, P)</math>, as defined above.
 
== Binomial distribution==
:{{main| Central limit theorem|Binomial distribution}}
The [[law of large numbers]] states that, on average, the [[expectation value]] of flipping ''heads'' for any one coin flip is ''p''. That is, one writes
:<math>E[X_i=H]=p</math>
for any one given random variable <math>X_i</math> out of the infinite sequence of [[Bernoulli trial]]s that compose the Bernoulli process.
 
One is often interested in knowing how often one will observe ''H'' in a sequence of ''n'' coin flips.  This is given by simply counting:  Given ''n'' successive coin flips, that is, given the set of all possible [[string (computer science)|strings]] of length ''n'', the number ''N''(''k'',''n'') of such strings that contain ''k'' occurrences of ''H'' is given by the [[binomial coefficient]]
:<math>N(k,n) = {n \choose k}=\frac{n!}{k! (n-k)!}</math>
 
If the probability of flipping heads is given by ''p'', then the total probability of seeing a string of length ''n'' with ''k'' heads is
 
:<math>E[X_i=H\mbox{ k out of n times}]=
P(k,n)={n\choose k} p^k (1-p)^{n-k}</math>
This probability is known as the [[Binomial distribution]].
 
Of particular interest is the question of the value of ''P''(''k'',''n'') for very, very long sequences of coin flips, that is, for the limit <math>n\to\infty</math>.  In this case, one may make use of [[Stirling's approximation]] to the factorial, and write
:<math>n! = \sqrt{2\pi n} \;n^n e^{-n}
\left(1 + \mathcal{O}\left(\frac{1}{n}\right)\right)</math>
 
Inserting this into the expression for ''P''(''k'',''n''), one obtains the [[Gaussian distribution]]; this is the content of the [[central limit theorem]], and this is the simplest example thereof. 
 
The combination of the law of large numbers, together with the central limit theorem, leads to an interesting and perhaps surprising result: the [[asymptotic equipartition property]].  Put informally, one notes that, yes, over many coin flips, one will observe ''H'' exactly ''p'' fraction of the time, and that this corresponds exactly with the peak of the Gaussian.  The asymptotic equipartition property essentially states that this peak is infinitely sharp, with infinite fall-off on either side.  That is, given the set of all possible infinitely long strings of ''H'' and ''T'' occurring in the Bernoulli process, this set is partitioned into two: those strings that occur with probability 1, and those that occur with probability 0.  This partitioning is known as the [[Kolmogorov 0-1 law]].
 
The size of this set is interesting, also, and can be explicitly determined: the logarithm of it is exactly the [[information entropy|entropy]] of the Bernoulli process.  Once again, consider the set of all strings of length ''n''. The size of this set is <math>2^n</math>.  Of these, only a certain subset are likely; the size of this set is <math>2^{nH}</math> for <math>H\le 1</math>.  By using Stirling's approximation, putting it into the expression for ''P''(''k'',''n''), solving for the location and width of the peak, and finally taking <math>n\to\infty</math> one finds that
 
:<math>H=-p\log_2 p - (1-p)\log_2(1-p)</math>
 
This value is the [[Bernoulli entropy]] of a Bernoulli process. Here, ''H'' stands for entropy; do not confuse it with the same symbol ''H'' standing for ''heads''.
 
[[von Neumann]] posed a curious question about the Bernoulli process: is it ever possible that a given process is [[isomorphic]] to another, in the sense of the [[isomorphism of dynamical systems]]? The question  long defied analysis, but was finally and completely answered with the [[Ornstein isomorphism theorem]].  This breakthrough resulted in the understanding that the Bernoulli process is unique and [[universal property|universal]]; in a certain sense, it is the single most random process possible; nothing is 'more' random than the Bernoulli process (although one must be careful with this informal statement; certainly, systems that are [[mixing (mathematics)|mixing]] are, in a certain sense, 'stronger' than the Bernoulli process, which is merely ergodic but not mixing. However, such processes do not consist of independent random variables: indeed, many purely deterministic, non-random systems can be mixing).
 
==As a dynamical system==
:{{main|Bernoulli scheme}}
The Bernoulli process can also be understood to be a [[dynamical system]], specifically, a [[measure-preserving dynamical system]].  This arises because there is a natural translation symmetry on the (two-sided) product space <math>\Omega=2^\mathbb{Z}</math> given by the [[shift operator]]
:<math>TX_i = X_{i+1}</math>
 
The measure is translation-invariant; that is, given any cylinder set <math>\omega\in\Omega</math>, one has
:<math>P(T\omega)=P(\omega)</math>
and thus the [[Bernoulli measure]] is a [[Haar measure]].
 
The shift operator should be understood to be an operator acting on the sigma algebra <math>(\Omega, \mathcal{F})</math>, so that one has
:<math>T:\mathcal{F}\to\mathcal{F}</math>
In this guise, the shift operator is known as the [[transfer operator]] or the ''Ruelle-Frobenius-Perron operator''.  It is interesting to consider the [[eigenfunction]]s of this operator, and how they differ when restricted to different subspaces of <math>(\Omega, \mathcal{F})</math>.  When restricted to the standard topology of the real numbers, the eigenfunctions are curiously the [[Bernoulli polynomial]]s!<ref>Pierre Gaspard, "''r''-adic one-dimensional maps and the Euler summation formula", ''Journal of Physics A'', '''25''' (letter) L483-L485 (1992).</ref><ref>Dean J. Driebe, Fully Chaotic Maps and Broken Time Symmetry, (1999) Kluwer Academic Publishers, Dordrecht Netherlands ISBN 0-7923-5564-4</ref>  This coincidence of naming was presumably not known to Bernoulli.
 
==Bernoulli sequence==
The term '''Bernoulli sequence''' is often used informally to refer to a [[realization (probability)|realization]] of a Bernoulli process.
However, the term has an entirely different formal definition as given below.
 
Suppose a Bernoulli process formally defined as a single random variable (see preceding section). For every infinite sequence ''x'' of coin flips, there is a [[sequence]] of integers
 
:<math>\mathbb{Z}^x = \{n\in \mathbb{Z} : X_n(x) = 1 \} \, </math>
 
called the '''Bernoulli sequence'''{{Verify source|date=March 2010}} associated with the Bernoulli process. For example, if ''x'' represents a sequence of coin flips, then the associated Bernoulli sequence is the list of natural numbers or time-points for which the coin toss outcome is ''heads''.
 
So defined, a Bernoulli sequence <math>\mathbb{Z}^x</math> is also a random subset of the index set, the natural numbers <math>\mathbb{N}</math>.
 
[[Almost all]] Bernoulli sequences <math>\mathbb{Z}^x</math> are [[ergodic sequence]]s.{{Verify source|date=March 2010}}
 
==Randomness extraction==
{{Main|Randomness extractor}}
From any Bernoulli process one may derive a Bernoulli process with ''p''&nbsp;=&nbsp;1/2 by the [[von Neumann extractor]], the earliest [[randomness extractor]], which actually extracts uniform randomness.
 
=== Basic Von Neumann extractor ===
Represent the observed process as a sequence of zeroes and ones, or bits, and group that input stream in non-overlapping pairs of successive bits, such as (11)(00)(10)... . Then for each pair,
* if the bits are equal, discard;
* if the bits are not equal, output the first bit.
 
This table summarizes the computation.
{|
! input !! output
|-
| 00 || discard
|-
| 01 || 0
|-
| 10 || 1
|-
| 11 || discard
|}
For example, an input stream of eight bits '''10011011''' would by grouped into pairs as '''(10)(01)(10)(11)'''. Then, according to the table above, these pairs are translated into the output of the procedure:
'''(1)(0)(1)()''' (='''101''').
 
In the output stream 0 and 1 are equally likely, as 10 and 01 are equally likely in the original, both having probability ''pq''&nbsp;=&nbsp;''qp.'' This extraction of uniform randomness does not require the input trials to be independent, only [[uncorrelated]]. More generally, it works for any [[exchangeable random variables|exchangeable sequence]] of bits: all sequences that are finite rearrangements are equally likely.
 
The Von Neumann extractor uses two input bits to produce either zero or one output bits, so the output is shorter than the input by a factor of at least&nbsp;2. On average the computation discards proportion ''p''<sup>2</sup>&nbsp;+&nbsp;(1&nbsp;&minus;&nbsp;''p'')<sup>2</sup> of the input pairs, or proportion ''p''<sup>2</sup>&nbsp;+&nbsp;''q''<sup>2</sup>, which is near one when ''p'' is near zero or one.
 
The discard of input pairs is at least proportion 1/2, the minimum which occurs where ''p''&nbsp;=&nbsp;1/2 for the original process. In that case the output stream is 1/4 the length of the input on average.
 
=== Iterated Von Neumann extractor ===
{{Cite check|section|date=January 2014|talk=Iterated Von Neumann extractor}}
This decrease in efficiency, or waste of randomness present in the input stream, can be mitigated by iterating the algorithm over the input data. This way the output can be made to be "arbitrarily close to the entropy bound".<ref>{{cite journal|last=Peres|first=Yuval|title=Iterating Von Neumann's Procedure for Extracting Random Bits|journal=The Annals of Statistics|date=March 1992|volume=20|issue=1|pages=590–597|url=http://www.stat.berkeley.edu/~peres/mine/vn.pdf|accessdate=30 May 2013}}</ref> In short, the algorithm can be described as: For every pair (xy) of bits consumed from the input stream append ''one'' of them to the input again; it does not matter which bit from (xy) is appended, either bit ''x'' or ''y'' may be used. This way, for each iteration, ''two'' bits are first removed from the input and then ''one'' bit is added to the input again, so that, as a result, the input is only shortened by ''one'' bit per iteration as opposed to two bits in the original approach. The table from above thus changes with this approach:
 
{|
! input !! output !! appended
|-
| 00 || ''none'' || 0
|-
| 01 || 0 || 1
|-
| 10 || 1 || 0
|-
| 11 || ''none'' || 1
|}
(In this case, from each pair (xy) the ''y'' bit is appended, but the ''x'' bit could have been chosen alternatively without affecting the randomness of the output.)
 
Example: The input stream from above, '''10011011''', is processed this way:
{|
! step !! input !! output !! appended
|-
| 1 || '''{{uline|10}}11011''' || 1 || 0
|-
| 2 || '''{{uline|01}}1011'''0 || 0 || 1
|-
| 3 || '''{{uline|10}}11'''01 || 1 || 0
|-
| 4 || '''{{uline|11}}'''010 || ''none'' || 1
|-
| 5 || {{uline|01}}01 || 0 || 1
|-
| 6 || {{uline|01}}1 || 0 || 1
|-
| 7 || {{uline|11}} || ''none'' || 1
|-
| 8 || 1 || ''terminate'' || ''none''
|}
 
The output is therefore '''(1)(0)(1)()(0)(0)()''' (='''10100'''), so that from the eight bits of input five bits of output were generated, as opposed to three bits through the basic algorithm above.
 
==References==
{{reflist}}
 
==Further reading==
* Carl W. Helstrom, ''Probability and Stochastic Processes for Engineers'', (1984) Macmillan Publishing Company, New York ISBN 0-02-353560-1.
* Dimitri P. Bertsekas and John N. Tsitsiklis, ''Introduction to Probability'', (2002) Athena Scientific, Massachusetts ISBN 1-886529-40-X
 
==External links==
* [http://www.r-statistics.com/2011/11/diagram-for-a-bernoulli-process-using-r/ Using a binary tree diagram for describing a Bernoulli process]
 
{{Stochastic processes}}
 
[[Category:Stochastic processes]]

Revision as of 14:16, 26 February 2014

My name is Dane and I am studying Environmental Management and Athletics and Physical Education at Juazeiro / Brazil.

Feel free to surf to my website :: FIFA 15 Coin Hack