{{Probability distribution |
  name =Yule–Simon|
  type =mass|
  pdf_image =[[File:Yule-Simon distribution PMF.svg|325px|Plot of the Yule–Simon PMF]]<br /><small>Yule–Simon PMF on a log-log scale. (Note that the function is only defined at integer values of k. The connecting lines do not indicate continuity.)</small>|
  cdf_image =[[File:Yule-Simon distribution CMF.svg|325px|Plot of the Yule–Simon CMF]]<br /><small>Yule–Simon CMF. (Note that the function is only defined at integer values of k. The connecting lines do not indicate continuity.)</small>|
  parameters =<math>\rho>0\,</math> shape ([[real number|real]])|
  support =<math>k \in \{1,2,\dots\}\,</math>|
  pdf =<math>\rho\,\mathrm{B}(k, \rho+1)\,</math>|
  cdf =<math>1 - k\,\mathrm{B}(k, \rho+1)\,</math>|
  mean =<math>\frac{\rho}{\rho-1}\,</math> for <math>\rho>1\,</math>|
  median =|
  mode =<math>1\,</math>|
  variance =<math>\frac{\rho^2}{(\rho-1)^2\;(\rho-2)}\,</math> for <math>\rho>2\,</math>|
  skewness =<math>\frac{(\rho+1)^2\;\sqrt{\rho-2}}{(\rho-3)\;\rho}\,</math> for <math>\rho>3\,</math>|
  kurtosis =<math>\rho+3+\frac{11\rho^3-49\rho-22} {(\rho-4)\;(\rho-3)\;\rho}\,</math> for <math>\rho>4\,</math>|
  entropy =|
  mgf =<math>\frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^t)\,e^t \,</math>|
  char =<math>\frac{\rho}{\rho+1}\;{}_2F_1(1,1; \rho+2; e^{i\,t})\,e^{i\,t} \,</math>|
}}
In [[probability]] and [[statistics]], the '''Yule–Simon distribution''' is a [[discrete probability distribution]] named after [[Udny Yule]] and [[Herbert A. Simon]]. Simon originally called it the '''''Yule distribution'''''.<ref name=SimonBiomet>{{cite journal
 | last = Simon
 | first = H. A.
 | title = On a class of skew distribution functions
 | journal = Biometrika
 | volume = 42
 | pages = 425–440
 | year = 1955
 | doi = 10.1093/biomet/42.3-4.425
 | issue = 3–4
}}</ref>

The [[probability mass function]] of the Yule–Simon (''ρ'') distribution is

:<math>f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1), \,</math>

for [[integer]] <math>k \geq 1</math> and [[real number|real]] <math>\rho > 0</math>, where <math>\mathrm{B}</math> is the [[beta function]]. Equivalently the pmf can be written in terms of the [[Pochhammer symbol|falling factorial]] as

:<math>
f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}}
,
\,</math>

where <math>\Gamma</math> is the [[gamma function]]. Thus, if <math>\rho</math> is an integer,

:<math>
f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}
.
\,</math>
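
The formulas above can be evaluated directly. The following Python sketch is an illustration only: the function names are hypothetical and do not come from any library. It computes the beta function through log-gamma values, which avoids overflow for large ''k'', and also evaluates the cumulative distribution function <math>1 - k\,\mathrm{B}(k, \rho+1)</math> and the factorial form for integer <math>\rho</math>.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def yule_simon_pmf(k: int, rho: float) -> float:
    """f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and real rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    return rho * math.exp(log_beta(k, rho + 1))


def yule_simon_cdf(k: int, rho: float) -> float:
    """P(K <= k) = 1 - k * B(k, rho + 1)."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    return 1.0 - k * math.exp(log_beta(k, rho + 1))


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form of the pmf, valid only when rho is a positive integer."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)
```

Since <math>\mathrm{B}(1, \rho+1) = 1/(\rho+1)</math>, the value at <math>k = 1</math> is <math>\rho/(\rho+1)</math>, which gives a quick sanity check for <code>yule_simon_pmf(1, rho)</code>.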

The parameter <math>\rho</math> can be estimated using a fixed-point algorithm.<ref name=JMGGarcia>{{cite journal
 | last = Garcia Garcia
 | first = Juan Manuel
 | title = A fixed-point algorithm to estimate the Yule-Simon distribution parameter
 | journal = Applied Mathematics and Computation
 | volume = 217
 | issue = 21
 | pages = 8560–8566
 | year = 2011
 | doi = 10.1016/j.amc.2011.03.092
}}</ref>
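
As a rough illustration of parameter estimation, the sketch below maximizes the log-likelihood numerically. It is '''not''' the fixed-point algorithm of the cited paper: it swaps in a plain golden-section search, and the function names, the search interval <code>[lo, hi]</code> and the tolerance are all assumptions made for this example. For integer data that are not all equal to 1, the log-likelihood <math>\sum_i \log f(k_i;\rho)</math> has a single maximum in <math>\rho</math>, which is what makes such a bracketing search applicable.

```python
import math
from collections import Counter


def log_likelihood(rho: float, counts: dict) -> float:
    """Log-likelihood of rho; counts maps each observed k to its frequency."""
    return sum(
        n * (math.log(rho) + math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1))
        for k, n in counts.items()
    )


def fit_rho(counts: dict, lo: float = 1e-6, hi: float = 1e3, tol: float = 1e-8) -> float:
    """Maximize the log-likelihood over [lo, hi] by golden-section search.

    The bracket [lo, hi] is an assumption of this sketch. If every observation
    equals 1 the likelihood keeps increasing and the result ends up near hi.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:  # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, counts)
        else:  # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, counts)
    return (a + b) / 2.0


def fit_rho_from_sample(sample) -> float:
    """Convenience wrapper: tabulate a list of observations, then fit."""
    return fit_rho(Counter(sample))
```

Working from tabulated counts rather than raw observations keeps each likelihood evaluation proportional to the number of distinct values of ''k'', which is usually small compared with the sample size.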

The probability mass function ''f'' has the property that for sufficiently large ''k'' we have

:<math>
f(k;\rho)
\approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}}
\propto \frac{1}{k^{\rho+1}}
.
\,</math>

This means that the tail of the Yule–Simon distribution is a realization of [[Zipf's law]]: <math>f(k;\rho)</math> can be used to model, for example, the relative frequency of the <math>k</math>th most frequent word in a large collection of text, which according to Zipf's law is [[inversely proportional]] to a (typically small) power of <math>k</math>.
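
The quality of the power-law approximation can be checked numerically. The sketch below (function names are hypothetical) computes the ratio of the exact pmf to the approximation <math>\rho\,\Gamma(\rho+1)/k^{\rho+1}</math>; the ratio should approach 1 as ''k'' grows.

```python
import math


def pmf(k: int, rho: float) -> float:
    """Exact Yule-Simon pmf, rho * B(k, rho + 1), via log-gamma."""
    return rho * math.exp(math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1))


def power_law_tail(k: int, rho: float) -> float:
    """Large-k approximation rho * Gamma(rho + 1) / k^(rho + 1)."""
    return rho * math.gamma(rho + 1) / k ** (rho + 1)


def tail_ratio(k: int, rho: float) -> float:
    """Exact pmf divided by its power-law approximation."""
    return pmf(k, rho) / power_law_tail(k, rho)


if __name__ == "__main__":
    # Print the ratio for increasing k; rho = 1.5 is an arbitrary example value.
    for k in (10, 100, 1000, 10000):
        print(k, tail_ratio(k, 1.5))
```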
==Occurrence==

The Yule–Simon distribution arose originally as the limiting distribution of a particular [[stochastic process]] studied by Yule as a model for the distribution of biological taxa and subtaxa.<ref name=YulePhilTrans>{{cite journal
 | last = Yule
 | first = G. U.
 | title = A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S
 | journal = [[Philosophical Transactions of the Royal Society B]]
 | volume = 213
 | pages = 21–87
 | year = 1925
 | doi = 10.1098/rstb.1925.0002
 | issue = 402–410
}}</ref> Simon dubbed this process the "Yule process", but it is more commonly known today as a [[preferential attachment]] process.{{citation needed|date=July 2012}} The preferential attachment process is an [[urn problem|urn process]] in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number of balls the urn already contains.
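
The urn process can be simulated in a few lines. The sketch below makes one specific modelling assumption that the text above does not spell out: each new ball opens a new urn with a fixed probability <code>alpha</code>, and otherwise joins an existing urn chosen with probability proportional to its current size. Under that assumption (the model usually attributed to Simon), the urn sizes are expected to approach a Yule–Simon distribution with <math>\rho = 1/(1-\alpha)</math>. The function name and parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_simon_urns(n_balls: int, alpha: float, seed: int = 0) -> Counter:
    """Return a histogram mapping urn size -> number of urns of that size.

    alpha is the (assumed) probability that a new ball starts a new urn.
    """
    rng = random.Random(seed)
    owner = [0]  # owner[i] is the urn holding ball i; start with one ball in urn 0
    n_urns = 1
    for _ in range(n_balls - 1):
        if rng.random() < alpha:
            owner.append(n_urns)  # open a new urn
            n_urns += 1
        else:
            # Copying the urn of a uniformly chosen existing ball selects
            # an urn with probability proportional to its size.
            owner.append(owner[rng.randrange(len(owner))])
    sizes = Counter(owner)          # urn -> number of balls
    return Counter(sizes.values())  # size -> number of urns
```

With <code>alpha = 0.5</code>, the fraction of urns holding exactly one ball should be close to <math>\rho/(\rho+1) = 2/3</math> for a long run, up to sampling noise.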

The distribution also arises as a [[compound distribution]], in which the parameter of a [[geometric distribution]] is treated as a function of a random variable having an [[exponential distribution]].{{citation needed|date=July 2012}} Specifically, assume that <math>W</math> follows an exponential distribution with [[scale parameter|scale]] <math>1/\rho</math> or rate <math>\rho</math>:

:<math>W \sim \mathrm{Exponential}(\rho)\,,</math>
with density
:<math>h(w;\rho) = \rho \, \exp(-\rho\,w)\, .</math>

Then a Yule–Simon distributed variable ''K'' has the following geometric distribution conditional on ''W'':

:<math>K \sim \mathrm{Geometric}(\exp(-W))\, .</math>

The pmf of a geometric distribution is

:<math>g(k; p) = p \, (1-p)^{k-1}\,</math>

for <math>k\in\{1,2,\dots\}</math>. The Yule–Simon pmf is then the following exponential-geometric compound distribution:

:<math>f(k;\rho)
= \int_0^{\infty} \,\,\, g(k;\exp(-w))\,h(w;\rho)\,dw
\, .</math>
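
The compound representation gives a simple way to draw Yule–Simon variates: sample ''W'' from the exponential distribution, then sample ''K'' from the geometric distribution with success probability <math>\exp(-W)</math>. The following sketch uses only the Python standard library; the function name is hypothetical, and the geometric draw uses inversion of its distribution function, which is one choice among several.

```python
import math
import random


def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw one Yule-Simon(rho) variate via the exponential-geometric compound."""
    w = rng.expovariate(rho)  # W ~ Exponential(rate = rho)
    p = math.exp(-w)          # success probability of the geometric distribution
    if p >= 1.0:
        return 1              # W rounded to zero: the first trial always succeeds
    if p == 0.0:
        # Only reachable for very small rho; this sketch does not handle it.
        raise OverflowError("exp(-W) underflowed to zero")
    u = 1.0 - rng.random()    # uniform on (0, 1], so log(u) is finite
    # Inversion: P(K > k) = (1 - p)^k, hence K = floor(log(u) / log(1 - p)) + 1.
    return int(math.log(u) / math.log1p(-p)) + 1
```

For example, with <math>\rho = 2</math> the empirical frequencies of <math>K = 1</math> and <math>K = 2</math> in a large sample should be close to <math>f(1;2) = 2/3</math> and <math>f(2;2) = 1/6</math>.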
==Generalizations==

The two-parameter generalization of the original Yule distribution replaces the beta function with an [[incomplete beta function]]. The probability mass function of the generalized Yule–Simon(''ρ'', ''α'') distribution is defined as

:<math>
f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}} \;
\mathrm{B}_{1-\alpha}(k, \rho+1)
,
\,</math>

with <math>0 \leq \alpha < 1</math>. For <math>\alpha = 0</math> the ordinary Yule–Simon(''ρ'') distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
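
The generalized pmf can be evaluated with a small amount of numerical integration. The sketch below is illustrative only: it approximates the incomplete beta function <math>\mathrm{B}_x(a,b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt</math> with a composite Simpson rule rather than a library routine, and the function names and the number of subintervals are assumptions of this example.

```python
def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """Approximate B_x(a, b) = integral of t^(a-1) (1-t)^(b-1) over [0, x].

    Uses the composite Simpson rule with n (even) subintervals. Intended for
    a >= 1 and b >= 1, where the integrand is bounded on [0, x].
    """
    def integrand(t: float) -> float:
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    h = x / n
    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """f(k; rho, alpha) = rho / (1 - alpha^rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not 0.0 <= alpha < 1.0:
        raise ValueError("need k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1)
```

Setting <code>alpha = 0</code> recovers the ordinary pmf <math>\rho\,\mathrm{B}(k,\rho+1)</math>, and for <math>\alpha > 0</math> the values over <math>k = 1, 2, \dots</math> should sum to 1 up to integration error.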

<!-- This image doesn't seem to be attached to anything in the text:
[[File:Yule-Simon distribution.png|thumb|300px|Plot of the Yule–Simon(1) distribution (red) and its asymptotic Zipf law (blue)]]
-->

==See also==
* [[Beta function]]
* [[Preferential attachment]]

==Bibliography==
* Colin Rose and Murray D. Smith, ''Mathematical Statistics with Mathematica''. New York: Springer, 2002, ISBN 0-387-95234-9. (''See page 107, where it is called the "Yule distribution".'')

==References==
<references />

{{ProbDistributions|Yule–Simon distribution}}

{{DEFAULTSORT:Yule-Simon Distribution}}
[[Category:Discrete distributions]]
[[Category:Compound distributions]]
[[Category:Probability distributions]]