{{technical|date=January 2013}}
{{See also|Probit model}}
[[File:Probit plot.png|thumbnail|right|Plot of probit function]]
In [[probability theory]] and [[statistics]], the '''probit''' function is the [[quantile function]] associated with the standard [[normal distribution]]. It has applications in [[Q-Q plot|exploratory statistical graphics]] and specialized [[probit model|regression modeling of binary response variables]].

The standard [[normal distribution]] is commonly denoted N(0,1), and its [[cumulative distribution function]] is denoted <math>\Phi(z)</math>. As an example, consider the familiar fact that the standard normal distribution places 95% of its probability between −1.96 and 1.96 and is symmetric around zero. It follows that
:<math>\Phi(-1.96) = 0.025 = 1-\Phi(1.96).\,\!</math>

The probit function performs the inverse computation: given a cumulative probability, it returns the corresponding value of an N(0,1) random variable. Formally, the probit function is the inverse of <math>\Phi(z)</math>, denoted <math>\Phi^{-1}(p)</math>. Continuing the example,

:<math>\operatorname{probit}(0.025) = -1.96 = -\operatorname{probit}(0.975)</math>.

In general,

:<math> \Phi(\operatorname{probit}(p))=p</math>

:and

:<math>\operatorname{probit}(\Phi(z))=z.</math>
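
These identities can be checked numerically. The following is a minimal sketch in Python, assuming the standard library class <code>statistics.NormalDist</code> (Python 3.8 or later), whose <code>inv_cdf</code> method plays the role of the probit function and whose <code>cdf</code> method plays the role of <math>\Phi</math>. The wrapper names <code>Phi</code> and <code>probit</code> are introduced here only for illustration.

<source lang="python">
from statistics import NormalDist

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)  # the N(0,1) distribution


def Phi(z):
    """Standard normal cumulative distribution function."""
    return _STD_NORMAL.cdf(z)


def probit(p):
    """Quantile function of N(0,1); requires 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(p)


if __name__ == "__main__":
    print(probit(0.025), probit(0.975))  # equal magnitude, opposite sign
    print(Phi(probit(0.3)))              # round trip back to p
    print(probit(Phi(1.25)))             # round trip back to z
</source>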
==Conceptual development==
The idea of the probit function was published by [[Chester Ittner Bliss]] (1899–1979) in a 1934 article in ''[[Science (journal)|Science]]'' on how to treat data such as the percentage of a pest killed by a [[pesticide]].<ref>{{cite journal | journal=Science | volume=79 | issue=2037 | pages=38–39 | year=1934 | author=Bliss CI. | title=The method of probits | pmid=17813446 | doi =10.1126/science.79.2037.38 |jstor=1659792 }}</ref> Bliss proposed transforming the percentage killed into a "probability unit" (or "probit"), which is linearly related to the modern definition (he defined it arbitrarily as equal to 0 for 0.0001 and 10 for 0.9999). He included a table to help other researchers convert their kill percentages to his probit, which they could then plot against the logarithm of the dose and thereby, it was hoped, obtain a more or less straight line. Such a so-called [[probit model]] is still important in toxicology, as well as in other fields. The approach is justified in particular if response variation can be rationalized as a [[lognormal]] distribution of tolerances among the subjects tested, where the tolerance of a particular subject is the dose just sufficient for the response of interest.

The method introduced by Bliss was carried forward in ''Probit Analysis'', an important text on toxicological applications by [[D. J. Finney]].<ref>Finney, D.J. (1947), ''Probit Analysis''. (1st edition) Cambridge University Press, Cambridge, UK.</ref><ref>{{cite book| author=Finney, D.J. | year=1971 | title=Probit Analysis (3rd edition)| publisher= Cambridge University Press, Cambridge, UK| isbn=0-521-08041-X| oclc=174198382 }}</ref> Values tabled by Finney can be derived from probits as defined here by adding 5. This distinction is summarized by Collett (p. 55):<ref>{{cite book | author = Collett, D. | year=1991 | title=Modelling Binary Data | publisher=Chapman and Hall / CRC}}</ref> "The original definition of a probit [with 5 added was] primarily to avoid having to work with negative probits; ... This definition is still used in some quarters, but in the major statistical software packages for what is referred to as '''probit analysis''', probits are defined without the addition of 5." Probit methodology, including numerical optimization for fitting probit functions, was introduced before electronic computing became widely available. When working from tables, it was convenient to have probits uniformly positive. Common areas of application do not require positive probits.
==Diagnosing deviation of a distribution from normality==
{{main | Q-Q plot}}
In addition to providing a basis for important types of regression, the probit function is useful in statistical analysis for diagnosing deviation from normality, according to the method of Q-Q plotting. If a set of data is actually a [[Sample (statistics)|sample]] from a [[normal distribution]], a plot of the sorted values against their probit scores will be approximately linear. Specific deviations from normality such as [[skewness|asymmetry]], [[kurtosis|heavy tails]], or [[bimodal distribution|bimodality]] can be diagnosed from the corresponding deviations from linearity. While the Q-Q plot can be used for comparison with any distribution family (not only the normal), the normal Q-Q plot is a relatively standard exploratory data analysis procedure because the assumption of normality is often a starting point for analysis.
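
The construction of a normal Q-Q plot can be sketched as follows. This is a minimal illustration in Python rather than a reference implementation: the plotting positions <math>(i+0.5)/n</math> are one common convention and are an assumption here, the helper names <code>normal_qq_points</code> and <code>pearson_r</code> are hypothetical, and the correlation coefficient is used only as a crude numerical stand-in for judging linearity by eye.

<source lang="python">
import math
import random
from statistics import NormalDist


def normal_qq_points(data):
    """Pair each sorted observation with its probit score."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n: an assumed, commonly used convention.
    scores = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(scores, ordered))


def pearson_r(points):
    """Correlation of the Q-Q points; values near 1 indicate a straight line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    normal_sample = [rng.gauss(10.0, 2.0) for _ in range(500)]
    skewed_sample = [rng.expovariate(1.0) for _ in range(500)]
    print(pearson_r(normal_qq_points(normal_sample)))  # close to 1
    print(pearson_r(normal_qq_points(skewed_sample)))  # noticeably lower
</source>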
==Computation==
The normal distribution CDF and its inverse are not available in [[closed-form expression|closed form]], and computation requires careful use of numerical procedures. However, the functions are widely available in software for statistics and probability modeling, and in spreadsheets. In [[Microsoft Excel]], for example, the probit function is available as normsinv(p). In computing environments where numerical implementations of the [[error function|inverse error function]] are available, the probit function may be obtained as

:<math>\operatorname{probit}(p) = \sqrt{2}\,\operatorname{erf}^{-1}(2p-1).</math>

An example is [[MATLAB]], where an <code>erfinv</code> function is available. The language [[Mathematica]] implements <code>InverseErf</code>. Other environments implement the probit function directly, as shown in the following session in the [[R programming language]].

<source lang="rsplus">
> qnorm(0.025)
[1] -1.959964
> pnorm(-1.96)
[1] 0.02499790
</source>
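
Where only the forward error function is available, the relation <math>\operatorname{probit}(p) = \sqrt{2}\,\operatorname{erf}^{-1}(2p-1)</math> can still be used by inverting <math>\operatorname{erf}</math> numerically. The sketch below does this in Python with Newton's method applied to <code>math.erf</code>. The helper <code>erfinv</code> is written here for illustration and is not a standard library function; Newton's method is a substituted technique chosen for brevity, not the algorithm used by the environments named above, and it loses accuracy for <math>p</math> very close to 0 or 1.

<source lang="python">
import math
from statistics import NormalDist  # used only as an independent cross-check


def erfinv(y, tol=1e-15, max_iter=100):
    """Invert math.erf by Newton's method (illustrative helper)."""
    if not -1.0 < y < 1.0:
        raise ValueError("erfinv is only defined here for -1 < y < 1")
    x = 0.0
    for _ in range(max_iter):
        # d/dx erf(x) = 2 / sqrt(pi) * exp(-x^2)
        slope = 2.0 / math.sqrt(math.pi) * math.exp(-x * x)
        step = (math.erf(x) - y) / slope
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x


def probit(p):
    """probit(p) = sqrt(2) * erfinv(2p - 1)."""
    return math.sqrt(2.0) * erfinv(2.0 * p - 1.0)


if __name__ == "__main__":
    for p in (0.025, 0.5, 0.975):
        print(p, probit(p), NormalDist().inv_cdf(p))
</source>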

Details for computing the inverse error function can be found at [http://home.online.no/~pjacklam/notes/invnorm/]. Wichura gives a fast algorithm for computing the probit function to 16 decimal places; this is used in R to generate random variates for the normal distribution.<ref>{{cite journal |author=Wichura, M.J. |year=1988 |title=Algorithm AS241: The Percentage Points of the Normal Distribution |journal=Applied Statistics |volume=37 |pages=477–484 |doi=10.2307/2347330 |jstor=2347330 |issue=3 |publisher=Blackwell Publishing}}</ref>

===An ordinary differential equation for the probit function===
Another means of computation is based on forming a non-linear ordinary differential equation for the probit function, following the method of Steinbrecher and Shaw.<ref>{{cite journal| author= Steinbrecher, G., Shaw, W.T. | year=2008 | title=Quantile mechanics| journal= European Journal of Applied Mathematics| volume= 19 | issue=2| pages=87–112| doi = 10.1017/S0956792508007341 }}</ref> Abbreviating the probit function as <math>w(p)</math>, the ODE is

:<math>\frac{d w}{d p} = \frac{1}{f(w)} </math>

where <math>f(w)</math> is the probability density function evaluated at <math>w</math>.

In the case of the Gaussian:

:<math>\frac{d w}{d p} = \sqrt{2\pi} \ e^{\frac{w^2}{2}} </math>

Differentiating again:

:<math>\frac{d^2 w}{d p^2} = w \left(\frac{d w}{d p}\right)^2 </math>

with the centre (initial) conditions

:<math>w\left(1/2\right) = 0,</math>

:<math>w'\left(1/2\right) = \sqrt{2\pi}.</math>
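
As a rough illustration, the second-order equation with these centre conditions can be integrated numerically outward from <math>p = 1/2</math>. The sketch below uses a classical fourth-order Runge–Kutta scheme in Python. The choice of integrator, the step count and the function name <code>probit_ode</code> are assumptions made for this example and are not taken from Steinbrecher and Shaw; accuracy degrades as <math>p</math> approaches 0 or 1, where the derivative grows without bound.

<source lang="python">
import math
from statistics import NormalDist  # used only as an independent cross-check


def probit_ode(p, steps=4000):
    """Integrate w'' = w * (w')^2 from p = 1/2 with RK4 (illustrative sketch)."""
    w = 0.0                        # w(1/2) = 0
    v = math.sqrt(2.0 * math.pi)   # w'(1/2) = sqrt(2*pi)
    h = (p - 0.5) / steps          # negative when p < 1/2

    def deriv(w_, v_):
        # First-order system: w' = v, v' = w * v^2
        return v_, w_ * v_ * v_

    for _ in range(steps):
        k1w, k1v = deriv(w, v)
        k2w, k2v = deriv(w + 0.5 * h * k1w, v + 0.5 * h * k1v)
        k3w, k3v = deriv(w + 0.5 * h * k2w, v + 0.5 * h * k2v)
        k4w, k4v = deriv(w + h * k3w, v + h * k3v)
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return w


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.975):
        print(p, probit_ode(p), NormalDist().inv_cdf(p))
</source>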

This equation may be solved by several methods, including the classical power series approach. From this, solutions of arbitrarily high accuracy may be developed based on Steinbrecher's approach to the series for the inverse error function. The power series solution is given by

:<math> w(p) = \sqrt{\frac{\pi}{2}} \sum_{k=0}^{\infty} \frac{d_k}{2k+1}(2p-1)^{2k+1} </math>

where the coefficients <math>d_k </math> satisfy the non-linear recurrence

:<math>d_{k+1} = \frac{\pi}{4} \sum_{j=0}^k \frac{d_j d_{k-j}}{(j+1)(2j+1)}</math>

with <math>d_0=1</math>. In this form the ratio <math>d_{k+1}/d_k \rightarrow 1</math> as <math>k \rightarrow \infty</math>.
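
The recurrence and the series translate directly into code. The following Python sketch builds the coefficients <math>d_k</math> and sums a truncated series; the function names and the default number of terms are assumptions made for illustration. Because the ratio of successive coefficients tends to 1, the series converges slowly when <math>|2p-1|</math> is close to 1, so more terms are needed in the tails.

<source lang="python">
import math
from statistics import NormalDist  # used only as an independent cross-check


def probit_series_coefficients(n_terms):
    """Return [d_0, ..., d_{n_terms-1}] from the non-linear recurrence."""
    d = [1.0]  # d_0 = 1
    for k in range(n_terms - 1):
        total = sum(
            d[j] * d[k - j] / ((j + 1) * (2 * j + 1)) for j in range(k + 1)
        )
        d.append(math.pi / 4.0 * total)  # d_{k+1}
    return d


def probit_series(p, n_terms=60):
    """Truncated power series for w(p) in powers of (2p - 1)."""
    d = probit_series_coefficients(n_terms)
    x = 2.0 * p - 1.0
    partial = sum(d[k] / (2 * k + 1) * x ** (2 * k + 1) for k in range(n_terms))
    return math.sqrt(math.pi / 2.0) * partial


if __name__ == "__main__":
    print(probit_series(0.75), NormalDist().inv_cdf(0.75))
    # Further into the tail, many more terms are required.
    print(probit_series(0.975, n_terms=400), NormalDist().inv_cdf(0.975))
</source>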
<!--- are these numerically stable? --->

==Logit==
[[File:Logit-probit.svg|right|300px|thumb|Comparison of the [[logit function]] with a scaled probit (i.e. the inverse [[cumulative distribution function|CDF]] of the [[normal distribution]]), comparing <math>\operatorname{logit}(x)</math> vs. <math>\Phi^{-1}(x)/\sqrt{\frac{\pi}{8}}</math>, which makes the slopes the same at the origin.]]

Closely related to the probit function (and [[probit model]]) are the [[logit]] function and [[logit model]]. The logit function is the inverse of the logistic function and is given by

:<math>\operatorname{logit}(p)=\log\left( \frac{p}{1-p} \right).</math>

Analogously to the probit model, we may assume that such a quantity is related linearly to a set of predictors, resulting in the [[logit model]], which is the basis in particular of the [[logistic regression]] model, the most prevalent form of [[regression analysis]] for categorical response data. In current statistical practice, probit and logit regression models are often handled as cases of the [[generalized linear model]].
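
The scaling mentioned in the figure caption can be checked numerically. The Python sketch below compares <math>\operatorname{logit}(p)</math> with <math>\Phi^{-1}(p)/\sqrt{\pi/8}</math>; the function names are hypothetical and chosen only for this example. Both curves have slope 4 at <math>p = 1/2</math>, and the logit grows faster in the tails.

<source lang="python">
import math
from statistics import NormalDist


def logit(p):
    """Log-odds of p; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def scaled_probit(p):
    """Probit divided by sqrt(pi/8), the scaling used in the figure caption."""
    return NormalDist().inv_cdf(p) / math.sqrt(math.pi / 8.0)


def central_slope(func, p, h=1e-5):
    """Central-difference estimate of the derivative of func at p."""
    return (func(p + h) - func(p - h)) / (2.0 * h)


if __name__ == "__main__":
    print(central_slope(logit, 0.5), central_slope(scaled_probit, 0.5))
    for p in (0.6, 0.75, 0.9, 0.99):
        print(p, logit(p), scaled_probit(p))
</source>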

==See also==
*[[Detection error tradeoff]] graphs (DET Graphs, an alternative to the ROC)
*[[Logistic regression]] (a.k.a. logit model)
*[[Logit]]
*[[Probit model]]
*[[Multinomial probit]]
*[[Q-Q plot]]
*[[Continuous function]]
*[[Monotonic function]]
*[[Quantile function]]
*[[Sigmoid function]]
*[[Rankit]] analysis, also developed by Chester Bliss
*[[Ridit scoring]]

==References==
{{reflist}}

[[Category:Statistical terminology]]
[[Category:Data analysis]]
[[Category:Single-equation methods (econometrics)]]
[[Category:Econometrics]]
[[Category:Normal distribution]]
[[Category:Statistical functions]]
[[Category:Probability theory]]

[[ru:Пробит регрессия]]