| {{Cleanup|date=March 2008}}
| | |
| In [[statistics]], the '''multinomial test''' is the test of the [[null hypothesis]] that the parameters of a [[multinomial distribution]] equal specified values. It is used for categorical data; see Read and Cressie.<ref>Read, T. R. C. and Cressie, N. A. C. (1988). ''Goodness-of-fit statistics for discrete multivariate data.'' New York: Springer-Verlag. ISBN 0-387-96682-X.</ref>
| |
| | |
| We begin with a sample of <math>N</math> items each of which has been observed to fall into one of <math>k</math> categories. We can define <math>\mathbf{x} = (x_1, x_2, \dots, x_k)</math> as the observed numbers of items in each cell. Hence <math>\textstyle \sum_{i=1}^k x_{i} = N</math>.
| |
| | |
| Next, we define a vector of parameters <math>H_0: \boldsymbol{\pi} = (\pi_{1}, \pi_{2}, \dots, \pi_{k})</math>, where <math>\textstyle \sum_{i=1}^k \pi_{i} = 1</math>. These are the parameter values under the [[null hypothesis]].
| |
| | |
| The exact probability of the observed configuration <math>\mathbf{x} </math> under the null hypothesis is given by
| |
| | |
| :<math>\Pr(\mathbf{x})_0 = N! \prod_{i=1}^k \frac{\pi_{i}^{x_i}}{x_i!}.</math>
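
The formula above can be evaluated directly for small samples. The following is a minimal sketch, assuming the counts and null probabilities are passed as plain Python sequences; the function name `multinomial_pmf` is illustrative and does not come from the sources cited here.

```python
from math import factorial, prod


def multinomial_pmf(x, pi):
    """Exact probability N! * prod(pi_i**x_i / x_i!) of the counts x.

    x  -- observed counts per category (non-negative integers)
    pi -- null-hypothesis probabilities per category (should sum to 1)
    """
    n = sum(x)  # N, the total sample size
    return factorial(n) * prod(p ** xi / factorial(xi) for xi, p in zip(x, pi))


if __name__ == "__main__":
    # Three categories, N = 3: 3! * (0.5**2 / 2!) * (0.3**1 / 1!) * (0.2**0 / 0!)
    print(multinomial_pmf([2, 1, 0], [0.5, 0.3, 0.2]))
```

For the counts (2, 1, 0) and null probabilities (0.5, 0.3, 0.2), the result is approximately 0.225.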
| |
| | |
| The significance probability for the test is the probability of occurrence of the data set observed, or of a data set less likely than that observed, if the null hypothesis is true. Using an [[exact test]], this is calculated as
| |
| | |
| :<math>\Pr(\mathrm{sig}) = \sum_{\mathbf{y}:\, \Pr(\mathbf{y}) \le \Pr(\mathbf{x})_0} \Pr(\mathbf{y})</math>
| |
| | |
| where the sum ranges over all outcomes <math>\mathbf{y}</math> that are as likely as, or less likely than, the observed one. The number of possible outcomes grows rapidly with <math>k</math> and <math>N</math>, so the exact test becomes computationally onerous and is generally worth using only for small samples. For larger samples, asymptotic approximations are accurate enough and easier to calculate.
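
The exact significance probability can be sketched by brute-force enumeration of every possible outcome. This is an illustrative sketch rather than an optimised implementation: the helper names (`compositions`, `exact_multinomial_pvalue`) and the small relative tolerance used to treat floating-point ties as equal are assumptions made for this example.

```python
from math import factorial, prod


def multinomial_pmf(x, pi):
    """Exact multinomial probability of the counts x under probabilities pi."""
    return factorial(sum(x)) * prod(p ** xi / factorial(xi) for xi, p in zip(x, pi))


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def exact_multinomial_pvalue(x, pi, rel_tol=1e-9):
    """Sum the probabilities of all outcomes no more likely than x.

    rel_tol is an assumption of this sketch: it keeps outcomes whose
    probability equals that of x from being dropped by rounding error.
    """
    p_obs = multinomial_pmf(x, pi)
    threshold = p_obs * (1 + rel_tol)
    return sum(
        p
        for p in (multinomial_pmf(y, pi) for y in compositions(sum(x), len(x)))
        if p <= threshold
    )


if __name__ == "__main__":
    # All three items in one cell, uniform null hypothesis over three cells.
    print(exact_multinomial_pvalue((3, 0, 0), (1 / 3, 1 / 3, 1 / 3)))
```

With three items all falling in one of three equally likely cells, only the three "all in one cell" outcomes are as unlikely as the observation, so the result is approximately 3/27. The enumeration visits every composition of <math>N</math> into <math>k</math> parts, which is why this approach is practical only for small samples.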
| |
| | |
| One of these approximations is the [[Likelihood-ratio test|likelihood ratio]]. We set up an [[alternative hypothesis]] under which each value <math>\pi_{i}</math> is replaced by its maximum likelihood estimate <math>p_{i}=x_{i}/N</math>. The exact probability of the observed configuration <math>\mathbf{x} </math> under the alternative hypothesis is given by
| |
| | |
| : <math>\Pr(\mathbf{x})_A = N! \prod_{i=1}^k \frac{p_{i}^{x_i}}{x_i!}.</math>
| |
| | |
| The natural logarithm of the ratio of these two probabilities (null over alternative), multiplied by <math>-2</math>, is then the statistic for the [[likelihood ratio test]]:
| |
| | |
| : <math>-2\ln(LR) = \textstyle -2\sum_{i=1}^k x_{i}\ln(\pi_{i}/p_{i}) .</math>
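
As a minimal sketch of the statistic above, the sum can be computed directly from the counts. The function name is illustrative, and skipping cells with <math>x_i = 0</math> is an assumption of this sketch (it follows the usual convention that such terms contribute zero).

```python
from math import log


def likelihood_ratio_statistic(x, pi):
    """Return -2 ln(LR) = -2 * sum(x_i * ln(pi_i / p_i)) with p_i = x_i / N.

    Cells with x_i == 0 are skipped (assumed to contribute zero).
    """
    n = sum(x)
    return -2.0 * sum(
        xi * log(p / (xi / n)) for xi, p in zip(x, pi) if xi > 0
    )


if __name__ == "__main__":
    # Counts that match the null proportions exactly give a statistic of zero.
    print(likelihood_ratio_statistic([10, 10, 10], [1 / 3, 1 / 3, 1 / 3]))
    # A strongly unbalanced sample gives a large statistic.
    print(likelihood_ratio_statistic([30, 0, 0], [1 / 3, 1 / 3, 1 / 3]))
```

The statistic is zero when the observed proportions equal the null probabilities and grows as they diverge.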
| |
| | |
| If the null hypothesis is true, then as <math>N</math> increases, the distribution of <math>-2\ln(LR)</math> converges to that of [[Chi-squared distribution|chi-squared]] with <math>k-1</math> degrees of freedom. However, it has long been known (e.g. Lawley 1956) that for finite sample sizes the moments of <math>-2\ln(LR)</math> are greater than those of chi-squared, which inflates the probability of [[Type I and type II errors|type I errors]] (false positives). The difference between the moments of chi-squared and those of the test statistic is a function of <math>N^{-1}</math>. Williams (1976) showed that the first moment can be matched as far as <math>N^{-2}</math> if the test statistic is divided by a factor given by
| |
| | |
| : <math>q_1 = 1+\frac{\sum_{i=1}^k \pi_{i}^{-1}-1}{6N(k-1)}. </math>
| |
| | |
| In the special case where the null hypothesis is that all the values <math>\pi_{i}</math> are equal to <math>1/k</math> (i.e. it stipulates a uniform distribution), this simplifies to
| |
| | |
| : <math>q_1 = 1+\frac{k+1}{6N}. </math>
| |
| | |
| Subsequently, Smith et al. (1981) derived a dividing factor which matches the first moment as far as <math>N^{-3}</math>. For the case of equal values of <math>\pi_{i}</math>, this factor is
| |
| | |
| : <math>q_2 = 1+\frac{k+1}{6N}+\frac{k^2}{6N^2}. </math>
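
The correction factors above are simple to evaluate. The sketch below (function names are illustrative) computes the general <math>q_1</math> and the two uniform-case expressions exactly as given here, and checks numerically that the general <math>q_1</math> reduces to the uniform form: when every <math>\pi_i = 1/k</math>, <math>\textstyle\sum \pi_i^{-1} = k^2</math>, so the numerator becomes <math>k^2 - 1 = (k-1)(k+1)</math>.

```python
def williams_q1(pi, n):
    """General first-order factor: 1 + (sum(1/pi_i) - 1) / (6 N (k - 1))."""
    k = len(pi)
    return 1 + (sum(1 / p for p in pi) - 1) / (6 * n * (k - 1))


def q1_uniform(k, n):
    """Uniform-null special case of q1: 1 + (k + 1) / (6 N)."""
    return 1 + (k + 1) / (6 * n)


def q2_uniform(k, n):
    """Uniform-null q2 as stated in the text: q1 plus k**2 / (6 N**2)."""
    return q1_uniform(k, n) + k ** 2 / (6 * n ** 2)


if __name__ == "__main__":
    k, n = 4, 20
    uniform = [1 / k] * k
    # The general formula and the uniform shortcut should agree.
    print(williams_q1(uniform, n), q1_uniform(k, n))
    # The corrected statistic is the raw -2 ln(LR) divided by q1 or q2.
    print(q2_uniform(k, n))
```

Because both factors exceed 1, dividing <math>-2\ln(LR)</math> by either of them shrinks the statistic, which counteracts the upward bias described above.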
| |
| | |
| The null hypothesis can also be tested by using [[Pearson's chi-squared test]]
| |
| | |
| :<math> \chi^2 = \sum_{i=1}^{k} {(x_i - E_i)^2 \over E_i}</math>
| | |
| where <math>E_i=N\pi_i</math> is the expected number of cases in category <math>i</math> under the null hypothesis. This statistic also converges to a chi-squared distribution with <math>k-1</math> degrees of freedom when the null hypothesis is true, but it does so from below rather than from above as <math>-2\ln(LR)</math> does, so it may be preferable to the uncorrected version of <math>-2\ln(LR)</math> for small samples.
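
A minimal sketch of Pearson's statistic, using the same inputs as the other statistics in this article; the function name is illustrative.

```python
def pearson_chi_squared(x, pi):
    """Return sum((x_i - E_i)**2 / E_i) with expected counts E_i = N * pi_i."""
    n = sum(x)
    return sum((xi - n * p) ** 2 / (n * p) for xi, p in zip(x, pi))


if __name__ == "__main__":
    # N = 60 with null probabilities (0.25, 0.25, 0.5): E = (15, 15, 30).
    print(pearson_chi_squared([10, 20, 30], [0.25, 0.25, 0.5]))
```

In this example the first two cells each contribute 25/15 and the third contributes nothing, so the statistic is approximately 3.33, to be compared with a chi-squared distribution on <math>k-1 = 2</math> degrees of freedom.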
| |
| | |
| == References ==
| |
| <references/>
| |
| *{{cite journal | author=Lawley, D. N. | title=A General Method of Approximating to the Distribution of Likelihood Ratio Criteria | journal=Biometrika | year=1956 | volume=43 | pages=295–303}}
| |
| *{{cite journal | author=Smith, P. J., Rae, D. S., Manderscheid, R. W. and Silbergeld, S.| title=Approximating the Moments and Distribution of the Likelihood Ratio Statistic for Multinomial Goodness of Fit | journal=Journal of the American Statistical Association | year=1981 | volume=76 | pages=737–740 | doi=10.2307/2287541 | issue=375 | publisher=American Statistical Association | jstor=2287541}}
| |
| *{{cite journal | author=Williams, D. A. | title=Improved Likelihood Ratio Tests for Complete Contingency Tables | journal=Biometrika | year=1976 | volume=63 | pages=33–37 | doi=10.1093/biomet/63.1.33}}
| |
| | |
| [[Category:Categorical data]]
| |
| [[Category:Statistical tests]]
| |
| [[Category:Non-parametric statistics]]
| |