{{cleanup|date=December 2010}}
{{Probability distribution
  | box_width  = 350px
  | type      = multivariate
  | notation  = <math>\textrm{NM}(k_0,\,p)</math>
  | parameters = ''k''<sub>0</sub> ∈ '''N'''<sub>0</sub> — the number of failures before the experiment is stopped,<br/>''p'' ∈ '''R'''<sup>''m''</sup> — ''m''-vector of “success” probabilities,<br/><hr>''p''<sub>0</sub> = 1 − (''p''<sub>1</sub>+…+''p''<sub>''m''</sub>) — the probability of a “failure”.
  | support    = <math>k_i \in \{0,1,2,\ldots\}, 1\leq i\leq m</math>
  | pdf        = <math>\Gamma\!\left(\sum_{i=0}^m{k_i}\right)\frac{p_0^{k_0}}{\Gamma(k_0)} \prod_{i=1}^m{\frac{p_i^{k_i}}{k_i!}},</math><br/> where Γ(''x'') is the [[Gamma function]].
  | cdf        =
  | mean      = <math> \tfrac{k_0}{p_0}\,p </math>
  | variance  = <math> \tfrac{k_0}{p_0^2}\,pp' + \tfrac{k_0}{p_0}\,\operatorname{diag}(p) </math>
  | mode      =
  | entropy    =
  | mgf        =
  | cf        = <math>\bigg(\frac{p_0}{1 - p'e^{it}}\bigg)^{\!k_0}</math>
  }}
 
In [[probability theory]] and [[statistics]], the '''negative multinomial distribution''' is a generalization of the [[negative binomial distribution]] (NB(''r'',&thinsp;''p'')) to more than two outcomes.<ref name="LeGall">Le Gall, F. The modes of a negative multinomial distribution, Statistics & Probability Letters, Volume 76, Issue 6, 15 March 2006, Pages 619-624, ISSN 0167-7152, [http://www.sciencedirect.com/science/article/B6V1D-4H7T8P0-1/2/54b376fc96fdd6ad4331325a822df997 10.1016/j.spl.2005.09.009].</ref>
 
Suppose we have an experiment that generates ''m''+1≥2 possible outcomes, {''X''<sub>0</sub>,…,''X''<sub>''m''</sub>}, each occurring with non-negative probabilities {''p''<sub>0</sub>,…,''p''<sub>''m''</sub>} respectively. If sampling proceeded until ''n'' observations were made, then {''X''<sub>0</sub>,…,''X''<sub>''m''</sub>} would have been [[multinomial distribution|multinomially distributed]]. However, if the experiment is stopped once ''X''<sub>0</sub> reaches the predetermined value ''k''<sub>0</sub>, then the distribution of the ''m''-tuple {''X''<sub>1</sub>,…,''X''<sub>''m''</sub>} is ''negative multinomial''.
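
The stopping rule described above can be illustrated with a small simulation. This is only a sketch: the function name <code>sample_negative_multinomial</code>, the seed, and the parameter values are assumptions chosen for illustration. It runs trials until outcome 0 has occurred ''k''<sub>0</sub> times and returns the counts of the other outcomes; the sample means can then be compared with the means ''k''<sub>0</sub>''p''<sub>''i''</sub>/''p''<sub>0</sub> listed in the infobox.

```python
import random

def sample_negative_multinomial(k0, p, rng):
    """Draw one vector (X_1, ..., X_m).

    Trials are repeated until outcome 0 (the "failure") has occurred k0 times;
    the counts of outcomes 1..m at that moment are returned.
    """
    p0 = 1.0 - sum(p)                 # probability of outcome 0
    weights = [p0] + list(p)
    counts = [0] * len(weights)
    while counts[0] < k0:
        outcome = rng.choices(range(len(weights)), weights=weights)[0]
        counts[outcome] += 1
    return counts[1:]

rng = random.Random(0)                # fixed seed, illustrative only
k0, p = 10, [0.2, 0.1, 0.2]           # illustrative parameter values
draws = [sample_negative_multinomial(k0, p, rng) for _ in range(5000)]
sample_means = [sum(d[i] for d in draws) / len(draws) for i in range(len(p))]
theoretical_means = [k0 * pi / (1.0 - sum(p)) for pi in p]
print(sample_means)
print(theoretical_means)
```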
 
==Negative multinomial distribution example==
The table below shows an example of 400 [[melanoma]] (skin cancer) patients, where the type and site of the cancer are recorded for each subject.
<center>
{| class="wikitable" style="text-align:center" border="1"
|-
| rowspan="2"|Type || colspan="3" align="center"|Site || rowspan="2"|Totals
|-
| Head and Neck || Trunk || Extremities
|-
| Hutchinson's melanomic freckle || 22 || 2 || 10 || 34
|-
| Superficial || 16 || 54 || 115 || 185
|-
| Nodular || 19 || 33 || 73 || 125
|-
| Indeterminant || 11 || 17 || 28 || 56
|-
| Column Totals || 68 || 106 || 226 || 400
|}
</center>
 
The sites (locations) of the cancer may be independent, but there may be positive dependencies between the types of cancer at a given location (site). For example, localized exposure to radiation implies that an elevated level of one type of cancer (at a given location) may indicate a higher level of another cancer type at the same location. The negative multinomial distribution may be used to model the cancer rates at each site and to help measure some of the cancer type dependencies within each location.
 
If <math>x_{i,j}</math> denotes the observed count for cancer type <math>i</math> (<math>0\leq i \leq 3</math>, the table rows) at site <math>j</math> (<math>0\leq j \leq 2</math>, the table columns), then for a fixed site (<math>j_0</math>) the counts of the cancer types are modeled as a negative multinomial random vector. That is, for each column index (site) the column-vector X has the following distribution:
: <math>X=\{X_1, X_2, X_3\} \sim NM(k_0,\{p_1,p_2,p_3\})</math>.
Different columns in the table (sites) are considered to be different instances of the negative multinomially distributed random vector, X. Then we have the following estimates of expected counts (frequencies of cancer):
: <math>\hat{\mu}_{i,j} = \frac{x_{i,.}\times x_{.,j}}{x_{.,.}}</math>
: <math>x_{i,.} = \sum_{j=0}^{2}{x_{i,j}}</math>
: <math>x_{.,j} = \sum_{i=0}^{3}{x_{i,j}}</math>
: <math>x_{.,.} = \sum_{i=0}^{3}\sum_{j=0}^{2}{{x_{i,j}}}</math>
: Example: <math>\hat{\mu}_{0,0} = \frac{x_{0,.}\times x_{.,0}}{x_{.,.}}=\frac{34\times 68}{400}=5.78</math>
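
The expected counts can be reproduced from the table with a few lines of Python. This is a minimal sketch; the variable names are assumptions, and rows are taken to be cancer types and columns to be sites, as in the table. It computes the row totals, column totals and grand total, then the product-over-total estimate for every cell.

```python
# Observed counts: one row per cancer type, one column per site.
sites = ["Head and Neck", "Trunk", "Extremities"]
table = {
    "Hutchinson's melanomic freckle": [22, 2, 10],
    "Superficial": [16, 54, 115],
    "Nodular": [19, 33, 73],
    "Indeterminant": [11, 17, 28],
}

row_totals = {t: sum(counts) for t, counts in table.items()}
col_totals = [sum(counts[j] for counts in table.values()) for j in range(len(sites))]
grand_total = sum(row_totals.values())

# Expected count for each (type, site) cell: row total * column total / grand total.
expected = {
    t: [row_totals[t] * col_totals[j] / grand_total for j in range(len(sites))]
    for t in table
}

for t, values in expected.items():
    print(t, [round(v, 2) for v in values])
```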
 
For the first site (Head and Neck, j=0), suppose that <math>X=\left \{X_1=5, X_2=1, X_3=5\right \}</math> and <math>X \sim NM(k_0=10, \{p_1=0.2, p_2=0.1, p_3=0.2 \})</math>. Then:
: <math>p_0 = 1 - \sum_{i=1}^3{p_i}=0.5</math>
: <math>NM(X|k_0,\{p_1, p_2, p_3\})= 0.00465585119998784 </math>
: <math>cov[X_1,X_3] = \frac{10 \times 0.2 \times 0.2}{0.5^2}=1.6</math>
: <math>\mu_2=\frac{k_0 p_2}{p_0} = \frac{10\times 0.1}{0.5}=2.0</math>
: <math>\mu_3=\frac{k_0 p_3}{p_0} = \frac{10\times 0.2}{0.5}=4.0</math>
: <math>corr[X_2,X_3] = \left (\frac{\mu_2 \times \mu_3}{(k_0+\mu_2)(k_0+\mu_3)} \right )^{\frac{1}{2}}</math> and therefore, <math>corr[X_2,X_3] = \left (\frac{2 \times 4}{(10+2)(10+4)} \right )^{\frac{1}{2}} = 0.21821789023599242. </math>
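
The numbers in this example can be checked numerically. The sketch below uses only the Python standard library; the function names are assumptions. It evaluates the probability mass function from the infobox on the log scale (assuming every <math>p_i > 0</math>), and the covariance, mean and correlation formulas used above.

```python
import math

def nm_pmf(x, k0, p):
    """Negative multinomial probability of the count vector x (assumes all p_i > 0)."""
    p0 = 1.0 - sum(p)
    log_pmf = math.lgamma(k0 + sum(x)) - math.lgamma(k0) + k0 * math.log(p0)
    for xi, pi in zip(x, p):
        log_pmf += xi * math.log(pi) - math.lgamma(xi + 1)
    return math.exp(log_pmf)

def nm_mean(i, k0, p):
    """Mean of X_i (0-based index into p): k0 * p_i / p0."""
    return k0 * p[i] / (1.0 - sum(p))

def nm_cov(i, j, k0, p):
    """Covariance of X_i and X_j for i != j: k0 * p_i * p_j / p0^2."""
    p0 = 1.0 - sum(p)
    return k0 * p[i] * p[j] / p0 ** 2

def nm_corr(i, j, k0, p):
    """Correlation of X_i and X_j for i != j, written in terms of the means."""
    mu_i, mu_j = nm_mean(i, k0, p), nm_mean(j, k0, p)
    return math.sqrt(mu_i * mu_j / ((k0 + mu_i) * (k0 + mu_j)))

k0, p, x = 10, [0.2, 0.1, 0.2], [5, 1, 5]
pmf = nm_pmf(x, k0, p)
cov_13 = nm_cov(0, 2, k0, p)    # cov[X_1, X_3]
mu_2 = nm_mean(1, k0, p)
mu_3 = nm_mean(2, k0, p)
corr_23 = nm_corr(1, 2, k0, p)  # corr[X_2, X_3]
print(pmf, cov_13, mu_2, mu_3, corr_23)
```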
 
Notice that the pairwise NM correlations are always positive, whereas the correlations between [[Multinomial distribution|multinomial counts]] are always negative. As the parameter <math>k_0</math> increases with the means <math>\mu_i</math> held fixed, the pairwise correlations tend to zero. Thus, for large <math>k_0</math>, the negative multinomial counts <math>X_i</math> behave approximately as ''independent'' [[Poisson distribution|Poisson random variables]] with means <math>\left ( \mu_i= k_0\frac{p_i}{p_0}\right )</math>.
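
The decay of the correlation can be seen by holding the means fixed and increasing <math>k_0</math>. The following sketch (the function name and the grid of <math>k_0</math> values are assumptions) reuses the means <math>\mu_2=2</math> and <math>\mu_3=4</math> from the example and evaluates the correlation formula for several values of <math>k_0</math>.

```python
import math

def nm_correlation(mu_a, mu_b, k0):
    """Pairwise negative multinomial correlation expressed through the two means."""
    return math.sqrt(mu_a * mu_b / ((k0 + mu_a) * (k0 + mu_b)))

k0_values = [10, 100, 1000, 10000, 1000000]
correlations = [nm_correlation(2.0, 4.0, k0) for k0 in k0_values]
for k0, corr in zip(k0_values, correlations):
    print(k0, corr)
```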
 
The [[marginal distribution]] of each of the <math>X_i</math> variables is [[negative binomial]], as the <math>X_i</math> count (considered as success) is measured against all the other outcomes (failure). But jointly, the distribution of <math>X=\{X_1,\cdots,X_m\}</math> is negative multinomial, i.e., <math>X \sim NM(k_0,\{p_1,\cdots,p_m\})</math> .
 
==Parameter estimation==
* The mean (expected) frequency count (<math>\mu_j</math>) of each outcome (<math>X_j</math>) can be estimated by maximum likelihood. If we have a single observation vector <math>\{x_1, \cdots,x_m\}</math>, then <math>\hat{\mu}_i=x_i.</math> If we have several observation vectors, as in this case, where the cancer type frequencies are available for 3 different sites, then the maximum likelihood estimates of the mean counts are <math>\hat{\mu}_j=\frac{x_{j,.}}{I}</math>, where <math>0\leq j \leq J</math> is the cancer-type index, <math>x_{j,.}</math> is the sum of the counts for type <math>j</math> over the observed (sampled) vectors, and <math>I</math> is the number of those vectors. For the cancer data above, we have the following estimates of the expected frequency counts:
::: Hutchinson's melanomic freckle type of cancer (<math>X_0</math>) is <math>\hat{\mu}_0 = 34/3=11.33</math>.
::: Superficial type of cancer (<math>X_1</math>) is <math>\hat{\mu}_1 = 185/3=61.67</math>.
::: Nodular type of cancer (<math>X_2</math>) is <math>\hat{\mu}_2 = 125/3=41.67</math>.
::: Indeterminant type of cancer (<math>X_3</math>) is <math>\hat{\mu}_3 = 56/3=18.67</math>.
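
These averages follow directly from the row totals of the table. A minimal sketch, in which the variable names are assumptions:

```python
# Counts per cancer type across the 3 sites (Head and Neck, Trunk, Extremities).
counts_by_type = {
    "Hutchinson's melanomic freckle": [22, 2, 10],
    "Superficial": [16, 54, 115],
    "Nodular": [19, 33, 73],
    "Indeterminant": [11, 17, 28],
}

# Mean count per type: row total divided by the number of observed vectors (sites).
mean_counts = {t: sum(c) / len(c) for t, c in counts_by_type.items()}

for cancer_type, mu_hat in mean_counts.items():
    print(cancer_type, round(mu_hat, 2))
```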
 
* There is no [[Maximum likelihood estimation|MLE estimate]] for the NM <math>k_0</math> parameter.<ref name="LeGall" /><ref>{{Cite book| last = Zelterman| first = Daniel |  title = Advanced log-linear models using SAS | publisher = SAS Publishing | year = 2002 | isbn = 978-1-59047-080-0 | page=196 }}</ref> However, there are approximate protocols for estimating the <math>k_0</math> parameter using the [[Chi square distribution|chi-squared goodness of fit statistic]]. In the usual chi-squared statistic:
: <math>\Chi^2 = \sum_i{\frac{(x_i-\mu_i)^2}{\mu_i}}</math>, we can replace the expected-means (<math>\mu_i</math>) by their estimates, <math>\hat{\mu_i}</math>, and replace denominators by the corresponding negative multinomial variances. Then we get the following test statistic for negative multinomial distributed data:
: <math>\Chi^2(k_0) = \sum_{i}{\frac{(x_i-\hat{\mu_i})^2}{\hat{\mu_i} \left (1+ \frac{\hat{\mu_i}}{k_0} \right )}}</math>.
 
: Next, we can estimate the <math>k_0</math> parameter by varying the values of <math>k_0</math> in the expression <math>\Chi^2(k_0)</math> and matching the values of this statistic with the corresponding asymptotic chi-squared distribution. The following protocol summarizes these steps using the cancer data above.
:: ''DF'': The [[Chi-squared distribution|degrees of freedom for the chi-squared distribution]] in this case are:
::: df = (# rows – 1)(# columns – 1) = (4-1)*(3-1) = 6
 
:: ''Median'': The value used here for the median of a chi-squared random variable with 6 df is 5.261948 (the exact median is approximately 5.348).
 
:: ''Mean Counts Estimates'': The mean counts estimates (<math>\hat{\mu}_j</math>) for the three cancer types used in the equation below are:
:::<math>\hat{\mu}_1 = 185/3=61.67</math>; <math>\hat{\mu}_2 = 125/3=41.67</math>; and <math>\hat{\mu}_3 = 56/3=18.67</math>.
 
: Thus, we can solve the equation <math>\Chi^2(k_0) = 5.261948</math> for the single variable of interest, the unknown parameter <math>k_0</math>. In the cancer example, suppose <math>x=\{x_1=5,x_2=1,x_3=5\}</math>. The solution is then an estimate of <math>k_0</math> driven by the asymptotic chi-squared distribution.
: <math>\Chi^2(k_0) = \sum_{i=1}^3{\frac{(x_i-\hat{\mu_i})^2}{\hat{\mu_i} \left (1+ \frac{\hat{\mu_i}}{k_0} \right )}}</math>.
: <math>\Chi^2(k_0) = \frac{(5-61.67)^2}{61.67(1+61.67/k_0)}+\frac{(1-41.67)^2}{41.67(1+41.67/k_0)}+\frac{(5-18.67)^2}{18.67(1+18.67/k_0)}=5.261948.</math> Solving this equation for <math>k_0</math> provides the desired estimate for the last parameter.
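:: [[Mathematica]] provides 3 distinct (<math>k_0</math>) solutions to this equation: {-50.5466, -21.5204, '''2.40461'''}. The left-hand side is increasing in <math>k_0</math> for <math>k_0>0</math>, so the equation has only one positive root. Since <math>k_0>0</math>, the only admissible solution is <math>k_0 \approx 2.40461</math>.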
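
The positive root can also be found without a computer algebra system. The sketch below is an illustration under stated assumptions: it uses the rounded mean estimates and the target value 5.261948 exactly as they appear in the equation above, and a plain bisection search whose bracket <code>[1e-9, 1000]</code> and function names are arbitrary choices. Each term of the statistic has the form <math>c\,k_0/(k_0+\hat{\mu}_i)</math> with <math>c>0</math>, which increases with <math>k_0</math> for <math>k_0>0</math>, so bisection on a positive bracket is safe.

```python
def chi_square_statistic(k0, x, mu_hat):
    """Chi-squared statistic with negative multinomial variances mu * (1 + mu / k0)."""
    return sum((xi - mi) ** 2 / (mi * (1.0 + mi / k0)) for xi, mi in zip(x, mu_hat))

def solve_k0(target, x, mu_hat, lo=1e-9, hi=1000.0, tol=1e-10):
    """Bisection for the positive k0 with chi_square_statistic(k0) == target.

    Assumes the statistic is below the target at lo and above it at hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi_square_statistic(mid, x, mu_hat) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = [5, 1, 5]                      # observed counts from the example
mu_hat = [61.67, 41.67, 18.67]     # rounded mean estimates used in the equation
target = 5.261948                  # target value quoted in the text
k0_estimate = solve_k0(target, x, mu_hat)
print(k0_estimate)
```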
 
* Estimates of probabilities: Assume <math>k_0=2</math> (the solution 2.40461 rounded down to an integer) and <math>\frac{\mu_i}{k_0}p_0=p_i</math>, which follows from <math>\mu_i = k_0\frac{p_i}{p_0}</math>. Approximating each ratio <math>\hat{\mu}_i/k_0</math> by an integer gives:
: <math>\frac{61.67}{k_0}p_0\approx 31p_0=p_1</math>
: <math>\frac{41.67}{k_0}p_0\approx 20p_0=p_2</math>
: <math>\frac{18.67}{k_0}p_0\approx 9p_0=p_3</math>
: Hence, <math>1-p_0=p_1+p_2+p_3=60p_0</math>, and <math>p_0=\frac{1}{61}</math>, <math>p_1=\frac{31}{61}</math>, <math>p_2=\frac{20}{61}</math> and <math>p_3=\frac{9}{61}</math>.
: Therefore, the fitted model distribution for the observed sample <math>x=\{x_1=5,x_2=1,x_3=5\}</math> is <math>X \sim NM\left (2, \left \{\frac{31}{61}, \frac{20}{61},\frac{9}{61}\right\} \right ).</math>
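
The last step is simple arithmetic and can be written out with exact fractions. In this sketch the function name is an assumption, and the integer ratios 31, 20 and 9 are taken from the text as given rather than recomputed. Since <math>p_i = r_i p_0</math> and the probabilities sum to 1, <math>p_0 = 1/(1+\sum_i r_i)</math>.

```python
from fractions import Fraction

def probabilities_from_ratios(ratios):
    """Given r_i = mu_i / k0, return (p0, [p_1, ..., p_m]) with p_i = r_i * p0."""
    p0 = Fraction(1, 1 + sum(ratios))
    return p0, [r * p0 for r in ratios]

ratios = [31, 20, 9]               # integer ratios used in the text
p0, p = probabilities_from_ratios(ratios)
print(p0, p)
```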
 
==Related distributions==
* [[Negative binomial distribution]]
* [[Multinomial distribution]]
 
==References==
<references />
 
==Further reading==
{{cite book|last1=Johnson|first1= Norman L.| last2=Kotz|first2=Samuel| last3= Balakrishnan|first3=N.|
title=Discrete Multivariate Distributions|chapter=Chapter 36: Negative Multinomial and Other Multinomial-Related Distributions|
year=1997|publisher=Wiley|isbn=0-471-12844-9}}
 
{{ProbDistributions|multivariate}}
{{Use dmy dates|date=September 2010}}
 
{{DEFAULTSORT:Negative Multinomial Distribution}}
[[Category:Factorial and binomial topics]]
[[Category:Multivariate discrete distributions]]
[[Category:Probability distributions]]

Latest revision as of 15:44, 30 January 2013
