{{For|the video game|Perplexity (video game)}}{{For|the card/alternative reality game|Perplex City}}
{{Wiktionarypar|perplexity}}

In [[information theory]], '''perplexity''' is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models.

== Perplexity of a probability distribution ==
The perplexity of a discrete [[probability distribution]] ''p'' is defined as

:<math>2^{H(p)}=2^{-\sum_x p(x)\log_2 p(x)}</math>

where ''H''(''p'') is the [[entropy (information theory)|entropy]] of the distribution and ''x'' ranges over events.

The perplexity of a [[random variable]] ''X'' may be defined as the perplexity of the distribution over its possible values ''x''.

In the special case where ''p'' models a fair ''k''-sided die (a uniform distribution over ''k'' discrete events), its perplexity is ''k''. A random variable with perplexity ''k'' has the same uncertainty as a fair ''k''-sided die, and one is said to be "''k''-ways perplexed" about the value of the random variable. (Unless the distribution is uniform, more than ''k'' values will have nonzero probability, but the overall uncertainty is no greater, because some of those values have probability greater than 1/''k''.)

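For illustration, the following short sketch (in Python; the function and example distributions are chosen here for the example and are not from any standard library) computes the perplexity of a discrete distribution directly from the definition above:

<syntaxhighlight lang="python">
import math

def perplexity(probs):
    """Perplexity 2**H(p) of a discrete distribution, given its probabilities."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)  # H(p) in bits
    return 2 ** entropy

print(perplexity([1/6] * 6))        # fair 6-sided die: exactly 6 (up to rounding)
print(perplexity([0.7, 0.2, 0.1]))  # biased 3-event distribution: about 2.23
</syntaxhighlight>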
== Perplexity of a probability model ==

A model of an unknown probability distribution ''p'' may be proposed based on a training sample that was drawn from ''p''. Given a proposed probability model ''q'', one may evaluate ''q'' by asking how well it predicts a separate test sample ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x<sub>N</sub>'' also drawn from ''p''. The perplexity of the model ''q'' is defined as

:<math>2^{-\frac{1}{N}\sum_{i=1}^N \log_2 q(x_i)}</math>

Better models ''q'' of the unknown distribution ''p'' will tend to assign higher probabilities ''q''(''x<sub>i</sub>'') to the test events. Thus, they have lower perplexity: they are less surprised by the test sample.

The exponent above may be regarded as the average number of bits needed to represent a test event ''x<sub>i</sub>'' if one uses an optimal code based on ''q''. Low-perplexity models do a better job of compressing the test sample, requiring few bits per test element on average because ''q''(''x<sub>i</sub>'') tends to be high.

The exponent may also be regarded as a [[cross-entropy]],

:<math>H(\tilde{p},q) = -\sum_x \tilde{p}(x) \log_2 q(x)</math>

where <math>\tilde{p}</math> denotes the empirical distribution of the test sample (i.e., <math>\tilde{p}(x) = n/N</math> if ''x'' appeared ''n'' times in the test sample of size ''N'').

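As a minimal illustration of this definition (the model ''q'' and the test sample below are invented for the example), the perplexity of a model can be computed by averaging the log-probabilities it assigns to the test events:

<syntaxhighlight lang="python">
import math

def model_perplexity(q, test_sample):
    """Perplexity of model q (a dict mapping events to probabilities) on a test sample."""
    # Cross-entropy: average number of bits per event under an optimal code based on q
    cross_entropy = -sum(math.log2(q[x]) for x in test_sample) / len(test_sample)
    return 2 ** cross_entropy

q = {"heads": 0.7, "tails": 0.3}               # proposed model of a coin
sample = ["heads", "heads", "tails", "heads"]  # test sample drawn from the unknown p
print(model_perplexity(q, sample))             # about 1.77
</syntaxhighlight>

A model that assigned probability 1/2 to both outcomes would have perplexity exactly 2 on any test sample; here the skewed model is less surprised by the test sample and therefore scores lower.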
== Perplexity per word ==

In [[natural language processing]], perplexity is a way of evaluating [[language model]]s. A language model is a probability distribution over entire sentences or texts.

Using the definition of perplexity for a probability model, one might find, for example, that the average sentence ''x<sub>i</sub>'' in the test sample could be coded in 190 bits (i.e., the test sentences had an average log-probability of −190). This would give an enormous model perplexity of 2<sup>190</sup> per sentence. However, it is more common to normalize for sentence length and consider only the number of bits per word. Thus, if the test sample's sentences comprised a total of 1,000 words and could be coded using a total of 7,950 bits, one could report a model perplexity of 2<sup>7.95</sup> = 247 ''per word''. In other words, the model is as confused on the test data as if it had to choose uniformly and independently among 247 possibilities for each word.

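Concretely, per-word perplexity is obtained by averaging the number of bits per word over the test text, as in this sketch (illustrative Python; the per-word log-probabilities are made up to match the figures above):

<syntaxhighlight lang="python">
def per_word_perplexity(log2_probs):
    """Per-word perplexity from the log2-probability of each word of the test text."""
    bits_per_word = -sum(log2_probs) / len(log2_probs)  # average bits per word
    return 2 ** bits_per_word

# 1,000 words coded in 7,950 bits total, i.e. 7.95 bits per word on average:
log2_probs = [-7.95] * 1000  # hypothetical per-word log-probabilities
print(per_word_perplexity(log2_probs))  # about 247
</syntaxhighlight>
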
The lowest perplexity that has been published on the [[Brown Corpus]] (1 million words of American [[English language|English]] of varying topics and genres) as of 1992 is indeed about 247 per word, corresponding to a cross-entropy of log<sub>2</sub> 247 = 7.95 bits per word, or 1.75 bits per letter,<ref>{{cite journal |last=Brown |first=Peter F. |coauthors=et al. |date=March 1992 |title=An Estimate of an Upper Bound for the Entropy of English |journal=Computational Linguistics |volume=18 |issue=1 |url=http://acl.ldc.upenn.edu/J/J92/J92-1002.pdf |accessdate=2007-02-07}}</ref> using a [[N-gram|trigram]] model. It is often possible to achieve lower perplexity on more specialized [[text corpus|corpora]], as they are more predictable.
== References ==
<references />

[[Category:Entropy and information]]