Magic series: Difference between revisions
'''Winsorising''' or '''Winsorization''' (sometimes also called Georgization{{citation needed|date=November 2013}}) is the transformation of [[statistic]]s by limiting [[extreme value]]s in the [[statistics|statistical]] data to reduce the effect of possibly spurious [[outliers]]. It is named after the engineer-turned-biostatistician [[Charles P. Winsor]] (1895–1951). The effect is the same as [[clipping (signal processing)|clipping]] in signal processing.
The distribution of many [[statistic]]s can be heavily influenced by [[outlier]]s. A typical strategy is to set all outliers to a specified [[percentile]] of the data; for example, a 90% Winsorisation sets all data below the 5th percentile to the 5th percentile, and all data above the 95th percentile to the 95th percentile.
Winsorised [[estimator]]s are usually more [[robust statistics|robust]] to outliers than their more standard forms, although there are alternatives, such as [[Trimmed estimator|trimming]], that achieve a similar effect.
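The percentile-based strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names <code>percentile</code> and <code>winsorise</code> are hypothetical, and the linear-interpolation rule used to compute percentiles is only one of several common conventions, so results on small samples may differ from other software.

```python
def percentile(sorted_values, pct):
    """Percentile by linear interpolation between order statistics.

    Assumption: this is one of several common percentile conventions.
    """
    position = (len(sorted_values) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def winsorise(values, lower_pct=5.0, upper_pct=95.0):
    """Clamp every value to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    # Values inside the range are kept; values outside are set to the bound.
    return [min(max(v, low), high) for v in values]


# A 90% Winsorisation of the integers 0..100.
sample = list(range(101))
clipped = winsorise(sample)
print(min(clipped), max(clipped))
```

The number of observations is unchanged; only the values in the two tails are moved to the percentile bounds.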
== Example ==
Consider the data set consisting of:
:<math>\{92, 19, \mathbf{101}, 58, \mathbf{153}, 91, 26, 78, 10, 13, \mathbf{-40}, \mathbf{101}, 86, 85, 15, 89, 89, 25, \mathbf{2}, 41\} \qquad (N = 20)</math>
With 20 observations, each 5% tail holds a single value. The 5th percentile lies between -40 and 2, while the 95th percentile lies between 101 and 153. (The relevant values are shown in bold.)
A 90% Winsorisation therefore replaces -40 with 2 and 153 with 101, giving:
:<math>\{92, 19, \mathbf{101}, 58, \mathbf{101}, 91, 26, 78, 10, 13, \mathbf{2}, \mathbf{101}, 86, 85, 15, 89, 89, 25, \mathbf{2}, 41\} \qquad (N = 20)</math>
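The example can be reproduced with a short Python sketch. It assumes a count-based rule, in which the <code>k</code> smallest and <code>k</code> largest observations are replaced by their nearest remaining neighbours (here <code>k = 1</code>, since 5% of 20 is one value per tail). The function name <code>winsorise_by_count</code> is hypothetical and chosen only for illustration.

```python
def winsorise_by_count(values, k):
    """Replace the k smallest and k largest values by the nearest retained ones."""
    n = len(values)
    if 2 * k >= n:
        raise ValueError("k is too large for the sample size")
    ordered = sorted(values)
    low = ordered[k]           # smallest value that is kept unchanged
    high = ordered[n - k - 1]  # largest value that is kept unchanged
    # Preserve the original order of the observations.
    return [min(max(v, low), high) for v in values]


data = [92, 19, 101, 58, 153, 91, 26, 78, 10, 13,
        -40, 101, 86, 85, 15, 89, 89, 25, 2, 41]

winsorised = winsorise_by_count(data, k=1)
print(winsorised)
```

Only two entries change: 153 becomes 101 and -40 becomes 2, while the sample size stays at 20.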
== Distinction from trimming ==
Winsorising is not equivalent to simply excluding data, a simpler procedure called [[trimmed estimator|trimming]] or [[Truncation (statistics)|truncation]]; it is instead a method of [[Censoring (statistics)|censoring]] data.
In a trimmed estimator, the extreme values are ''discarded''; in a Winsorised estimator, the extreme values are instead ''replaced'' by certain percentiles (the trimmed minimum and maximum).
Thus a [[Winsorized mean]] is not the same as a [[truncated mean]].
For instance, the 10% trimmed mean is the average of the data between the 5th and 95th percentiles, while the 90% Winsorised mean sets the bottom 5% to the 5th percentile and the top 5% to the 95th percentile, and then averages all of the data. In the previous example the trimmed mean would be obtained from the smaller set:
:<math>\{92, 19, \mathbf{101}, 58, \quad 91, 26, 78, 10, 13, \quad \mathbf{101}, 86, 85, 15, 89, 89, 25, \mathbf{2}, 41\} \qquad (N = 18)</math>
More formally, they are distinct because the [[order statistics]] are not independent.
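The numerical difference between the two estimators can be checked on the example data. The sketch below again assumes the count-based rule with one value per tail; the variable names are illustrative only.

```python
from statistics import mean

data = [92, 19, 101, 58, 153, 91, 26, 78, 10, 13,
        -40, 101, 86, 85, 15, 89, 89, 25, 2, 41]
k = 1  # one observation per tail (5% of 20)

ordered = sorted(data)
n = len(ordered)

# Trimming discards the k smallest and k largest values (N drops to 18).
trimmed = ordered[k:n - k]

# Winsorising replaces them by the nearest retained values (N stays 20).
low, high = ordered[k], ordered[n - k - 1]
winsorised = [min(max(v, low), high) for v in data]

raw_mean = mean(data)
trimmed_mean = mean(trimmed)
winsorised_mean = mean(winsorised)

print(f"raw:        {raw_mean:.2f}")
print(f"trimmed:    {trimmed_mean:.2f}")
print(f"winsorised: {winsorised_mean:.2f}")
```

For this data set the raw mean is 56.7, the trimmed mean is about 56.72 and the Winsorised mean is 56.2, so the two procedures give different estimates even though they act on the same two extreme observations.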
== References ==
* Hastings, C., Mosteller, F., Tukey, J. W., Winsor, C. P. (1947). ''Low moments for small samples: a comparative study of order statistics'', [[Annals of Mathematical Statistics]], 18, 413–426.
* Dixon, W. J. (1960). ''Simplified Estimation from Censored Normal Samples'', The Annals of Mathematical Statistics, 31, 385–391.
* [[John Tukey|Tukey, J. W.]] (1962). ''The Future of Data Analysis'', The Annals of Mathematical Statistics, 33, p. 18.
[[Category:Statistical theory]]
[[Category:Robust statistics]]
{{Statistics-stub}}
Revision as of 19:24, 23 July 2013