{{Other uses|Resampling (disambiguation){{!}}Resampling}}
In [[statistics]], '''resampling''' is any of a variety of methods for doing one of the following:
# Estimating the precision of sample [[statistic]]s ([[median]]s, [[variance]]s, [[percentile]]s) by using subsets of available data ('''[[Jackknife (statistics)|jackknifing]]''') or drawing [[random]]ly with replacement from a set of data points ('''[[bootstrapping (statistics)|bootstrapping]]''')
# Exchanging labels on data points when performing [[significance test]]s ('''permutation tests''', also called [[exact test]]s, randomization tests, or re-randomization tests)
# Validating models by using random subsets (bootstrapping, [[Cross-validation (statistics)|cross-validation]])
Common resampling techniques include bootstrapping, jackknifing and permutation tests.
==Bootstrap==
{{main|Bootstrap (statistics)}}
Bootstrapping is a statistical method for estimating the [[sampling distribution]] of an [[estimator]] by [[sampling (statistics)|sampling]] with replacement from the original sample, most often with the purpose of deriving robust estimates of [[standard error]]s and [[confidence interval]]s of a population parameter like a [[mean]], [[median]], [[Proportionality (mathematics)|proportion]], [[odds ratio]], [[Pearson product-moment correlation coefficient|correlation coefficient]] or [[Regression analysis|regression]] coefficient. It may also be used for constructing hypothesis tests. It is often used as a robust alternative to inference based on parametric assumptions when those assumptions are in doubt, or where parametric inference is impossible or requires very complicated formulas for the calculation of standard errors.
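As a concrete illustration, the following minimal sketch (in Python, assuming the NumPy library; the data, statistic and number of replicates are arbitrary illustrative choices, not part of any standard) estimates the standard error of a sample mean by resampling with replacement:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=3.0, size=50)   # illustrative data

def bootstrap_replicates(data, statistic, n_boot=10000):
    """Recompute `statistic` on n_boot resamples drawn with replacement."""
    n = len(data)
    return np.array([statistic(rng.choice(data, size=n, replace=True))
                     for _ in range(n_boot)])

reps = bootstrap_replicates(sample, np.mean)
se = reps.std(ddof=1)                          # bootstrap standard error
lo, hi = np.percentile(reps, [2.5, 97.5])      # simple percentile 95% CI
print(se, (lo, hi))
</syntaxhighlight>
The spread of the replicates stands in for the sampling distribution of the estimator; here a simple percentile interval is read off directly from it.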
==Jackknife==
{{main|Jackknife (statistics)}}
Jackknifing, which is similar to bootstrapping, is used in [[statistical inference]] to estimate the bias and standard error (variance) of a statistic when a random sample of observations is used to calculate it. Historically, this method preceded the invention of the bootstrap: Quenouille invented it in 1949, and Tukey extended it in 1958.<ref name=Quenouille1949>Quenouille M (1949) Approximate tests of correlation in time series. J Roy Stat Soc Series B 11: 68–84</ref><ref name=Tukey1958>Tukey JW (1958) Bias and confidence in not quite large samples (abstract). Ann Math Stats 29: 614</ref> The method was foreshadowed by Mahalanobis, who in 1946 suggested repeated estimates of the statistic of interest with half the sample chosen at random,<ref name=Mahalanobis1946>Mahalanobis PC (1946). Recent experiments in statistical sampling in the Indian Statistical Institute. J Roy Stat Soc 109: 325–370</ref> a method for which he coined the name "interpenetrating samples".
Quenouille invented this method with the intention of reducing the bias of the sample estimate. Tukey extended it by assuming that if the replicates could be considered independently and identically distributed, then an estimate of the variance of the sample parameter could be made, and that it would be approximately distributed as a ''t'' variate with ''n'' − 1 degrees of freedom (''n'' being the sample size).
The basic idea behind the jackknife variance estimator lies in systematically recomputing the statistic estimate, leaving out one or more observations at a time from the sample set. From this new set of replicates of the statistic, an estimate for the bias and an estimate for the variance of the statistic can be calculated.
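A minimal sketch of the delete-1 recipe, again assuming NumPy, with the standard jackknife bias and variance formulas; the exponential sample is an arbitrary illustration:
<syntaxhighlight lang="python">
import numpy as np

def jackknife(data, statistic):
    """Delete-1 jackknife estimates of the bias and standard error of `statistic`."""
    data = np.asarray(data)
    n = len(data)
    # Recompute the statistic n times, leaving one observation out each time.
    replicates = np.array([statistic(np.delete(data, i)) for i in range(n)])
    theta_hat = statistic(data)
    theta_bar = replicates.mean()
    bias = (n - 1) * (theta_bar - theta_hat)
    se = np.sqrt((n - 1) / n * np.sum((replicates - theta_bar) ** 2))
    return bias, se

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=40)   # illustrative skewed sample
print(jackknife(x, np.mean))              # jackknife bias of the sample mean is zero
</syntaxhighlight>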
Instead of being applied to the variance directly, the jackknife may be applied to the log of the variance. This transformation may result in better estimates, particularly when the distribution of the variance itself is non-normal.
For many statistical parameters the jackknife estimate of variance tends asymptotically to the true value almost surely. In technical terms one says that the jackknife estimate is [[Consistency (statistics)|consistent]]. The jackknife is consistent for the sample [[mean]]s, sample [[variance]]s, central and non-central t-statistics (with possibly non-normal populations), sample [[coefficient of variation]], [[maximum likelihood estimator]]s, least squares estimators, [[correlation coefficient]]s and [[regression coefficient]]s.
It is not consistent for the sample [[median]]. In the case of a unimodal variate, the ratio of the jackknife variance to the sample variance tends to be distributed as one half the square of a chi-squared distribution with two [[degrees of freedom]].
The jackknife, like the original bootstrap, is dependent on the independence of the data. Extensions of the jackknife to allow for dependence in the data have been proposed.
Another extension is the delete-a-group method used in association with [[Poisson sampling]].
==Comparison of Bootstrap and Jackknife==
Both methods, the bootstrap and the jackknife, estimate the variability of a statistic from the variability of that statistic between subsamples, rather than from parametric assumptions. For the more general jackknife, the delete-m observations jackknife, the bootstrap can be seen as a random approximation of it. Both yield similar numerical results, which is why each can be seen as an approximation to the other. Although there are huge theoretical differences in their mathematical insights, the main practical difference for statistics users is that the [[Bootstrapping (statistics)|bootstrap]] gives different results when repeated on the same data, whereas the jackknife gives exactly the same result each time. Because of this, the jackknife is popular when the estimates need to be verified several times before publishing (e.g., official statistics agencies). On the other hand, when this verification feature is not crucial and it is of interest not to have a number but just an idea of its distribution, the bootstrap is preferred (e.g., studies in physics, economics, biological sciences).
Whether to use the bootstrap or the jackknife may depend more on operational aspects than on statistical concerns of a survey. The jackknife, originally used for bias reduction, is more of a specialized method and only estimates the variance of the point estimator. This can be enough for basic statistical inference (e.g., hypothesis testing, confidence intervals). The bootstrap, on the other hand, first estimates the whole distribution (of the point estimator) and then computes the variance from that. While powerful and easy, this can become highly computer intensive.
"The bootstrap can be applied to both variance and distribution estimation problems. However, the bootstrap variance estimator is not as good as the jackknife or the [[balanced repeated replication]] (BRR) variance estimator in terms of the empirical results. Furthermore, the bootstrap variance estimator usually requires more computations than the jackknife or the BRR. Thus, the bootstrap is mainly recommended for distribution estimation." <ref>Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, Inc. pp. 281.</ref> | |||
There is a special consideration with the jackknife, particularly with the delete-1 observation jackknife. It should only be used with smooth, differentiable statistics (e.g., totals, means, proportions, ratios, odd ratios, regression coefficients, etc.; not with medians or quantiles). This may become a practical disadvantage (or not, depending on the needs of the user). This disadvantage is usually the argument favoring bootstrapping over jackknifing. More general jackknifes than the delete-1, such as the delete-m jackknife, overcome this problem for the medians and quantiles by relaxing the smoothness requirements for consistent variance estimation. | |||
Usually the jackknife is easier to apply to complex sampling schemes than the bootstrap. Complex sampling schemes may involve stratification, multiple stages (clustering), varying sampling weights (non-response adjustments, calibration, post-stratification) and under unequal-probability sampling designs. Theoretical aspects of both the bootstrap and the jackknife can be found in Shao and Tu (1995),<ref>Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, Inc.</ref> whereas a basic introduction is accounted in Wolter (2007).<ref>Wolter, K.M. (2007). Introduction to Variance Estimation. Second Edition. Springer, Inc.</ref> | |||
==Cross-validation==
{{main|Cross-validation (statistics)}}
Cross-validation is a statistical method for validating a predictive model. Subsets of the data are held out for use as validating sets; a model is fit to the remaining data (a training set) and used to predict for the validation set. Averaging the quality of the predictions across the validation sets yields an overall measure of prediction accuracy.
One form of cross-validation leaves out a single observation at a time; this is similar to the jackknife. Another, K-fold cross-validation, splits the data into K subsets; each is held out in turn as the validation set.
This avoids "self-influence". For comparison, in [[regression analysis]] methods such as [[linear regression]], each y value draws the regression line toward itself, making the prediction of that value appear more accurate than it really is. Cross-validation applied to linear regression predicts the y value for each observation without using that observation.
This is often used for deciding how many predictor variables to use in regression. Without cross-validation, adding predictors always reduces the residual sum of squares (or possibly leaves it unchanged). In contrast, the cross-validated mean-square error will tend to decrease if valuable predictors are added, but increase if worthless predictors are added.{{Citation needed|date=January 2009}}
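A minimal sketch of K-fold cross-validation used to compare regression models, assuming NumPy; the fold count, data and candidate polynomial degrees are illustrative assumptions:
<syntaxhighlight lang="python">
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Cross-validated mean-squared error of a polynomial fit of given degree."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for held_out in np.array_split(idx, k):
        train = np.setdiff1d(idx, held_out)
        coef = np.polyfit(x[train], y[train], deg=degree)  # fit on K-1 folds
        pred = np.polyval(coef, x[held_out])               # predict the held-out fold
        errors.append(np.mean((y[held_out] - pred) ** 2))
    return np.mean(errors)

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 60)
y = 2.0 * x + rng.normal(0, 1, 60)          # truly linear relationship
for degree in (1, 2, 5):
    print(degree, kfold_mse(x, y, degree))  # extra terms do not reduce CV error
</syntaxhighlight>
On data with a truly linear relationship, the cross-validated error does not reward the higher-degree fits, illustrating the point about worthless predictors above.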
==Permutation tests==
<!-- [[Permutation test]] redirects to this section name -->
{{main|Exact test}}
A '''permutation test''' (also called a randomization test, re-randomization test, or an [[exact test]]) is a type of [[statistical hypothesis testing|statistical significance test]] in which the distribution of the test statistic under the null hypothesis is obtained by calculating all possible values of the [[test statistic]] under rearrangements of the labels on the observed data points. In other words, the method by which treatments are allocated to subjects in an experimental design is mirrored in the analysis of that design. If the labels are exchangeable under the null hypothesis, then the resulting tests yield exact significance levels; see also [[exchangeability]]. Confidence intervals can then be derived from the tests. The theory has evolved from the works of [[R.A. Fisher]] and [[E.J.G. Pitman]] in the 1930s.
To illustrate the basic idea of a permutation test, suppose we have two groups <math>A</math> and <math>B</math> whose sample means are <math>\bar{x}_{A}</math> and <math>\bar{x}_{B}</math>, and that we want to test, at the 5% significance level, whether they come from the same distribution. Let <math>n_{A}</math> and <math>n_{B}</math> be the sample size corresponding to each group. The permutation test is designed to determine whether the observed difference between the sample means is large enough to reject the null hypothesis <math>H_{0}</math> that the two groups have identical probability distributions.
The test proceeds as follows. First, the difference in means between the two samples is calculated: this is the observed value of the test statistic, T(obs). Then the observations of groups <math>A</math> and <math>B</math> are pooled.
Next, the difference in sample means is calculated and recorded for every possible way of dividing these pooled values into two groups of size <math>n_{A}</math> and <math>n_{B}</math> (i.e., for every permutation of the group labels A and B). The set of these calculated differences is the exact distribution of possible differences under the null hypothesis that group label does not matter.
The one-sided p-value of the test is calculated as the proportion of these permutations where the difference in means was greater than or equal to T(obs). The two-sided p-value of the test is calculated as the proportion of these permutations where the [[absolute difference]] was greater than or equal to ABS(T(obs)).
If the only purpose of the test is to reject or not reject the null hypothesis, we can as an alternative sort the recorded differences, and then observe whether T(obs) is contained within the middle 95% of them. If it is not, we reject the hypothesis of identical probability distributions at the 5% significance level.
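A minimal sketch of this exact procedure for two small groups, assuming NumPy; the data values are arbitrary illustrations, and the label assignments are enumerated with itertools.combinations:
<syntaxhighlight lang="python">
import numpy as np
from itertools import combinations

def permutation_test(a, b):
    """Exact one- and two-sided permutation p-values for a difference in means."""
    pooled = np.concatenate([a, b])
    n, n_a = len(pooled), len(a)
    t_obs = np.mean(a) - np.mean(b)
    total = pooled.sum()
    one_sided = two_sided = n_perm = 0
    # Every way of choosing which n_a pooled values receive label A.
    for idx in combinations(range(n), n_a):
        sum_a = pooled[list(idx)].sum()
        t = sum_a / n_a - (total - sum_a) / (n - n_a)
        one_sided += t >= t_obs
        two_sided += abs(t) >= abs(t_obs)
        n_perm += 1
    return one_sided / n_perm, two_sided / n_perm

a = np.array([19.8, 23.4, 21.1, 25.0, 22.3])
b = np.array([18.0, 17.2, 20.1, 19.5])
print(permutation_test(a, b))
</syntaxhighlight>
Because the observed labelling is itself one of the enumerated divisions, the resulting p-values can never be zero.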
===Relation to parametric tests===
Permutation tests are a subset of [[non-parametric statistics]]. The basic premise is to use only the assumption that it is possible that all of the treatment groups are equivalent, and that every member of them is the same before sampling began (i.e. the slot that they fill is not distinguishable from other slots before the slots are filled). From this, one can calculate a statistic and then see to what extent this statistic is special by seeing how likely it would be if the treatment assignments had been jumbled.
In contrast to permutation tests, the reference distributions for many popular [[classical statistics|"classical" statistical]] tests, such as the [[t-test]], [[F-test]], [[z-test]] and [[chi-squared test|''χ''<sup>2</sup> test]], are obtained from theoretical probability distributions.
[[Fisher's exact test]] is an example of a commonly used permutation test for evaluating the association between two dichotomous variables. When sample sizes are large, Pearson's chi-squared test will give accurate results. For small samples, the chi-squared reference distribution cannot be assumed to give a correct description of the probability distribution of the test statistic, and in this situation the use of Fisher's exact test becomes more appropriate. A rule of thumb is that the expected count in each cell of the table should be greater than 5 before Pearson's chi-squared test is used.{{Citation needed|date=August 2011}}
Permutation tests exist in many situations where parametric tests do not (e.g., when deriving an optimal test when losses are proportional to the size of an error rather than its square). All simple and many relatively complex parametric tests have a corresponding permutation test version that is defined by using the same test statistic as the parametric test, but obtains the p-value from the sample-specific permutation distribution of that statistic, rather than from the theoretical distribution derived from the parametric assumption. For example, it is possible in this manner to construct a permutation [[t-test]], a permutation [[chi-squared test]] of association, a permutation version of Aly's test for comparing variances and so on.
The major downsides to permutation tests are that they
* Can be computationally intensive and may require "custom" code for difficult-to-calculate statistics. This must be rewritten for every case.
* Are primarily used to provide a p-value. The inversion of the test to get confidence regions/intervals requires even more computation.
===Advantages===
Permutation tests exist for any test statistic, regardless of whether or not its distribution is known. Thus one is always free to choose the statistic which best discriminates between hypothesis and alternative and which minimizes losses.
Permutation tests can be used for analyzing unbalanced designs<ref>http://tbf.coe.wayne.edu/jmasm/vol1_no2.pdf</ref> and for combining dependent tests on mixtures of categorical, ordinal, and metric data (Pesarin, 2001). They can also be used to analyze qualitative data that has been quantitized (i.e., turned into numbers). Permutation tests may be ideal for analyzing quantitized data that do not satisfy the statistical assumptions underlying traditional parametric tests (e.g., t-tests, ANOVA) (Collingridge, 2013).
Before the 1980s, the burden of creating the reference distribution was overwhelming except for data sets with small sample sizes.
Since the 1980s, the confluence of relatively inexpensive fast computers and the development of sophisticated path algorithms applicable in special situations has made the application of permutation test methods practical for a wide range of problems. It also initiated the addition of exact-test options in the main statistical software packages and the appearance of specialized software for performing a wide range of uni- and multi-variable exact tests and computing test-based "exact" confidence intervals.
===Limitations===
An important assumption behind a permutation test is that the observations are exchangeable under the null hypothesis. An important consequence of this assumption is that tests of difference in location (like a permutation t-test) require equal variance. In this respect, the permutation t-test shares the same weakness as the classical Student's t-test (the [[Behrens–Fisher problem]]). A third alternative in this situation is to use a bootstrap-based test. Good (2005) explains the difference between permutation tests and bootstrap tests the following way: "Permutations test hypotheses concerning distributions; bootstraps test hypotheses concerning parameters. As a result, the bootstrap entails less-stringent assumptions." Of course, bootstrap tests are not exact.
===Monte Carlo testing===
An asymptotically equivalent permutation test can be created when there are too many possible orderings of the data to allow complete enumeration in a convenient manner. This is done by generating the reference distribution by [[Monte Carlo sampling]], which takes a small (relative to the total number of permutations) random sample of the possible replicates.
The realization that this could be applied to any permutation test on any dataset was an important breakthrough in the area of applied statistics. The earliest known reference to this approach is Dwass (1957).<ref>[[Meyer Dwass]], "Modified Randomization Tests for Nonparametric Hypotheses", ''[[The Annals of Mathematical Statistics]]'', 28:181–187, 1957.</ref>
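A minimal sketch of the Monte Carlo variant, reusing the two groups from the exact sketch above and again assuming NumPy; the number of random shuffles is an arbitrary illustrative choice:
<syntaxhighlight lang="python">
import numpy as np

def monte_carlo_permutation_test(a, b, n_perm=9999, seed=0):
    """Approximate two-sided permutation p-value from random relabellings."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([a, b])
    n_a = len(a)
    t_obs = abs(np.mean(a) - np.mean(b))
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)   # one random relabelling of the pooled values
        count += abs(pooled[:n_a].mean() - pooled[n_a:].mean()) >= t_obs
    # Counting the observed labelling keeps the estimated p-value above zero.
    return (count + 1) / (n_perm + 1)
</syntaxhighlight>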
This type of permutation test is known under various names: ''approximate permutation test'', ''Monte Carlo permutation tests'' or ''random permutation tests''.<ref>{{Cite journal
| author = [[Thomas E. Nichols]], [[Andrew P. Holmes]]
| url = http://www.fil.ion.ucl.ac.uk/spm/doc/papers/NicholsHolmes.pdf
| title = Nonparametric Permutation Tests For Functional Neuroimaging: A Primer with Examples
| journal = [[Human Brain Mapping]]
| volume = 15
| pages = 1–25
| year = 2001
| doi = 10.1002/hbm.1058
| pmid = 11747097
| issue = 1
}}</ref>
After <math>N</math> random permutations, it is possible to obtain a confidence interval for the p-value based on the binomial distribution. For example, if after <math>N = 10000</math> random permutations the p-value is estimated to be <math>\hat{p}=0.05</math>, then a 99% confidence interval for the true <math>p</math> (the one that would result from trying all possible permutations) is <math>[0.044, 0.056]</math>.
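One common way to obtain such an interval is the normal approximation to the binomial proportion (an assumption here; an exact binomial interval could be used instead). For the numbers above,
:<math>\hat{p} \pm z_{0.995}\sqrt{\frac{\hat{p}(1-\hat{p})}{N}} = 0.05 \pm 2.576\sqrt{\frac{0.05 \times 0.95}{10000}} \approx 0.05 \pm 0.0056,</math>
which gives the interval [0.044, 0.056].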
On the other hand, the purpose of estimating the p-value is most often to decide whether <math>p \leq \alpha</math>, where <math>\alpha</math> is the threshold at which the null hypothesis will be rejected (typically <math>\alpha=0.05</math>). In the example above, the confidence interval only tells us that there is roughly a 50% chance that the p-value is smaller than 0.05, i.e. it is completely unclear whether the null hypothesis should be rejected at a level <math>\alpha=0.05</math>.
If it is only important to know whether <math>p \leq \alpha</math> for a given <math>\alpha</math>, it is logical to continue simulating until the statement <math>p \leq \alpha</math> can be established to be true or false with a very low probability of error. Given a bound <math>\epsilon</math> on the admissible probability of error (the probability of finding that <math>\hat{p} > \alpha</math> when in fact <math>p \leq \alpha</math>, or vice versa), the question of how many permutations to generate can be seen as the question of when to stop generating permutations, based on the outcomes of the simulations so far, in order to guarantee that the conclusion (which is either <math>p \leq \alpha</math> or <math>p > \alpha</math>) is correct with probability at least as large as <math>1-\epsilon</math>. (<math>\epsilon</math> will typically be chosen to be extremely small, e.g. 1/1000.) Stopping rules to achieve this have been developed<ref>{{cite journal|last=Gandy|first=Axel|title=Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk|journal=Journal of the American Statistical Association|year=2009|volume=104|issue=488|pages=1504–1511}}</ref> which can be incorporated with minimal additional computational cost. In fact, depending on the true underlying p-value it will often be found that the number of simulations required is remarkably small (e.g. as low as 5 and often not larger than 100) before a decision can be reached with virtual certainty.
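As a rough illustration of the idea only (this conservative sketch is not the rule of Gandy (2009), which is far more efficient), one can check after each batch of permutations whether a Hoeffding confidence interval for <math>p</math> excludes <math>\alpha</math>, splitting the error budget <math>\epsilon</math> across the repeated looks so the total error probability stays below <math>\epsilon</math>; the batch size and the budget-splitting scheme are arbitrary assumptions:
<syntaxhighlight lang="python">
import numpy as np

def sequential_decision(is_extreme, alpha=0.05, eps=1e-3, batch=100, max_n=10**6):
    """Stop sampling permutations once p <= alpha or p > alpha is settled.

    `is_extreme()` draws one random permutation (e.g. one shuffle as in the
    Monte Carlo sketch above) and returns True if its statistic is at least
    as extreme as the observed one.  A union bound over the looks (budget
    eps/2**k at look k) keeps the overall error probability below eps.
    """
    hits = n = k = 0
    while n < max_n:
        hits += sum(is_extreme() for _ in range(batch))
        n += batch
        k += 1
        half_width = np.sqrt(np.log(2 / (eps / 2 ** k)) / (2 * n))  # Hoeffding bound
        p_hat = hits / n
        if p_hat + half_width <= alpha:
            return "p <= alpha", p_hat, n
        if p_hat - half_width > alpha:
            return "p > alpha", p_hat, n
    return "undecided", hits / n, n
</syntaxhighlight>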
==See also==
* [[Bootstrap aggregating|Bootstrap aggregating (Bagging)]]
* [[Particle filter]]
* [[Random permutation]]
* [[Monte Carlo methods]]
* [[Nonparametric statistics]]
==References==
{{reflist}}
*{{citation|last=Good|first=Phillip|authorlink=Phillip Good|year=2005|title=Permutation, Parametric and Bootstrap Tests of Hypotheses|edition=3rd|publisher=Springer}}
== Bibliography ==
===Introductory statistics===
*Good, P. (2005) ''Introduction to Statistics Through Resampling Methods and R/S-PLUS''. Wiley. ISBN 0-471-71575-1
*Good, P. (2005) ''Introduction to Statistics Through Resampling Methods and Microsoft Office Excel''. Wiley. ISBN 0-471-73191-9
* Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein (2005). ''Bootstrap Methods and Permutation Tests''.{{full|date=November 2012}}
* Wolter, K.M. (2007). ''Introduction to Variance Estimation''. Second Edition. Springer, Inc.
====Bootstrapping====
*[[Bradley Efron|Efron, Bradley]] (1979). [http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.aos/1176344552 "Bootstrap methods: Another look at the jackknife"], ''[[The Annals of Statistics]]'', 7, 1–26.
*[[Bradley Efron|Efron, Bradley]] (1981). "Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods", ''[[Biometrika]]'', 68, 589–599.
*[[Bradley Efron|Efron, Bradley]] (1982). ''The jackknife, the bootstrap, and other resampling plans'', In ''Society for Industrial and Applied Mathematics CBMS-NSF Monographs'', 38.
* [[Persi Diaconis|Diaconis, P.]]; [[Bradley Efron|Efron, Bradley]] (1983), "Computer-intensive methods in statistics," ''[[Scientific American]]'', May, 116–130.
*[[Bradley Efron|Efron, Bradley]]; Tibshirani, Robert J. (1993). ''An introduction to the bootstrap'', New York: [[Chapman & Hall]], [http://lib.stat.cmu.edu/S/bootstrap.funs software].
*Davison, A. C. and Hinkley, D. V. (1997): ''Bootstrap Methods and their Application'', [http://statwww.epfl.ch/davison/BMA/library.html software].
*Mooney, C Z & Duval, R D (1993). ''Bootstrapping. A Nonparametric Approach to Statistical Inference''. Sage University Paper series on Quantitative Applications in the Social Sciences, 07-095. Newbury Park, CA: [[SAGE Publications|Sage]].
* Simon, J. L. (1997): [http://www.resample.com/content/text/index.shtml Resampling: The New Statistics].
====Jackknife====
* Berger, Y.G. (2007). A jackknife variance estimator for unistage stratified samples with unequal probabilities. ''[[Biometrika]]''. Vol. 94, 4, pp. 953–964.
* Berger, Y.G. and Rao, J.N.K. (2006). Adjusted jackknife for imputation under unequal probability sampling without replacement. ''[[Journal of the Royal Statistical Society]]'' B. Vol. 68, 3, pp. 531–547.
* Berger, Y.G. and Skinner, C.J. (2005). A jackknife variance estimator for unequal probability sampling. ''[[Journal of the Royal Statistical Society]]'' B. Vol. 67, 1, pp. 79–89.
* Jiang, J., Lahiri, P. and Wan, S-M. (2002). A unified jackknife theory for empirical best prediction with M-estimation. ''[[The Annals of Statistics]]''. Vol. 30, 6, pp. 1782–1810.
* Jones, H.L. (1974). Jackknife estimation of functions of stratum means. ''[[Biometrika]]''. Vol. 61, 2, pp. 343–348.
* Kish, L. and Frankel M.R. (1974). Inference from complex samples. ''[[Journal of the Royal Statistical Society]]'' B. Vol. 36, 1, pp. 1–37.
* Krewski, D. and Rao, J.N.K. (1981). Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. ''[[The Annals of Statistics]]''. Vol. 9, 5, pp. 1010–1019.
* Quenouille, M.H. (1956). Notes on bias in estimation. ''[[Biometrika]]''. Vol. 43, pp. 353–360.
* Rao, J.N.K. and Shao, J. (1992). Jackknife variance estimation with survey data under hot deck imputation. ''[[Biometrika]]''. Vol. 79, 4, pp. 811–822.
* Rao, J.N.K., Wu, C.F.J. and Yue, K. (1992). Some recent work on resampling methods for complex surveys. ''[[Survey Methodology]]''. Vol. 18, 2, pp. 209–217.
* Shao, J. and Tu, D. (1995). ''The Jackknife and Bootstrap''. Springer-Verlag, Inc.
* [[John Wilder Tukey|Tukey, J.W.]] (1958). Bias and confidence in not-quite large samples (abstract). ''[[The Annals of Mathematical Statistics]]''. Vol. 29, 2, p. 614.
* [[C.F. Jeff Wu|Wu, C.F.J.]] (1986). Jackknife, Bootstrap and other resampling methods in regression analysis. ''[[The Annals of Statistics]]''. Vol. 14, 4, pp. 1261–1295.
===Monte Carlo methods===
*George S. Fishman (1995). ''Monte Carlo: Concepts, Algorithms, and Applications'', Springer, New York. ISBN 0-387-94527-X.
*James E. Gentle (2009). ''Computational Statistics'', Springer, New York. Part III: Methods of Computational Statistics. ISBN 978-0-387-98143-7.
*Dirk P. Kroese, Thomas Taimre and Zdravko I. Botev. ''Handbook of Monte Carlo Methods'', John Wiley & Sons, New York. ISBN 978-0-470-17793-8.
*Christian P. Robert and George Casella (2004). ''Monte Carlo Statistical Methods'', Second ed., Springer, New York. ISBN 0-387-21239-6.
*[[Shlomo Sawilowsky]] and Gail Fahoome (2003). ''Statistics via Monte Carlo Simulation with Fortran''. Rochester Hills, MI: JMASM. ISBN 0-9740236-0-4.
====Permutation test====
Original references:
*[[R. A. Fisher|Fisher, R.A.]] (1935) ''[[The Design of Experiments]]'', New York: [[Hafner]]
*[[E. J. G. Pitman|Pitman, E. J. G.]] (1937) "Significance tests which may be applied to samples from any population", ''Royal Statistical Society Supplement'', 4: 119–130 and 225–232 (parts I and II). {{jstor|2984124}} {{jstor|2983647}}
*[[E. J. G. Pitman|Pitman, E. J. G.]] (1938) "Significance tests which may be applied to samples from any population. Part III. The analysis of variance test", ''[[Biometrika]]'', 29 (3–4): 322–335. {{doi|10.1093/biomet/29.3-4.322}}
Modern references:
*Collingridge, D.S. (2013). "A primer on quantitized data analysis and permutation testing", ''Journal of Mixed Methods Research'', 7(1), 79–95.
*Edgington, E.S. (1995) ''Randomization Tests'', 3rd ed. New York: [[Marcel-Dekker]]
*Good, Phillip I. (2005) ''Permutation, Parametric and Bootstrap Tests of Hypotheses'', 3rd ed., [[Springer Science+Business Media|Springer]] ISBN 0-387-98898-X
*Good, P. (2002) "Extensions of the concept of exchangeability and their applications", ''Journal of Modern Applied Statistical Methods'', 1:243–247.
*Lunneborg, Cliff (1999) ''Data Analysis by Resampling'', Duxbury Press. ISBN 0-534-22110-6.
*Pesarin, F. (2001). ''Multivariate Permutation Tests: With Applications in Biostatistics'', [[John Wiley & Sons]]. ISBN 978-0471496700
*Welch, W. J. (1990) "Construction of permutation tests", ''[[Journal of the American Statistical Association]]'', 85:693–698.
Computational methods:
*Mehta, C. R.; Patel, N. R. (1983). "A network algorithm for performing Fisher's exact test in r × c contingency tables", ''[[Journal of the American Statistical Association]]'', 78(382):427–434.
*Mehta, C. R.; Patel, N. R.; Senchaudhuri, P. (1988). "Importance sampling for estimating exact probabilities in permutational inference", ''[[Journal of the American Statistical Association]]'', 83(404):999–1005.
*Gill, P. M. W. (2007). "Efficient calculation of p-values in linear-statistic permutation significance tests", ''Journal of Statistical Computation and Simulation'', 77(1):55–61. {{doi|10.1080/10629360500108053}}
===Resampling methods===
* Good, P. (2006) ''Resampling Methods''. 3rd Ed. Birkhauser.
* Wolter, K.M. (2007). ''Introduction to Variance Estimation''. 2nd Edition. Springer, Inc.
==External links==
===Current research on permutation tests===
*[http://people.revoledu.com/kardi/tutorial/Bootstrap/index.html Bootstrap Sampling tutorial]
* Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein (2005): [http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf Bootstrap Methods and Permutation Tests], [http://www.insightful.com/Hesterberg/bootstrap software].
* Moore, D. S., G. McCabe, W. Duckworth, and S. Sclove (2003): [http://bcs.whfreeman.com/pbs/cat_140/chap18.pdf Bootstrap Methods and Permutation Tests]
* Simon, J. L. (1997): [http://www.resample.com/content/text/index.shtml Resampling: The New Statistics].
* Yu, Chong Ho (2003): [http://PAREonline.net/getvn.asp?v=8&n=19 Resampling methods: concepts, applications, and justification. Practical Assessment, Research & Evaluation, 8(19)]. ''(statistical bootstrapping)''
* [http://www.ericdigests.org/1993/marriage.htm Resampling: A Marriage of Computers and Statistics (ERIC Digests)]
===Software===
* [http://cran.at.r-project.org/web/packages/boot/index.html Angelo Canty and Brian Ripley (2010). '''boot''': Bootstrap R (S-Plus) Functions. R package version 1.2-43.] Functions and datasets for bootstrapping from the book ''Bootstrap Methods and Their Applications'' by A. C. Davison and D. V. Hinkley (1997, CUP).
* [http://www.statistics101.net Statistics101: Resampling, Bootstrap, Monte Carlo Simulation program]
* [http://cran.r-project.org/web/packages/samplingVarEst R package `samplingVarEst': Sampling Variance Estimation. Implements functions for estimating the sampling variance of some point estimators.]
* [http://www.mansci.uwaterloo.ca/~msmucker/software.html Paired randomization/permutation test for evaluation of TREC results]
* [https://github.com/searchivarius/PermTest Randomization/permutation tests to evaluate outcomes in information retrieval experiments (with and without adjustments for multiple comparisons).]
* [http://www.bioconductor.org/packages/release/bioc/html/multtest.html Bioconductor resampling-based multiple hypothesis testing with applications to genomics.]
* [http://cran.r-project.org/web/packages/permtest/index.html permtest: an R package to compare the variability within and distance between two groups within a set of microarray data.]
{{DEFAULTSORT:Resampling (Statistics)}}
[[Category:Monte Carlo methods]]
[[Category:Statistical inference]]
[[Category:Resampling (statistics)| ]]
[[Category:Non-parametric statistics]]
Latest revision as of 04:35, 15 March 2013
I'm Fernando (21) from Seltjarnarnes, Iceland.
I'm learning Norwegian literature at a local college and I'm just about to graduate.
I have a part time job in a the office.
my site; wellness [continue reading this..]
In statistics, resampling is any of a variety of methods for doing one of the following:
- Estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknifing) or drawing randomly with replacement from a set of data points (bootstrapping)
- Exchanging labels on data points when performing significance tests (permutation tests, also called exact tests, randomization tests, or re-randomization tests)
- Validating models by using random subsets (bootstrapping, cross validation)
Common resampling techniques include bootstrapping, jackknifing and permutation tests.
Bootstrap
Mining Engineer (Excluding Oil ) Truman from Alma, loves to spend time knotting, largest property developers in singapore developers in singapore and stamp collecting. Recently had a family visit to Urnes Stave Church. Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean, median, proportion, odds ratio, correlation coefficient or regression coefficient. It may also be used for constructing hypothesis tests. It is often used as a robust alternative to inference based on parametric assumptions when those assumptions are in doubt, or where parametric inference is impossible or requires very complicated formulas for the calculation of standard errors.
Jackknife
Mining Engineer (Excluding Oil ) Truman from Alma, loves to spend time knotting, largest property developers in singapore developers in singapore and stamp collecting. Recently had a family visit to Urnes Stave Church. Jackknifing, which is similar to bootstrapping, is used in statistical inference to estimate the bias and standard error (variance) of a statistic, when a random sample of observations is used to calculate it. Historically this method preceded the invention of the bootstrap with Quenouille inventing this method in 1949 and Tukey extending it in 1958.[1][2] This method was foreshadowed by Mahalanobis who in 1946 suggested repeated estimates of the statistic of interest with half the sample chosen at random. [3] He coined the name 'interpenetrating samples' for this method.
Quenouille invented this method with the intention of reducing the bias of the sample estimate. Tukey extended this method by assuming that if the replicates could be considered identically and independently distributed, then an estimate of the variance of the sample parameter could be made and that it would be approximately distributed as a t variate with n - 1 degrees of freedom (n being the sample size).
The basic idea behind the jackknife variance estimator lies in systematically recomputing the statistic estimate, leaving out one or more observations at a time from the sample set. From this new set of replicates of the statistic, an estimate for the bias and an estimate for the variance of the statistic can be calculated.
Instead of using the jackknife to estimate the variance, it may instead be applied to the log of the variance. This transformation may result in better estimates particularly when the distribution of the variance itself may be non normal.
For many statistical parameters the jackknife estimate of variance tends asymptotically to the true value almost surely. In technical terms one says that the jackknife estimate is consistent. The jackknife is consistent for the sample means, sample variances, central and non-central t-statistics (with possibly non-normal populations), sample coefficient of variation, maximum likelihood estimators, least squares estimators, correlation coefficients and regression coefficients.
It is not consistent for the sample median. In the case of a unimodal variate the ratio of the jackknife variance to the sample variance tends to be distributed as one half the square of a chi square distribution with two degrees of freedom.
The jackknife, like the original bootstrap, is dependent on the independence of the data. Extensions of the jackknife to allow for dependence in the data have been proposed.
Another extension is the delete a group method used in association with Poisson sampling.
Comparison of Bootstrap and Jackknife
Both methods, the bootstrap and the jackknife, estimate the variability of a statistic from the variability of that statistic between subsamples, rather than from parametric assumptions. For the more general jackknife, the delete-m observations jackknife, the bootstrap can be seen as a random approximation of it. Both yield similar numerical results, which is why each can be seen as approximation to the other. Although there are huge theoretical differences in their mathematical insights, the main practical difference for statistics users is that the bootstrap gives different results when repeated on the same data, whereas the jackknife gives exactly the same result each time. Because of this, the jackknife is popular when the estimates need to be verified several times before publishing (e.g., official statistics agencies). On the other hand, when this verification feature is not crucial and it is of interest not to have a number but just an idea of its distribution, the bootstrap is preferred (e.g., studies in physics, economics, biological sciences).
Whether to use the bootstrap or the jackknife may depend more on operational aspects than on statistical concerns of a survey. The jackknife, originally used for bias reduction, is more of a specialized method and only estimates the variance of the point estimator. This can be enough for basic statistical inference (e.g., hypothesis testing, confidence intervals). The bootstrap, on the other hand, first estimates the whole distribution (of the point estimator) and then computes the variance from that. While powerful and easy, this can become highly computer intensive.
"The bootstrap can be applied to both variance and distribution estimation problems. However, the bootstrap variance estimator is not as good as the jackknife or the balanced repeated replication (BRR) variance estimator in terms of the empirical results. Furthermore, the bootstrap variance estimator usually requires more computations than the jackknife or the BRR. Thus, the bootstrap is mainly recommended for distribution estimation." [4]
There is a special consideration with the jackknife, particularly with the delete-1 observation jackknife. It should only be used with smooth, differentiable statistics (e.g., totals, means, proportions, ratios, odd ratios, regression coefficients, etc.; not with medians or quantiles). This may become a practical disadvantage (or not, depending on the needs of the user). This disadvantage is usually the argument favoring bootstrapping over jackknifing. More general jackknifes than the delete-1, such as the delete-m jackknife, overcome this problem for the medians and quantiles by relaxing the smoothness requirements for consistent variance estimation.
Usually the jackknife is easier to apply to complex sampling schemes than the bootstrap. Complex sampling schemes may involve stratification, multiple stages (clustering), varying sampling weights (non-response adjustments, calibration, post-stratification) and under unequal-probability sampling designs. Theoretical aspects of both the bootstrap and the jackknife can be found in Shao and Tu (1995),[5] whereas a basic introduction is accounted in Wolter (2007).[6]
Cross-validation
Mining Engineer (Excluding Oil ) Truman from Alma, loves to spend time knotting, largest property developers in singapore developers in singapore and stamp collecting. Recently had a family visit to Urnes Stave Church. Cross-validation is a statistical method for validating a predictive model. Subsets of the data are held out for use as validating sets; a model is fit to the remaining data (a training set) and used to predict for the validation set. Averaging the quality of the predictions across the validation sets yields an overall measure of prediction accuracy.
One form of cross-validation leaves out a single observation at a time; this is similar to the jackknife. Another, K-fold cross-validation, splits the data into K subsets; each is held out in turn as the validation set.
This avoids "self-influence". For comparison, in regression analysis methods such as linear regression, each y value draws the regression line toward itself, making the prediction of that value appear more accurate than it really is. Cross-validation applied to linear regression predicts the y value for each observation without using that observation.
This is often used for deciding how many predictor variables to use in regression. Without cross-validation, adding predictors always reduces the residual sum of squares (or possibly leaves it unchanged). In contrast, the cross-validated mean-square error will tend to decrease if valuable predictors are added, but increase if worthless predictors are added.Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park.
Permutation tests
Mining Engineer (Excluding Oil ) Truman from Alma, loves to spend time knotting, largest property developers in singapore developers in singapore and stamp collecting. Recently had a family visit to Urnes Stave Church. A permutation test (also called a randomization test, re-randomization test, or an exact test) is a type of statistical significance test in which the distribution of the test statistic under the null hypothesis is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. In other words, the method by which treatments are allocated to subjects in an experimental design is mirrored in the analysis of that design. If the labels are exchangeable under the null hypothesis, then the resulting tests yield exact significance levels; see also exchangeability. Confidence intervals can then be derived from the tests. The theory has evolved from the works of R.A. Fisher and E.J.G. Pitman in the 1930s.
To illustrate the basic idea of a permutation test, suppose we have two groups and whose sample means are and , and that we want to test, at 5% significance level, whether they come from the same distribution. Let and be the sample size corresponding to each group. The permutation test is designed to determine whether the observed difference between the sample means is large enough to reject the null hypothesis H that the two groups have identical probability distribution.
The test proceeds as follows. First, the difference in means between the two samples is calculated: this is the observed value of the test statistic, T(obs). Then the observations of groups and are pooled.
Next, the difference in sample means is calculated and recorded for every possible way of dividing these pooled values into two groups of size and (i.e., for every permutation of the group labels A and B). The set of these calculated differences is the exact distribution of possible differences under the null hypothesis that group label does not matter.
The one-sided p-value of the test is calculated as the proportion of sampled permutations where the difference in means was greater than or equal to T(obs). The two-sided p-value of the test is calculated as the proportion of sampled permutations where the absolute difference was greater than or equal to ABS(T(obs)).
If the only purpose of the test is reject or not reject the null hypothesis, we can as an alternative sort the recorded differences, and then observe if T(obs) is contained within the middle 95% of them. If it is not, we reject the hypothesis of identical probability curves at the 5% significance level.
Relation to parametric tests
Permutation tests are a subset of non-parametric statistics. The basic premise is to use only the assumption that it is possible that all of the treatment groups are equivalent, and that every member of them is the same before sampling began (i.e. the slot that they fill is not differentiable from other slots before the slots are filled). From this, one can calculate a statistic and then see to what extent this statistic is special by seeing how likely it would be if the treatment assignments had been jumbled.
In contrast to permutation tests, the reference distributions for many popular "classical" statistical tests, such as the t-test, F-test, z-test and χ2 test, are obtained from theoretical probability distributions. Fisher's exact test is an example of a commonly used permutation test for evaluating the association between two dichotomous variables. When sample sizes are large, the Pearson's chi-square test will give accurate results. For small samples, the chi-square reference distribution cannot be assumed to give a correct description of the probability distribution of the test statistic, and in this situation the use of Fisher's exact test becomes more appropriate. A rule of thumb is that the expected count in each cell of the table should be greater than 5 before Pearson's chi-squared test is used.Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park.
Permutation tests exist in many situations where parametric tests do not (e.g., when deriving an optimal test when losses are proportional to the size of an error rather than its square). All simple and many relatively complex parametric tests have a corresponding permutation test version that is defined by using the same test statistic as the parametric test, but obtains the p-value from the sample-specific permutation distribution of that statistic, rather than from the theoretical distribution derived from the parametric assumption. For example, it is possible in this manner to construct a permutation t-test, a permutation chi-squared test of association, a permutation version of Aly's test for comparing variances and so on.
The major down-side to permutation tests are that they
- Can be computationally intensive and may require "custom" code for difficult-to-calculate statistics. This must be rewritten for every case.
- Are primarily used to provide a p-value. The inversion of the test to get confidence regions/intervals requires even more computation.
Advantages
Permutation tests exist for any test statistic, regardless of whether or not its distribution is known. Thus one is always free to choose the statistic which best discriminates between hypothesis and alternative and which minimizes losses.
Permutation tests can be used for analyzing unbalanced designs [7] and for combining dependent tests on mixtures of categorical, ordinal, and metric data (Pesarin, 2001). They can also be used to analyze qualitative data that has been quantitized (i.e., turned into numbers). Permutation tests may be ideal for analyzing quantitized data that do not satisfy statistical assumptions underlying traditional parametric tests (e.g., t-tests, ANOVA) (Collingridge, 2013).
Before the 1980s, the burden of creating the reference distribution was overwhelming except for data sets with small sample sizes.
Since the 1980s, the confluence of relatively inexpensive fast computers and the development of new sophisticated path algorithms applicable in special situations, made the application of permutation test methods practical for a wide range of problems. It also initiated the addition of exact-test options in the main statistical software packages and the appearance of specialized software for performing a wide range of uni- and multi-variable exact tests and computing test-based "exact" confidence intervals.
Limitations
An important assumption behind a permutation test is that the observations are exchangeable under the null hypothesis. An important consequence of this assumption is that tests of difference in location (like a permutation t-test) require equal variance. In this respect, the permutation t-test shares the same weakness as the classical Student's t-test (the Behrens–Fisher problem). A third alternative in this situation is to use a bootstrap-based test. Good (2005) explains the difference between permutation tests and bootstrap tests the following way: "Permutations test hypotheses concerning distributions; bootstraps test hypotheses concerning parameters. As a result, the bootstrap entails less-stringent assumptions." Of course, bootstrap tests are not exact.
Monte Carlo testing
An asymptotically equivalent permutation test can be created when there are too many possible orderings of the data to allow complete enumeration in a convenient manner. This is done by generating the reference distribution by Monte Carlo sampling, which takes a small (relative to the total number of permutations) random sample of the possible replicates. The realization that this could be applied to any permutation test on any dataset was an important breakthrough in the area of applied statistics. The earliest known reference to this approach is Dwass (1957).[8] This type of permutation test is known under various names: approximate permutation test, Monte Carlo permutation tests or random permutation tests.[9]
After random permutations, it is possible to obtain a confidence interval for the p-value based on the Binomial distribution. For example, if after random permutations the p-value is estimated to be , then a 99% confidence interval for the true (the one that would result from trying all possible permutations) is .
On the other hand, the purpose of estimating the p-value is most often to decide whether , where is the threshold at which the null hypothesis will be rejected (typically ). In the example above, the confidence interval only tells us that there is roughly a 50% chance that the p-value is smaller than 0.05, i.e. it is completely unclear whether the null hypothesis should be rejected at a level .
If it is only important to know whether for a given , it is logical to continue simulating until the statement can be established to be true or false with a very low probability of error. Given a bound on the admissible probability of error (the probability of finding that when in fact or vice versa), the question of how many permutations to generate can be seen as the question of when to stop generating permutations, based on the outcomes of the simulations so far, in order to guarantee that the conclusion (which is either or ) is correct with probability at least as large as . ( will typically be chosen to be extremely small, e.g. 1/1000.) Stopping rules to achieve this have been developed[10] which can be incorporated with minimal additional computational cost. In fact, depending on the true underlying p-value it will often be found that the number of simulations required is remarkably small (e.g. as low as 5 and often not larger than 100) before a decision can be reached with virtual certainty.
See also
- Bootstrap aggregating (Bagging)
- Particle filter
- Random permutation
- Monte Carlo methods
- Nonparametric statistics
References
43 year old Petroleum Engineer Harry from Deep River, usually spends time with hobbies and interests like renting movies, property developers in singapore new condominium and vehicle racing. Constantly enjoys going to destinations like Camino Real de Tierra Adentro.
- Many property agents need to declare for the PIC grant in Singapore. However, not all of them know find out how to do the correct process for getting this PIC scheme from the IRAS. There are a number of steps that you need to do before your software can be approved.
Naturally, you will have to pay a safety deposit and that is usually one month rent for annually of the settlement. That is the place your good religion deposit will likely be taken into account and will kind part or all of your security deposit. Anticipate to have a proportionate amount deducted out of your deposit if something is discovered to be damaged if you move out. It's best to you'll want to test the inventory drawn up by the owner, which can detail all objects in the property and their condition. If you happen to fail to notice any harm not already mentioned within the inventory before transferring in, you danger having to pay for it yourself.
In case you are in search of an actual estate or Singapore property agent on-line, you simply should belief your intuition. It's because you do not know which agent is nice and which agent will not be. Carry out research on several brokers by looking out the internet. As soon as if you end up positive that a selected agent is dependable and reliable, you can choose to utilize his partnerise in finding you a home in Singapore. Most of the time, a property agent is taken into account to be good if he or she locations the contact data on his website. This may mean that the agent does not mind you calling them and asking them any questions relating to new properties in singapore in Singapore. After chatting with them you too can see them in their office after taking an appointment.
Have handed an trade examination i.e Widespread Examination for House Brokers (CEHA) or Actual Property Agency (REA) examination, or equal; Exclusive brokers are extra keen to share listing information thus making certain the widest doable coverage inside the real estate community via Multiple Listings and Networking. Accepting a severe provide is simpler since your agent is totally conscious of all advertising activity related with your property. This reduces your having to check with a number of agents for some other offers. Price control is easily achieved. Paint work in good restore-discuss with your Property Marketing consultant if main works are still to be done. Softening in residential property prices proceed, led by 2.8 per cent decline within the index for Remainder of Central Region
Once you place down the one per cent choice price to carry down a non-public property, it's important to accept its situation as it is whenever you move in – faulty air-con, choked rest room and all. Get round this by asking your agent to incorporate a ultimate inspection clause within the possibility-to-buy letter. HDB flat patrons routinely take pleasure in this security net. "There's a ultimate inspection of the property two days before the completion of all HDB transactions. If the air-con is defective, you can request the seller to repair it," says Kelvin.
15.6.1 As the agent is an intermediary, generally, as soon as the principal and third party are introduced right into a contractual relationship, the agent drops out of the image, subject to any problems with remuneration or indemnification that he could have against the principal, and extra exceptionally, against the third occasion. Generally, agents are entitled to be indemnified for all liabilities reasonably incurred within the execution of the brokers´ authority.
To achieve the very best outcomes, you must be always updated on market situations, including past transaction information and reliable projections. You could review and examine comparable homes that are currently available in the market, especially these which have been sold or not bought up to now six months. You'll be able to see a pattern of such report by clicking here It's essential to defend yourself in opposition to unscrupulous patrons. They are often very skilled in using highly unethical and manipulative techniques to try and lure you into a lure. That you must also protect your self, your loved ones, and personal belongings as you'll be serving many strangers in your home. Sign a listing itemizing of all of the objects provided by the proprietor, together with their situation. HSR Prime Recruiter 2010
Bibliography
Introductory statistics
- Good, P. (2005) Introduction to Statistics Through Resampling Methods and R/S-PLUS. Wiley. ISBN 0-471-71575-1
- Good, P. (2005) Introduction to Statistics Through Resampling Methods and Microsoft Office Excel. Wiley. ISBN 0-471-73191-9
- Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein (2005). Bootstrap Methods and Permutation Tests.
- Wolter, K.M. (2007). Introduction to Variance Estimation. Second Edition. Springer, Inc.
Bootstrapping
- Efron, Bradley (1979). "Bootstrap methods: Another look at the jackknife", The Annals of Statistics, 7, 1-26.
- Efron, Bradley (1981). "Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods", Biometrika, 68, 589-599.
- Efron, Bradley (1982). The jackknife, the bootstrap, and other resampling plans, In Society of Industrial and Applied Mathematics CBMS-NSF Monographs, 38.
- Diaconis, P.; Efron, Bradley (1983), "Computer-intensive methods in statistics," Scientific American, May, 116-130.
- Efron, Bradley; Tibshirani, Robert J. (1993). An introduction to the bootstrap, New York: Chapman & Hall, software.
- Davison, A. C. and Hinkley, D. V. (1997): Bootstrap Methods and their Application, software.
- Mooney, C Z & Duval, R D (1993). Bootstrapping. A Nonparametric Approach to Statistical Inference. Sage University Paper series on Quantitative Applications in the Social Sciences, 07-095. Newbury Park, CA: Sage.
- Simon, J. L. (1997): Resampling: The New Statistics.
Jackknife
- Berger, Y.G. (2007). A jackknife variance estimator for unistage stratified samples with unequal probabilities. Biometrika. Vol. 94, 4, pp. 953–964.
- Berger, Y.G. and Rao, J.N.K. (2006). Adjusted jackknife for imputation under unequal probability sampling without replacement. Journal of the Royal Statistical Society B. Vol. 68, 3, pp. 531–547.
- Berger, Y.G. and Skinner, C.J. (2005). A jackknife variance estimator for unequal probability sampling. Journal of the Royal Statistical Society B. Vol. 67, 1, pp. 79–89.
- Jiang, J., Lahiri, P. and Wan, S-M. (2002). A unified jackknife theory for empirical best prediction with M-estimation. The Annals of Statistics. Vol. 30, 6, pp. 1782–1810.
- Jones, H.L. (1974). Jackknife estimation of functions of stratum means. Biometrika. Vol. 61, 2, pp. 343–348.
- Kish, L. and Frankel M.R. (1974). Inference from complex samples. Journal of the Royal Statistical Society B. Vol. 36, 1, pp. 1–37.
- Krewski, D. and Rao, J.N.K. (1981). Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. The Annals of Statistics. Vol. 9, 5, pp. 1010–1019.
- Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika. Vol. 43, pp. 353–360.
- Rao, J.N.K. and Shao, J. (1992). Jackknife variance estimation with survey data under hot deck imputation. Biometrika. Vol. 79, 4, pp. 811–822.
- Rao, J.N.K., Wu, C.F.J. and Yue, K. (1992). Some recent work on resampling methods for complex surveys. Survey Methodology. Vol. 18, 2, pp. 209–217.
- Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, Inc.
- Tukey, J.W. (1958). Bias and confidence in not-quite large samples (abstract). The Annals of Mathematical Statistics. Vol. 29, 2, p. 614.
- Wu, C.F.J. (1986). Jackknife, Bootstrap and other resampling methods in regression analysis. The Annals of Statistics. Vol. 14, 4, pp. 1261–1295.
Monte Carlo methods
- George S. Fishman (1995). Monte Carlo: Concepts, Algorithms, and Applications, Springer, New York. ISBN 0-387-94527-X.
- James E. Gentle (2009). Computational Statistics, Springer, New York. Part III: Methods of Computational Statistics. ISBN 978-0-387-98143-7.
- Dirk P. Kroese, Thomas Taimre and Zdravko I. Botev. Handbook of Monte Carlo Methods, John Wiley & Sons, New York. ISBN 978-0-470-17793-8.
- Christian P. Robert and George Casella (2004). Monte Carlo Statistical Methods, Second ed., Springer, New York. ISBN 0-387-21239-6.
- Shlomo Sawilowsky and Gail Fahoome (2003). Statistics via Monte Carlo Simulation with Fortran. Rochester Hills, MI: JMASM. ISBN 0-9740236-0-4.
Permutation test
Original references:
- Fisher, R.A. (1935) The Design of Experiments, New York: Hafner
- Pitman, E. J. G. (1937) "Significance tests which may be applied to samples from any population", Royal Statistical Society Supplement, 4: 119-130 and 225-232 (parts I and II).
- Pitman, E. J. G. (1938) "Significance tests which may be applied to samples from any population. Part III. The analysis of variance test", Biometrika, 29 (3-4): 322-335.
Modern references:
- Collingridge, D.S. (2013). A Primer on Quantitized Data Analysis and Permutation Testing. Journal of Mixed Methods Research, 7(1), 79-95.
- Edgington, E.S. (1995) Randomization tests, 3rd ed. New York: Marcel-Dekker
- Good, Phillip I. (2005) Permutation, Parametric and Bootstrap Tests of Hypotheses, 3rd ed., Springer ISBN 0-387-98898-X
- Good, P. (2002) "Extensions of the concept of exchangeability and their applications", J. Modern Appl. Statist. Methods, 1:243-247.
- Lunneborg, Cliff. (1999) Data Analysis by Resampling, Duxbury Press. ISBN 0-534-22110-6.
- Pesarin, F. (2001). Multivariate Permutation Tests : With Applications in Biostatistics, John Wiley & Sons. ISBN 978-0471496700
- Welch, W. J. (1990) "Construction of permutation tests", Journal of the American Statistical Association, 85:693-698.
Computational methods:
- Mehta, C. R.; Patel, N. R. (1983). "A network algorithm for performing Fisher's exact test in r x c contingency tables", Journal of the American Statistical Association, 78(382):427–434.
- Mehta, C. R.; Patel, N. R.; Senchaudhuri, P. (1988). "Importance sampling for estimating exact probabilities in permutational inference", Journal of the American Statistical Association, 83(404):999–1005.
- Gill, P. M. W. (2007). "Efficient calculation of p-values in linear-statistic permutation significance tests", Journal of Statistical Computation and Simulation, 77(1):55-61.
Resampling methods
- Good, P. (2006) Resampling Methods. 3rd Ed. Birkhauser.
- Wolter, K.M. (2007). Introduction to Variance Estimation. 2nd Edition. Springer, Inc.
External links
- Current research on permutation tests
- Bootstrap Sampling tutorial
- Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein (2005): Bootstrap Methods and Permutation Tests, software.
- Moore, D. S., G. McCabe, W. Duckworth, and S. Sclove (2003): Bootstrap Methods and Permutation Tests
- Simon, J. L. (1997): Resampling: The New Statistics.
- Yu, Chong Ho (2003): Resampling methods: concepts, applications, and justification. Practical Assessment, Research & Evaluation, 8(19). (statistical bootstrapping)
- Resampling: A Marriage of Computers and Statistics (ERIC Digests)
Software
- Angelo Canty and Brian Ripley (2010). boot: Bootstrap R (S-Plus) Functions. R package version 1.2-43. Functions and datasets for bootstrapping from the book Bootstrap Methods and Their Applications by A. C. Davison and D. V. Hinkley (1997, CUP); a short usage sketch follows this list.
- Statistics101: Resampling, Bootstrap, Monte Carlo Simulation program
- R package samplingVarEst: Sampling Variance Estimation. Implements functions for estimating the sampling variance of some point estimators.
- Paired randomization/permutation test for evaluation of TREC results
- Randomization/permutation tests to evaluate outcomes in information retrieval experiments (with and without adjustments for multiple comparisons).
- Bioconductor resampling-based multiple hypothesis testing with Applications to Genomics.
- permtest: an R package to compare the variability within and distance between two groups within a set of microarray data.
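As a minimal sketch of how the boot package listed above is typically used (the data and statistic here are illustrative, not drawn from any of the references), the following R code bootstraps the sample mean and derives a percentile confidence interval:

    library(boot)                          # Canty and Ripley's bootstrap package
    set.seed(42)                           # fix the random seed for reproducibility
    x <- rnorm(50, mean = 10, sd = 2)      # illustrative data: 50 normal draws
    mean_fun <- function(d, i) mean(d[i])  # statistic computed on each resample's indices
    b <- boot(data = x, statistic = mean_fun, R = 2000)  # 2000 bootstrap replicates
    b                                      # prints bias and standard error estimates
    boot.ci(b, type = "perc")              # percentile bootstrap confidence interval

References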
- ↑ Quenouille M (1949) Approximate tests of correlation in time series. J Roy Stat Soc Series B 11: 68-84
- ↑ Tukey JW (1958) Bias and confidence in not quite large samples (abstract). Ann Math Stats 29: 614
- ↑ Mahalanobis PC (1946). Recent experiments in statistical sampling in the Indian Statistical Institute. J Roy Stat Soc 109: 325-370
- ↑ Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, Inc. p. 281.
- ↑ Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag, Inc.
- ↑ Wolter, K.M. (2007). Introduction to Variance Estimation. Second Edition. Springer, Inc.
- ↑ http://tbf.coe.wayne.edu/jmasm/vol1_no2.pdf
- ↑ Meyer Dwass, "Modified Randomization Tests for Nonparametric Hypotheses", The Annals of Mathematical Statistics, 28:181-187, 1957.