Associated prime: Difference between revisions
'''Algebraic statistics''' is the use of [[algebra]] to advance [[statistics]]. Algebra has been useful for [[design of experiments|experimental design]], [[parameter estimation]], and [[hypothesis testing]].
Traditionally, algebraic statistics has been associated with the design of experiments and [[multivariate analysis]] (especially [[time series]]). In recent years, the term "algebraic statistics" has sometimes been used in a narrower sense, to label the use of [[algebraic geometry]] and [[commutative algebra]] in statistics.
==The tradition of algebraic statistics==
In the past, statisticians have used algebra to advance research in statistics. Some algebraic statistics led to the development of new topics in algebra and combinatorics, such as [[association scheme]]s.
===Design of experiments===
For example, [[Ronald A. Fisher]], [[Henry Mann|Henry B. Mann]], and [[Rosemary A. Bailey]] applied [[Abelian group]]s to the [[design of experiments]]. Experimental designs were also studied with [[affine geometry]] over [[finite fields]] and later with [[association scheme]]s, which were introduced by [[R. C. Bose]]. [[Orthogonal array]]s were introduced by [[C. R. Rao]], also for experimental designs.
===Algebraic analysis and abstract statistical inference===
[[Haar measure|Invariant measures]] on [[locally compact group]]s have long been used in [[statistical theory]], particularly in [[multivariate analysis]]. [[Arne Beurling|Beurling]]'s [[invariant subspace|factorization theorem]] and much of the work on (abstract) [[harmonic analysis]] sought better understanding of the [[Wold's theorem|Wold]] [[Wold decomposition|decomposition]] of [[stationary stochastic process]]es, which is important in [[time series]] statistics.
Encompassing previous results on probability theory on algebraic structures, [[Ulf Grenander]] developed a theory of "abstract inference". Grenander's abstract inference and his [[Pattern theory|theory of patterns]] are useful for [[spatial statistics]] and [[image analysis]]; these theories rely on [[lattice theory]].
===Partially ordered sets and lattices===
[[Ordered vector space|Partially ordered vector space]]s and [[Riesz space|vector lattices]] are used throughout statistical theory. [[Garrett Birkhoff]] metrized the positive cone using [[Hilbert metric|Hilbert's projective metric]] and proved [[Perron–Frobenius theorem|Jentsch's theorem]] using the [[contraction mapping]] [[contraction mapping theorem|theorem]].<ref>A gap in [[Garrett Birkhoff]]'s original proof was filled by [[Alexander Ostrowski]].
* [[Garrett Birkhoff]], 1967. ''Lattice Theory'', 3rd ed. Vol. 25 of AMS Colloquium Publications. [[American Mathematical Society]].
</ref> Birkhoff's results have been used for [[maximum entropy]] [[estimation]] (which can be viewed as [[linear programming]] in [[infinite dimensional optimization|infinite dimensions]]) by [[Jonathan Borwein]] and colleagues.
[[Riesz space|Vector lattice]]s and [[Riesz space|conical measure]]s were introduced into [[statistical decision theory]] by [[Lucien Le Cam]].
==Recent work using commutative algebra and algebraic geometry==
In recent years, the term "algebraic statistics" has been used more restrictively, to label the use of [[algebraic geometry]] and [[commutative algebra]] to study problems related to [[discrete random variable]]s with finite state spaces. Commutative algebra and algebraic geometry have applications in statistics because many commonly used classes of discrete random variables can be viewed as [[algebraic variety|algebraic varieties]].
===Introductory example===
Consider a [[random variable]] ''X'' which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities
:<math>p_i=\mathrm{Pr}(X=i),\quad i=0,1,2</math>
and these numbers clearly satisfy
:<math>\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1.</math>
Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable ''X'' with the tuple (''p''<sub>0</sub>,''p''<sub>1</sub>,''p''<sub>2</sub>)∈'''R'''<sup>3</sup>.
Now suppose ''X'' is a [[Binomial random variable]] with parameters ''n = 2'' and ''q'', i.e. ''X'' represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of ''q''. Then
:<math>p_i=\mathrm{Pr}(X=i)={2 \choose i}q^i (1-q)^{2-i}</math>
and it is not hard to show that the tuples (''p''<sub>0</sub>,''p''<sub>1</sub>,''p''<sub>2</sub>) which arise in this way are precisely the probability tuples satisfying
:<math>4 p_0 p_2-p_1^2=0.</math>
The latter is a polynomial equation defining an algebraic variety (or surface) in '''R'''<sup>3</sup>, and this variety, when intersected with the [[simplex]] given by
:<math>\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1,</math>
yields a piece of an [[algebraic curve]] which may be identified with the set of all binomial variables with ''n = 2''. Determining the parameter ''q'' amounts to locating one point on this curve; testing the hypothesis that a given variable ''X'' is binomial amounts to testing whether a certain point lies on that curve or not.
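The equivalence stated above can be checked by hand. The following is only a sketch of one possible argument, written for illustration rather than taken from the cited references. Substituting the binomial probabilities gives
:<math>p_1^2=\bigl(2q(1-q)\bigr)^2=4\,(1-q)^2\,q^2=4 p_0 p_2.</math>
Conversely, suppose a probability tuple satisfies <math>4 p_0 p_2 - p_1^2 = 0</math>. Since all entries are nonnegative, <math>p_1 = 2\sqrt{p_0 p_2}</math>, and therefore
:<math>1=p_0+p_1+p_2=\bigl(\sqrt{p_0}+\sqrt{p_2}\bigr)^2.</math>
Setting <math>q=\sqrt{p_2}</math> then gives <math>\sqrt{p_0}=1-q</math>, so that
:<math>p_0=(1-q)^2,\quad p_1=2q(1-q),\quad p_2=q^2,</math>
which is the binomial tuple with success probability ''q''.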
==References==
<references/>
* [[R. A. Bailey]]. [http://www.maths.qmul.ac.uk/~rab/Asbook/ ''Association Schemes: Designed Experiments, Algebra and Combinatorics''], [http://titles.cambridge.org/catalogue.asp?isbn=052182446X Cambridge University Press], Cambridge, 2004. 387pp. ISBN 0-521-82446-X. (Chapters from preliminary draft are available on-line)
* {{cite book | |||
|author=Caliński, Tadeusz and Kageyama, Sanpei | |||
|title=Block designs: A Randomization approach, Volume '''II''': Design | |||
|series=Lecture Notes in Statistics | |||
|volume=170 | |||
|publisher=Springer-Verlag | |||
|location=New York | |||
|year=2003 | |||
|isbn=0-387-95470-8 | |||
}} | |||
*{{cite book | |||
|author=Hinkelmann, Klaus and [[Oscar Kempthorne|Kempthorne, Oscar]] | |||
|year=2005 | |||
|title=Design and Analysis of Experiments, Volume 2: Advanced Experimental Design | |||
|url=http://books.google.com/books?id=GiYc5nRVKf8C | |||
|edition=First | |||
|publisher=[http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471551775.html Wiley] | |||
|isbn=978-0-471-55177-5 | |||
}} | |||
* [[Henry Mann|H. B. Mann]]. 1949. ''Analysis and Design of Experiments: Analysis of Variance and Analysis-of-Variance Designs''. Dover.
* {{cite book | |||
|title=Constructions and Combinatorial Problems in Design of Experiments | |||
|author=[[Damaraju Raghavarao|Raghavarao, Damaraju]] | |||
|location=New York | |||
|year=1988 | |||
|edition=corrected reprint of the 1971 Wiley | |||
|publisher=Dover | |||
}} | |||
* {{cite book | |||
|title=Block Designs: Analysis, Combinatorics and Applications | |||
|author=[[Damaraju Raghavarao|Raghavarao, Damaraju]] and Padgett, L.V. | |||
|location= | |||
|year=2005 | |||
|edition= | |||
|publisher=World Scientific | |||
}} | |||
*{{cite book | |||
|author=Street, Anne Penfold and Street, Deborah J. | |||
|title=Combinatorics of Experimental Design | |||
|publisher=Oxford U. P. [Clarendon] | |||
|year=1987 | |||
|pages=400+xiv | |||
|isbn=0-19-853256-3 | |||
}} | |||
* [http://www.math.harvard.edu/~seths/assc.html Algebraic Statistics Short Course], lecture notes by Seth Sullivant
* L. Pachter and [[Bernd Sturmfels|B. Sturmfels]]. ''Algebraic Statistics for Computational Biology.'' Cambridge University Press 2005.
* G. Pistone, E. Riccomagno, H. P. Wynn. ''Algebraic Statistics.'' CRC Press, 2001.
* Drton, Mathias, Sturmfels, Bernd, Sullivant, Seth. ''Lectures on Algebraic Statistics'', Springer 2009.
* Paolo Gibilisco, Eva Riccomagno, Maria-Piera Rogantin, [[Henry_Wynn|Henry P. Wynn]]. ''Algebraic and Geometric Methods in Statistics'', Cambridge 2009.
== External links ==
* [http://www.jalgstat.com Journal of Algebraic Statistics]
{{DEFAULTSORT:Algebraic Statistics}}
[[Category:Discrete distributions]]
[[Category:Theory of probability distributions]]
Latest revision as of 00:23, 8 November 2013
Algebraic statistics is the use of algebra to advance statistics. Algebra has been useful for experimental design, parameter estimation, and hypothesis testing.
Traditionally, algebraic statistics has been associated with the design of experiments and multivariate analysis (especially time series). In recent years, the term "algebraic statistics" has sometimes been used in a narrower sense, to label the use of algebraic geometry and commutative algebra in statistics.
The tradition of algebraic statistics
In the past, statisticians have used algebra to advance research in statistics. Some algebraic statistics led to the development of new topics in algebra and combinatorics, such as association schemes.
Design of experiments
For example, Ronald A. Fisher, Henry B. Mann, and Rosemary A. Bailey applied Abelian groups to the design of experiments. Experimental designs were also studied with affine geometry over finite fields and later with association schemes, which were introduced by R. C. Bose. Orthogonal arrays were introduced by C. R. Rao, also for experimental designs.
Algebraic analysis and abstract statistical inference
Invariant measures on locally compact groups have long been used in statistical theory, particularly in multivariate analysis. Beurling's factorization theorem and much of the work on (abstract) harmonic analysis sought better understanding of the Wold decomposition of stationary stochastic processes, which is important in time series statistics.
Encompassing previous results on probability theory on algebraic structures, Ulf Grenander developed a theory of "abstract inference". Grenander's abstract inference and his theory of patterns are useful for spatial statistics and image analysis; these theories rely on lattice theory.
Partially ordered sets and lattices
Partially ordered vector spaces and vector lattices are used throughout statistical theory. Garrett Birkhoff metrized the positive cone using Hilbert's projective metric and proved Jentsch's theorem using the contraction mapping theorem.[1] Birkhoff's results have been used for maximum entropy estimation (which can be viewed as linear programming in infinite dimensions) by Jonathan Borwein and colleagues.
Vector lattices and conical measures were introduced into statistical decision theory by Lucien Le Cam.
Recent work using commutative algebra and algebraic geometry
In recent years, the term "algebraic statistics" has been used more restrictively, to label the use of algebraic geometry and commutative algebra to study problems related to discrete random variables with finite state spaces. Commutative algebra and algebraic geometry have applications in statistics because many commonly used classes of discrete random variables can be viewed as algebraic varieties.
Introductory example
Consider a random variable X which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities
:<math>p_i=\mathrm{Pr}(X=i),\quad i=0,1,2</math>
and these numbers clearly satisfy
:<math>\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1.</math>
Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable X with the tuple (p₀, p₁, p₂) ∈ R³.
Now suppose X is a binomial random variable with parameters n = 2 and q, i.e. X represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of q. Then
:<math>p_i=\mathrm{Pr}(X=i)={2 \choose i}q^i (1-q)^{2-i}</math>
and it is not hard to show that the tuples (p₀, p₁, p₂) which arise in this way are precisely the probability tuples satisfying
:<math>4 p_0 p_2-p_1^2=0.</math>
The latter is a polynomial equation defining an algebraic variety (or surface) in R³, and this variety, when intersected with the simplex given by
:<math>\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1,</math>
yields a piece of an algebraic curve which may be identified with the set of all binomial variables with n = 2. Determining the parameter q amounts to locating one point on this curve; testing the hypothesis that a given variable X is binomial amounts to testing whether a certain point lies on that curve or not.
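The example can also be explored numerically. The sketch below is illustrative only: the function names (`binomial2_point`, `on_binomial_curve`, `recover_q`) and the tolerance `tol` are assumptions made for this example and do not come from the article or its references. It builds the probability tuple for a given q, checks membership in the curve using the polynomial equation together with the simplex constraints, and recovers q from a point on the curve through the mean of X, which equals 2q.

```python
from math import comb, isclose


def binomial2_point(q):
    """Return (p0, p1, p2) for a binomial variable with n = 2 and success probability q."""
    return tuple(comb(2, i) * q**i * (1 - q) ** (2 - i) for i in range(3))


def on_binomial_curve(p, tol=1e-9):
    """Check, up to a numerical tolerance, that p lies in the simplex
    and satisfies 4*p0*p2 - p1**2 = 0."""
    p0, p1, p2 = p
    in_simplex = (
        all(-tol <= x <= 1 + tol for x in p)
        and isclose(p0 + p1 + p2, 1.0, abs_tol=tol)
    )
    return in_simplex and abs(4 * p0 * p2 - p1**2) <= tol


def recover_q(p):
    """Recover q from a point on the curve, using E[X] = p1 + 2*p2 = 2q."""
    _, p1, p2 = p
    return p2 + p1 / 2


if __name__ == "__main__":
    p = binomial2_point(0.3)
    print(on_binomial_curve(p))                # a binomial tuple lies on the curve
    print(on_binomial_curve((1/3, 1/3, 1/3)))  # the uniform tuple does not
    print(recover_q(p))                        # approximately the q used above
```

The tolerance check is purely numerical. With probabilities estimated from data, deciding whether a point lies on the curve would call for a statistical test rather than an exact comparison.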
References
- ↑ A gap in Garrett Birkhoff's original proof was filled by Alexander Ostrowski.
- Garrett Birkhoff, 1967. Lattice Theory, 3rd ed. Vol. 25 of AMS Colloquium Publications. American Mathematical Society.
- R. A. Bailey. Association Schemes: Designed Experiments, Algebra and Combinatorics, Cambridge University Press, Cambridge, 2004. 387pp. ISBN 0-521-82446-X. (Chapters from preliminary draft are available on-line)
- Caliński, Tadeusz and Kageyama, Sanpei (2003). Block designs: A Randomization approach, Volume II: Design. Lecture Notes in Statistics 170. New York: Springer-Verlag. ISBN 0-387-95470-8.
- Hinkelmann, Klaus and Kempthorne, Oscar (2005). Design and Analysis of Experiments, Volume 2: Advanced Experimental Design (First ed.). Wiley. ISBN 978-0-471-55177-5.
- H. B. Mann. 1949. Analysis and Design of Experiments: Analysis of Variance and Analysis-of-Variance Designs. Dover.
- Raghavarao, Damaraju (1988). Constructions and Combinatorial Problems in Design of Experiments (corrected reprint of the 1971 Wiley ed.). New York: Dover.
- Raghavarao, Damaraju and Padgett, L.V. (2005). Block Designs: Analysis, Combinatorics and Applications. World Scientific.
- Street, Anne Penfold and Street, Deborah J. (1987). Combinatorics of Experimental Design. Oxford U. P. [Clarendon]. pp. 400+xiv. ISBN 0-19-853256-3.
- Algebraic Statistics Short Course, lecture notes by Seth Sullivant
- L. Pachter and B. Sturmfels. Algebraic Statistics for Computational Biology. Cambridge University Press 2005.
- G. Pistone, E. Riccomagno, H. P. Wynn. Algebraic Statistics. CRC Press, 2001.
- Drton, Mathias, Sturmfels, Bernd, Sullivant, Seth. Lectures on Algebraic Statistics, Springer 2009.
- Paolo Gibilisco, Eva Riccomagno, Maria-Piera Rogantin, Henry P. Wynn. Algebraic and Geometric Methods in Statistics, Cambridge 2009.