Emergy: Difference between revisions
en>ShelfSkewed m fix isbn, rm notice |
en>Drali1964 |
{{merge from|Semivariance|date=October 2013}}
{{cleanup|date=June 2012|reason=Article is overly technical.}}
In [[spatial statistics]], the theoretical '''variogram''' <math>2\gamma(x,y)</math> is a function describing the degree of spatial dependence of a spatial [[random field]] or [[stochastic process]] <math>Z(x)</math>. It is defined as the [[variance]] of the difference between field values at two locations (<math>x</math> and <math>y</math>) across realizations of the field (Cressie 1993):
:<math>2\gamma(x,y)=\text{var}(Z(x) - Z(y)) = E\left(|(Z(x)-\mu(x))-(Z(y) - \mu(y))|^2\right). </math>
If the spatial random field has constant mean <math>\mu</math>, this is equivalent to the expectation of the squared increment of the values between locations <math>x</math> and <math>y</math> (Wackernagel 2003), where <math>x</math> and <math>y</math> are points in space rather than single coordinates:
:<math>2\gamma(x,y)=E\left(|Z(x)-Z(y)|^2\right) , </math>
where <math>\gamma(x,y)</math> itself is called the '''semivariogram'''. In the case of a [[stationary process]], the variogram and semivariogram can be represented as a function <math>\gamma_s(h)=\gamma(0,0+h)</math> of the difference <math>h=y-x</math> between locations only, by the following relation (Cressie 1993):
:<math>\gamma(x,y)=\gamma_s(y-x).</math>
If the process is furthermore [[isotropy|isotropic]], then the variogram and semivariogram can be represented by a function <math>\gamma_i(h):=\gamma_s(h e_1)</math> of the distance <math>h=\|y-x\|</math> only (Cressie 1993):
:<math>\gamma(x,y)=\gamma_i(h).</math>
The indices <math>i</math> and <math>s</math> are typically not written, and the same terms are used for all three forms of the function. Moreover, the term "variogram" is sometimes used to denote the semivariogram, and the symbol <math>\gamma</math> is sometimes used for the variogram, which causes some confusion.
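The equivalence of the variance form and the squared-increment form under a constant mean can be made explicit with one intermediate step. The following is a short derivation sketch that uses only the symbols defined above, writing <math>D=Z(x)-Z(y)</math> as a shorthand introduced here for illustration:
:<math>\text{var}(D)=E\left(D^2\right)-\left(E(D)\right)^2=E\left(|Z(x)-Z(y)|^2\right)-\left(\mu(x)-\mu(y)\right)^2 .</math>
If <math>\mu(x)=\mu(y)=\mu</math>, the last term vanishes and <math>2\gamma(x,y)=E\left(|Z(x)-Z(y)|^2\right)</math>, which is the squared-increment form.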
==Properties==
According to Cressie (1993), Chiles and Delfiner (1999) and Wackernagel (2003), the theoretical variogram has the following properties:
* The semivariogram is nonnegative, <math>\gamma(x,y)\geq 0</math>, since it is the expectation of a square.
* The semivariogram <math>\gamma(x,x)=\gamma_i(0)=E\left((Z(x)-Z(x))^2\right)=0</math> at distance 0 is always 0, since <math>Z(x)-Z(x)=0</math>.
* A function is a semivariogram if and only if it is a conditionally negative definite function, i.e. for all weights <math>w_1,\ldots,w_N</math> subject to <math>\sum_{i=1}^N w_i=0</math> and locations <math>x_1,\ldots,x_N</math> it holds:<blockquote><math>\sum_{i=1}^N\sum_{j=1}^N w_{i}\gamma(x_i,x_j)w_j \leq 0</math></blockquote>which corresponds to the fact that the variance <math>\text{var}(X)</math> of <math>X=\sum_{i=1}^N w_i Z(x_i)</math> is given by the negative of this double sum and must be nonnegative.
* As a consequence, the semivariogram can be discontinuous only at the origin. The height of the jump at the origin is sometimes referred to as the ''nugget'' or nugget effect.
* If the [[covariance function]] of a stationary process exists, it is related to the variogram by<blockquote><math>2\gamma(x,y)=C(x,x)+C(y,y)-2C(x,y)</math></blockquote>For a non-stationary process, the expected squared increment additionally contains the square of the difference between the expected values at both points:<blockquote><math>E\left(|Z(x)-Z(y)|^2\right)=C(x,x)+C(y,y)-2C(x,y) + (E(Z(x))-E(Z(y)))^2</math></blockquote>
* If a stationary random field has no spatial dependence (i.e. <math>C(h)=0</math> if <math>h\not= 0</math>), the semivariogram is the constant <math>\text{var}(Z(x))</math> everywhere except at the origin, where it is zero.
* <math>\gamma(x,y)=\tfrac{1}{2}E(|Z(x)-Z(y)|^2)=\gamma(y,x)</math> is a symmetric function.
* Consequently, <math>\gamma_s(h)=\gamma_s(-h)</math> is an [[even function]].
* If the random field is [[stationary process|stationary]] and [[ergodic]], the limit <math>\lim_{h\to \infty} \gamma_s(h) = \text{var}(Z(x))</math> corresponds to the variance of the field. The limit of the semivariogram is also called its ''sill''.
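Several of the properties above can be checked numerically. The following is a minimal sketch, assuming an exponential covariance <math>C(h)=\sigma^2\exp(-|h|/\ell)</math> chosen purely for illustration; the function names <code>cov</code> and <code>semivariogram</code> and the parameters <code>sigma2</code> and <code>length</code> are hypothetical and do not come from the cited references.

```python
import numpy as np

def cov(h, sigma2=2.0, length=1.5):
    """Illustrative stationary covariance C(h) = sigma2 * exp(-|h| / length)."""
    return sigma2 * np.exp(-np.abs(h) / length)

def semivariogram(h, sigma2=2.0, length=1.5):
    """gamma(h) = C(0) - C(h), which follows from 2*gamma = C(x,x) + C(y,y) - 2*C(x,y)."""
    return cov(0.0, sigma2, length) - cov(h, sigma2, length)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=8)      # arbitrary 1-D locations
w = rng.normal(size=8)
w -= w.mean()                            # weights are constrained to sum to zero

H = x[:, None] - x[None, :]              # pairwise differences x_i - x_j
quad_gamma = w @ semivariogram(H) @ w    # sum_ij w_i * gamma(x_i, x_j) * w_j
var_X = w @ cov(H) @ w                   # variance of X = sum_i w_i * Z(x_i)

print(quad_gamma, -var_X)                # the two values agree up to rounding
```

Under these assumptions the double sum is nonpositive and equals the negative of the variance of the weighted sum, the semivariogram is zero at the origin and symmetric, and it approaches the variance <code>sigma2</code> (the sill) at large lags.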
==Empirical variogram==
For observations <math>z_i,\;i=1,\ldots,k</math> at locations <math>x_1,\ldots,x_k</math>, the empirical variogram (strictly speaking, the empirical semivariogram) <math>\hat{\gamma}(h)</math> is defined as (Cressie 1993):
:<math>\hat{\gamma}(h):=\frac{1}{2|N(h)|}\sum_{(i,j)\in N(h)} |z_i-z_j|^2</math>
where <math>N(h)</math> denotes the set of pairs of observations <math>i,\;j</math> such that <math>|x_i-x_j| = h</math>, and <math>|N(h)|</math> is the number of pairs in the set. The factor <math>\tfrac{1}{2}</math> makes <math>\hat{\gamma}(h)</math> an estimate of the semivariogram <math>\gamma</math>; without it, the same sum estimates the variogram <math>2\gamma</math>. In practice an "approximate distance" <math>h</math> is generally used, implemented with a certain tolerance.
The ''empirical variogram'' is used in [[geostatistics]] as a first estimate of the (theoretical) variogram needed for spatial interpolation by [[kriging]].
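The estimator can be sketched in a few lines of NumPy. This is an illustrative implementation, not a reference one: the function name <code>empirical_semivariogram</code> and the arguments <code>lags</code> and <code>tol</code> are assumptions made here, and the function returns the semivariance (half the mean squared increment), so its result must be doubled to obtain the variogram <math>2\gamma</math>.

```python
import numpy as np

def empirical_semivariogram(x, z, lags, tol):
    """Estimate the semivariogram at each lag from values z observed at locations x.

    x: (k,) or (k, d) array of locations; z: (k,) array of observed values.
    Pairs whose separation distance lies within `tol` of a lag are pooled.
    Returns (gamma_hat, n_pairs); gamma_hat is NaN where no pair qualifies.
    """
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(z), -1)
    i, j = np.triu_indices(len(z), k=1)          # every unordered pair once
    dist = np.linalg.norm(x[i] - x[j], axis=1)   # separation distances
    sq_inc = (z[i] - z[j]) ** 2                  # squared increments
    gamma_hat = np.full(len(lags), np.nan)
    n_pairs = np.zeros(len(lags), dtype=int)
    for m, h in enumerate(lags):
        mask = np.abs(dist - h) <= tol           # pairs belonging to N(h)
        n_pairs[m] = mask.sum()
        if n_pairs[m] > 0:
            gamma_hat[m] = 0.5 * sq_inc[mask].mean()
    return gamma_hat, n_pairs

# Four observations on a line with unit spacing.
gamma_hat, n_pairs = empirical_semivariogram(
    x=[0.0, 1.0, 2.0, 3.0], z=[1.0, 2.0, 4.0, 7.0], lags=[1.0, 2.0, 3.0], tol=0.1
)
print(gamma_hat, n_pairs)
```

Counting each unordered pair once is a design choice: counting ordered pairs instead doubles both the sum and <math>|N(h)|</math> and leaves the estimate unchanged. Lags with few pairs give noisy estimates, which is one reason the pair counts are returned alongside the values.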
According to Cressie (1993), for observations <math>z_i=Z(x_i)</math> from a [[Stationary process|stationary]] [[random field]] <math>Z(x)</math>, the empirical semivariogram with lag tolerance 0 is an unbiased estimator of the theoretical semivariogram, because:
:<math>E[\hat{\gamma}(h)]=\frac{1}{2|N(h)|}\sum_{(i,j)\in N(h)}E[|Z(x_i)-Z(x_j)|^2]=\frac{1}{2|N(h)|}\sum_{(i,j)\in N(h)}2\gamma(x_j-x_i)=\frac{2|N(h)|}{2|N(h)|}\gamma(h)=\gamma(h)</math>
==Variogram parameters==
The following parameters are often used to describe variograms:
* ''nugget'' <math>n</math>: The height of the jump of the semivariogram at the discontinuity at the origin.
* ''sill'' <math>s</math>: The limit of the semivariogram as the lag distance tends to infinity.
* ''range'' <math>r</math>: The distance beyond which the difference between the semivariogram and the sill becomes negligible. In models with a fixed sill, it is the distance at which the sill is first reached; for models with an asymptotic sill, it is conventionally taken to be the distance at which the semivariance first reaches 95% of the sill.
==Variogram models==
The empirical variogram cannot be computed at every lag distance <math>h</math>, and because of variation in the estimation it is not guaranteed to be a valid variogram as defined above. However, some [[geostatistics|geostatistical]] methods such as [[kriging]] need valid semivariograms. In applied geostatistics the empirical variograms are therefore often approximated by model functions that ensure validity (Chiles & Delfiner 1999). Some important models are (Chiles & Delfiner 1999, Cressie 1993):
* The exponential variogram model
:: <math>\gamma(h)=(s-n)(1-\exp(-h/(ra)))+n 1_{(0,\infty)}(h)</math>
* The spherical variogram model
:: <math>\gamma(h)=(s-n)\left(\left(\frac{3h}{2r}-\frac{h^3}{2r^3}\right)1_{(0,r)}(h)+1_{[r,\infty)}(h)\right)+n1_{(0,\infty)}(h)</math>
* The Gaussian variogram model
:: <math>\gamma(h)=(s-n)\left(1-\exp\left(-\frac{h^2}{r^2a}\right)\right) + n1_{(0,\infty)}(h)</math>
The parameter <math>a</math> has different values in different references, owing to the ambiguity in the definition of the range. For example, <math>a=1/3</math> is the value used in (Chiles & Delfiner 1999). The indicator function <math>1_A(h)</math> is 1 if <math>h\in A</math> and 0 otherwise.
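The three model formulas translate directly into code. The sketch below is a plain transcription under the notation above (nugget <code>n</code>, sill <code>s</code>, range <code>r</code>, shape parameter <code>a</code>); the function names are chosen here for illustration, and the default <code>a=1/3</code> is only the convention attributed to Chiles & Delfiner in the text.

```python
import numpy as np

def exponential_model(h, n, s, r, a=1/3):
    """Exponential model: (s - n) * (1 - exp(-h / (r * a))) + n * 1_{(0, inf)}(h)."""
    h = np.asarray(h, dtype=float)
    return (s - n) * (1.0 - np.exp(-h / (r * a))) + n * (h > 0)

def spherical_model(h, n, s, r):
    """Spherical model: cubic rise on (0, r), constant at the sill for h >= r."""
    h = np.asarray(h, dtype=float)
    rise = 1.5 * h / r - 0.5 * (h / r) ** 3
    shape = np.where((h > 0) & (h < r), rise, (h >= r).astype(float))
    return (s - n) * shape + n * (h > 0)

def gaussian_model(h, n, s, r, a=1/3):
    """Gaussian model: (s - n) * (1 - exp(-h**2 / (r**2 * a))) + n * 1_{(0, inf)}(h)."""
    h = np.asarray(h, dtype=float)
    return (s - n) * (1.0 - np.exp(-h ** 2 / (r ** 2 * a))) + n * (h > 0)

lags = np.array([0.0, 5.0, 10.0, 20.0])
print(exponential_model(lags, n=0.0, s=1.0, r=10.0))
print(spherical_model(lags, n=0.2, s=1.0, r=10.0))
print(gaussian_model(lags, n=0.0, s=1.0, r=10.0))
```

With <code>a=1/3</code> and zero nugget, the exponential and Gaussian models evaluate to <math>1-e^{-3}\approx 0.95</math> of the sill at <math>h=r</math>, which matches the 95% convention for the range of models with an asymptotic sill. The spherical model reaches the sill exactly at <math>h=r</math>.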
==Discussion==
Three functions are used in [[geostatistics]] for describing the spatial or the temporal correlation of observations: the [[correlogram]], the [[covariance]] and the '''semivariogram'''. The last is often simply called the '''variogram'''. The [[sampling variogram]], unlike the semivariogram and the variogram, shows where a significant degree of spatial dependence in the sample space or sampling unit dissipates into randomness when the variance terms of a temporally or ''in-situ'' ordered set are plotted against the variance of the set and the lower limits of its 99% and 95% confidence ranges.
The variogram is the key function in [[geostatistics]], as it is used to fit a model of the temporal/[[spatial correlation]] of the observed phenomenon. A distinction is therefore made between the ''experimental variogram'', which is a visualisation of a possible spatial/temporal correlation, and the ''variogram model'', which is further used to define the weights of the [[kriging]] function. Note that the experimental variogram is an empirical estimate of the [[covariance]] of a [[Gaussian process]]. As such, it may not be [[positive definite]] and hence may not be directly usable in [[kriging]] without constraints or further processing. This explains why only a limited number of variogram models are used: most commonly, the linear, the spherical, the Gaussian and the exponential models.
When a variogram is used to describe the correlation of different variables, it is called a ''cross-variogram''. Cross-variograms are used in [[co-kriging]].
If the variable is binary or represents classes of values, one speaks of ''indicator variograms''. Indicator variograms are used in [[indicator kriging]].
== See also ==
* [[Covariance function]]
* [[Semivariance]]
==References==
{{Reflist}}
# Cressie, N., 1993, Statistics for Spatial Data, Wiley-Interscience
# Chiles, J. P., P. Delfiner, 1999, Geostatistics: Modeling Spatial Uncertainty, Wiley-Interscience
# Wackernagel, H., 2003, Multivariate Geostatistics, Springer
# Burrough, P. A. and McDonnell, R. A., 1998, Principles of Geographical Information Systems
# [http://www.kriging.com/pg1979_download.html Isobel Clark, 1979, Practical Geostatistics, Applied Science Publishers]
==External links==
* [http://www.ai-geostats.org/ AI-GEOSTATS: an educational resource about geostatistics and spatial statistics]
* [http://www.kriging.com/PG1979/ Practical Geostatistics 1979 by Isobel Clark: an introduction to geostatistics]
* [http://www.statistik.tuwien.ac.at/public/dutt/vorles/geost_05/geo.html Geostatistics: Lecture by Rudolf Dutter at the Technical University of Vienna]
[[Category:Geostatistics]]
[[Category:Statistical deviation and dispersion]]
[[Category:Spatial processes]]
Revision as of 16:46, 11 January 2014