{{FeatureDetectionCompVisNavbox}}

In mathematics, the '''structure [[tensor]]''', also referred to as the '''second-moment matrix''', is a [[matrix (mathematics)|matrix]] derived from the [[gradient]] of a [[function (mathematics)|function]]. It summarizes the predominant directions of the gradient in a specified neighborhood of a point, and the degree to which those directions are coherent. The structure tensor is often used in [[image processing]] and [[computer vision]].<ref name=bigun86>
J. Bigun and G. Granlund (1986), ''Optimal Orientation Detection of Linear Symmetry''. Tech. Report LiTH-ISY-I-0828, Computer Vision Laboratory, Linkoping University, Sweden, 1986; Thesis Report, Linkoping studies in science and technology No. 85, 1986.
</ref><ref name=bigun87>
{{cite conference|author=J. Bigun and G. Granlund|title=Optimal Orientation Detection of Linear Symmetry|location=Piscataway|booktitle=First int. Conf. on Computer Vision, ICCV, (London)|publisher=IEEE Computer Society Press, Piscataway|pages=433–438|year=1987}}
</ref><ref name=knutsson89>
{{cite conference|author=H. Knutsson|title=Representing local structure using tensors|location=Oulu|booktitle=Proceedings 6th Scandinavian Conf. on Image Analysis|publisher=Oulu University|pages=244–251|year=1989}}
</ref>

==The 2D structure tensor==

===Continuous version===
For a function <math>I</math> of two variables ''p''=(''x'',''y''), the structure tensor is the 2×2 matrix

:<math>
S_w(p) =
\begin{bmatrix}
\int w(r) (I_x(p-r))^2\,d r & \int w(r) I_x(p-r)I_y(p-r)\,d r \\[10pt]
\int w(r) I_x(p-r)I_y(p-r)\,d r & \int w(r) (I_y(p-r))^2\,d r
\end{bmatrix}
</math>

where <math>I_x</math> and <math>I_y</math> are the [[partial derivative]]s of <math>I</math> with respect to ''x'' and ''y''; the integrals range over the plane <math>\mathbb{R}^2</math>; and ''w'' is some fixed "window function", a [[distribution (mathematics)|distribution]] on two variables. Note that the matrix ''S''<sub>''w''</sub> is itself a function of ''p''=(''x'',''y'').

The formula above can also be written as <math>S_w(p)=\int w(r) S_0(p-r)\,d r</math>, where <math>S_0</math> is the matrix-valued function defined by
:<math>
S_0(p)=
\begin{bmatrix}
(I_x(p))^2 & I_x(p)I_y(p) \\[10pt]
I_x(p)I_y(p) & (I_y(p))^2
\end{bmatrix}
</math>

If the [[gradient]] <math>\nabla I = (I_x,I_y)</math> of <math>I</math> is viewed as a 1×2 (single-row) matrix, the matrix <math>S_0</math> can be written as the [[matrix product]] <math>(\nabla I)'(\nabla I)</math>, where <math>(\nabla I)'</math> denotes the 2×1 (single-column) [[transpose]] of the gradient. (Note, however, that the structure tensor <math>S_w(p)</math> cannot be factored in this way.)

===Discrete version===
In image processing and other similar applications, the function <math>I</math> is usually given as a discrete [[array data structure|array]] of samples <math>I[p]</math>, where ''p'' is a pair of integer indices. The 2D structure tensor at a given [[pixel]] is usually taken to be the discrete sum

:<math>
S_w[p] =
\begin{bmatrix}
\sum_r w[r] (I_x[p-r])^2 & \sum_r w[r] I_x[p-r]I_y[p-r] \\[10pt]
\sum_r w[r] I_x[p-r]I_y[p-r] & \sum_r w[r] (I_y[p-r])^2
\end{bmatrix}
</math>

Here the summation index ''r'' ranges over a finite set of index pairs (the "window", typically <math>\{-m..+m\}\times\{-m..+m\}</math> for some ''m''), and ''w''[''r''] is a fixed "window weight" that depends on ''r'', such that the sum of all weights is 1. The values <math>I_x[p],I_y[p]</math> are the partial derivatives sampled at pixel ''p'', which, for instance, may be estimated from <math>I</math> by [[finite difference]] formulas.

The formula of the structure tensor can also be written as <math>S_w[p]=\sum_r w[r] S_0[p-r]</math>, where <math>S_0</math> is the matrix-valued array such that
:<math>
S_0[p] =
\begin{bmatrix}
(I_x[p])^2 & I_x[p]I_y[p] \\[10pt]
I_x[p]I_y[p] & (I_y[p])^2
\end{bmatrix}
</math>
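
As an illustration of the discrete sum above, the tensor components can be computed in a few lines of NumPy. The specific choices below (central differences for the derivatives, a normalized Gaussian window over <math>\{-m..+m\}\times\{-m..+m\}</math>, reflective boundary handling) are common conventions rather than part of the definition:

```python
import numpy as np

def structure_tensor_2d(I, m=2, sigma=1.0):
    """Compute the 2D structure tensor S_w at every pixel of I.

    Sketch only: uses central differences for I_x, I_y and a
    normalized Gaussian window w[r] over {-m..m} x {-m..m}.
    Returns the three distinct components Sxx, Sxy, Syy.
    """
    I = np.asarray(I, dtype=float)
    Iy, Ix = np.gradient(I)                     # np.gradient returns (d/drow, d/dcol)
    Sxx, Sxy, Syy = Ix * Ix, Ix * Iy, Iy * Iy   # per-pixel components of S_0[p]
    # Normalized Gaussian window weights (they sum to 1).
    r = np.arange(-m, m + 1)
    g = np.exp(-r ** 2 / (2.0 * sigma ** 2))
    w = np.outer(g, g)
    w /= w.sum()
    def window_sum(A):
        # Weighted sum over the window: sum_r w[r] A[p - r].
        out = np.zeros_like(A)
        Ap = np.pad(A, m, mode='reflect')
        for i in range(2 * m + 1):
            for j in range(2 * m + 1):
                out += w[i, j] * Ap[i:i + A.shape[0], j:j + A.shape[1]]
        return out
    return window_sum(Sxx), window_sum(Sxy), window_sum(Syy)
```

At a pixel ''p'', the full tensor is then the symmetric matrix with rows (Sxx[p], Sxy[p]) and (Sxy[p], Syy[p]).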

===Interpretation===
The importance of the 2D structure tensor <math>S_w</math> stems from the fact that its [[eigenvalue]]s <math>\lambda_1,\lambda_2</math> (which can be ordered so that <math>\lambda_1 \geq \lambda_2\geq 0</math>) and the corresponding [[eigenvector]]s <math>e_1,e_2</math> summarize the distribution of the [[gradient]] <math>\nabla I = (I_x,I_y)</math> of <math>I</math> within the window defined by <math>w</math> centered at <math>p</math>.<ref name=bigun86/><ref name=bigun87/><ref name=knutsson89/>

Namely, if <math>\lambda_1 > \lambda_2</math>, then <math>e_1</math> (or <math>-e_1</math>) is the direction that is maximally aligned with the gradient within the window. In particular, if <math>\lambda_1 > 0, \lambda_2 = 0</math>, then the gradient is always a multiple of <math>e_1</math> (positive, negative or zero); this is the case if and only if <math>I</math> within the window varies along the direction <math>e_1</math> but is constant along <math>e_2</math>.

If <math>\lambda_1 = \lambda_2</math>, on the other hand, the gradient in the window has no predominant direction, which happens, for instance, when the image has [[rotational symmetry]] within that window. In particular, <math>\lambda_1 = \lambda_2 = 0</math> if and only if the function <math>I</math> is constant (<math>\nabla I = (0,0)</math>) within the window.

More generally, the value of <math>\lambda_k</math>, for ''k''=1 or ''k''=2, is the <math>w</math>-weighted average, in the neighborhood of ''p'', of the square of the [[directional derivative]] of <math>I</math> along <math>e_k</math>. The relative discrepancy between the two eigenvalues of <math>S_w</math> is an indicator of the degree of [[isotropy|anisotropy]] of the gradient in the window, namely how strongly it is biased towards a particular direction (and its opposite).<ref name="Jahne1993">
{{cite book|author=B. Jahne|title=Spatio-Temporal Image Processing: Theory and Scientific Applications|location=Berlin|publisher=Springer-Verlag|volume=751|year=1993}}
</ref><ref name=MedioniEA>
{{cite book|author=G. Medioni, M. Lee and C. Tang|title=A Computational Framework for Feature Extraction and Segmentation|publisher=Elsevier Science|date=March 2000}}
</ref> This attribute can be quantified by the '''coherence''', defined as

:<math>c_w=\left(\frac{\lambda_1-\lambda_2}{\lambda_1+\lambda_2}\right)^2</math>

if <math>\lambda_1+\lambda_2>0</math>. This quantity is 1 when the gradient is totally aligned, and 0 when it has no preferred direction. The formula is undefined, even in the [[limit]], when the image is constant in the window (<math>\lambda_1=\lambda_2=0</math>). Some authors define it as 0 in that case.
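
Because <math>S_w</math> is a symmetric 2×2 matrix, its eigenvalues, and hence the coherence, have a closed form. A small sketch follows; the array names and the convention of returning 0 for a zero trace are illustrative assumptions:

```python
import numpy as np

def coherence_2d(Sxx, Sxy, Syy):
    """Eigenvalues and coherence of the 2x2 tensor [[Sxx, Sxy], [Sxy, Syy]].

    For a symmetric 2x2 matrix the eigenvalues are
    (trace +/- sqrt((Sxx - Syy)^2 + 4 Sxy^2)) / 2, so the coherence
    ((l1 - l2) / (l1 + l2))^2 needs no general eigensolver.  Windows
    with zero trace (constant image) are mapped to coherence 0 here,
    following the convention used by some authors.
    """
    trace = Sxx + Syy
    disc = np.sqrt((Sxx - Syy) ** 2 + 4.0 * Sxy ** 2)   # equals l1 - l2
    l1 = 0.5 * (trace + disc)
    l2 = 0.5 * (trace - disc)
    safe = np.where(trace > 0, trace, 1.0)              # avoid division by zero
    c = np.where(trace > 0, (disc / safe) ** 2, 0.0)
    return l1, l2, c
```

The same formulas work element-wise on whole component arrays, giving a per-pixel coherence map.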

Note that the average of the gradient <math>\nabla I</math> inside the window is '''not''' a good indicator of anisotropy. Aligned but oppositely oriented gradient vectors would cancel out in this average, whereas in the structure tensor they are properly added together.<ref>
{{cite journal|author=T. Brox, J. Weickert, B. Burgeth and P. Mrazek|title=Nonlinear Structure Tensors|journal=Universitat des Saarlandes, Tech. Report|issue=113|pages=1–32|year=2004}}
</ref>

By expanding the effective radius of the window function <math>w</math> (that is, increasing its variance), one can make the structure tensor more robust to noise, at the cost of diminished spatial resolution.<ref name=MedioniEA /><ref name=lin94book>T. Lindeberg (1994), ''[http://www.nada.kth.se/~tony/book.html Scale-Space Theory in Computer Vision]''. Kluwer Academic Publishers (see sections 14.4.1 and 14.2.3 on pages 359–360 and 355–356 for detailed statements about how the multi-scale second-moment matrix/structure tensor defines a true and uniquely determined multi-scale representation of directional data).
</ref> The formal basis for this property is described in more detail below, where it is shown that a multi-scale formulation of the structure tensor, referred to as the [[Structure tensor#The multi-scale structure tensor|multi-scale structure tensor]], constitutes a ''true multi-scale representation of directional data under variations of the spatial extent of the window function''.

==The 3D structure tensor==

===Definition===
The structure tensor can also be defined for a function <math>I</math> of three variables ''p''=(''x'',''y'',''z'') in an entirely analogous way. Namely, in the continuous version we have <math>S_w(p) = \int w(r) S_0(p-r)\,d r</math>, where
:<math>
S_0(p) =
\begin{bmatrix}
(I_x(p))^2 & I_x(p)I_y(p) & I_x(p)I_z(p) \\[10pt]
I_x(p)I_y(p) & (I_y(p))^2 & I_y(p)I_z(p) \\[10pt]
I_x(p)I_z(p) & I_y(p)I_z(p) & (I_z(p))^2
\end{bmatrix}
</math>
where <math>I_x,I_y,I_z</math> are the three partial derivatives of <math>I</math>, and the integral ranges over <math>\mathbb{R}^3</math>.

In the discrete version, <math>S_w[p]=\sum_r w[r] S_0[p-r]</math>, where
:<math>
S_0[p] =
\begin{bmatrix}
(I_x[p])^2 & I_x[p]I_y[p] & I_x[p]I_z[p] \\[10pt]
I_x[p]I_y[p] & (I_y[p])^2 & I_y[p]I_z[p] \\[10pt]
I_x[p]I_z[p] & I_y[p]I_z[p] & (I_z[p])^2
\end{bmatrix}
</math>
and the sum ranges over a finite set of 3D indices, usually <math>\{-m..+m\}\times\{-m..+m\}\times\{-m..+m\}</math> for some ''m''.

===Interpretation===
As in the two-dimensional case, the eigenvalues <math>\lambda_1,\lambda_2,\lambda_3</math> of <math>S_w[p]</math>, and the corresponding eigenvectors <math>e_1,e_2,e_3</math>, summarize the distribution of gradient directions within the neighborhood of ''p'' defined by the window <math>w</math>. This information can be visualized as an [[ellipsoid]] whose semi-axes are equal to the eigenvalues and directed along their corresponding eigenvectors.<ref name="Medioni"/>

[[Image:STgeneric.png|thumb|center|240px|Ellipsoidal representation of the 3D structure tensor.]]

In particular, if the ellipsoid is stretched along one axis only, like a cigar (that is, if <math>\lambda_1</math> is much larger than both <math>\lambda_2</math> and <math>\lambda_3</math>), it means that the gradient in the window is predominantly aligned with the direction <math>e_1</math>, so that the [[isosurface]]s of <math>I</math> tend to be flat and perpendicular to that vector. This situation occurs, for instance, when ''p'' lies on a thin plate-like feature, or on the smooth boundary between two regions with contrasting values.

<center>
<table cellborder=0px border=0px>
<tr valign=top>
<td>[[Image:STsurfel.png|thumb|180px|The structure tensor ellipsoid of a surface-like neighborhood ("[[surfel]]"), where <math>\lambda_1 >\!> \lambda_2 \approx \lambda_3</math>.]]</td>
<td>[[Image:StepPlane3D.png|thumb|180px|A 3D window straddling a smooth boundary surface between two uniform regions of a 3D image.]]</td>
<td>[[Image:StepPlane3DST.png|thumb|180px|The corresponding structure tensor ellipsoid.]]</td>
</tr>
</table>
</center>

If the ellipsoid is flattened in one direction only, like a pancake (that is, if <math>\lambda_3</math> is much smaller than both <math>\lambda_1</math> and <math>\lambda_2</math>), it means that the gradient directions are spread out but perpendicular to <math>e_3</math>, so that the isosurfaces tend to be like tubes parallel to that vector. This situation occurs, for instance, when ''p'' lies on a thin line-like feature, or on a sharp corner of the boundary between two regions with contrasting values.

<center>
<table cellborder=0px border=0px>
<tr valign=top>
<td>[[Image:STcurvel.png|thumb|180px|The structure tensor of a line-like neighborhood ("curvel"), where <math>\lambda_1 \approx \lambda_2 >\!> \lambda_3</math>.]]</td>
<td>[[Image:curve3D.png|thumb|180px|A 3D window straddling a line-like feature of a 3D image.]]</td>
<td>[[Image:curve3DST.png|thumb|180px|The corresponding structure tensor ellipsoid.]]</td>
</tr>
</table>
</center>

Finally, if the ellipsoid is roughly spherical (that is, if <math>\lambda_1\approx\lambda_2\approx\lambda_3</math>), it means that the gradient directions in the window are more or less evenly distributed, with no marked preference, so that the function <math>I</math> is mostly isotropic in that neighborhood. This happens, for instance, when the function has [[spherical symmetry]] in the neighborhood of ''p''. In particular, if the ellipsoid degenerates to a point (that is, if the three eigenvalues are zero), it means that <math>I</math> is constant (has zero gradient) within the window.

<center>
<table cellborder=0px border=0px>
<tr valign=top>
<td>[[Image:STball.png|thumb|180px|The structure tensor in an isotropic neighborhood, where <math>\lambda_1\approx\lambda_2\approx\lambda_3</math>.]]</td>
<td>[[Image:Sphere3D.png|thumb|180px|A 3D window containing a spherical feature of a 3D image.]]</td>
<td>[[Image:Sphere3DST.png|thumb|180px|The corresponding structure tensor ellipsoid.]]</td>
</tr>
</table>
</center>
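
In practice, the three regimes described above (surface-like, line-like, isotropic) are often distinguished by thresholding ratios of the eigenvalues. The following sketch is only illustrative; the function name, the ratio threshold and the zero tolerance are arbitrary choices, not canonical values:

```python
import numpy as np

def classify_neighborhood(S, ratio=10.0):
    """Classify a 3x3 structure tensor by its eigenvalue pattern.

    Returns 'surface' when l1 >> l2 ~ l3 (cigar-shaped ellipsoid),
    'line' when l1 ~ l2 >> l3 (pancake-shaped), 'isotropic' when all
    three eigenvalues are comparable, and 'constant' when the tensor
    vanishes.  The ratio threshold is an illustrative choice.
    """
    lam = np.sort(np.linalg.eigvalsh(S))[::-1]   # l1 >= l2 >= l3 >= 0
    l1, l2, l3 = lam
    if l1 <= 1e-12:
        return 'constant'
    if l1 > ratio * l2:
        return 'surface'
    if l2 > ratio * l3:
        return 'line'
    return 'isotropic'
```

Since the tensor is symmetric and positive semi-definite, `eigvalsh` is the appropriate (and numerically stable) eigensolver here.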

==The multi-scale structure tensor==
The structure tensor is an important tool in [[scale space]] analysis. The '''multi-scale structure tensor''' (or '''multi-scale second moment matrix''') of a function <math>I</math> is, in contrast to other one-parameter scale-space features, an image descriptor that is defined over ''two'' scale parameters.
One scale parameter, referred to as ''local scale'' <math>t</math>, is needed for determining the amount of pre-smoothing when computing the image gradient <math>(\nabla I)(x; t)</math>. Another scale parameter, referred to as ''integration scale'' <math>s</math>, is needed for specifying the spatial extent of the window function <math>w(\xi; s)</math> that determines the weights for the region in space over which the components of the outer product of the gradient with itself, <math>(\nabla I)(\nabla I)^T</math>, are accumulated.

More precisely, suppose that <math>I</math> is a real-valued signal defined over <math>\mathbb{R}^k</math>. For any local scale <math>t > 0</math>, let a multi-scale representation <math>I(x; t)</math> of this signal be given by <math>I(x; t) = h(x; t)*I(x)</math>, where <math>h(x; t)</math> represents a pre-smoothing kernel. Furthermore, let <math>(\nabla I)(x; t)</math> denote the gradient of the [[scale space representation]].
Then, the ''multi-scale structure tensor/second-moment matrix'' is defined by<ref name=lin94book/><ref name=lingar97>{{cite journal
| author=T. Lindeberg and J. Garding
| title=Shape-adapted smoothing in estimation of 3-D depth cues from affine distortions of local 2-D structure
| journal=Image and Vision Computing
| year=1997
| volume=15
| issue=6
| pages=415–434
| url=http://www.nada.kth.se/~tony/abstracts/LG94-ECCV.html
| doi=10.1016/S0262-8856(97)01144-X
}}</ref><ref name=garlin96>
J. Garding and T. Lindeberg (1996), ''[http://www.nada.kth.se/cvap/abstracts/cvap117.html Direct computation of shape cues using scale-adapted spatial derivative operators]'', International Journal of Computer Vision, volume 17, issue 2, pages 163–191.
</ref>
:<math>
\mu(x; t, s) =
\int_{\xi \in \mathbb{R}^k}
(\nabla I)(x-\xi; t) \, (\nabla I)^T(x-\xi; t) \,
w(\xi; s) \, d\xi
</math>
Conceptually, one may ask whether it would be sufficient to use any self-similar family of smoothing functions <math>h(x; t)</math> and <math>w(\xi; s)</math>. If one were, however, to naively apply, for example, a box filter, undesirable artifacts could easily occur. If one wants the multi-scale structure tensor to be well-behaved over both increasing local scales <math>t</math> and increasing integration scales <math>s</math>, then it can be shown that both the smoothing function and the window function ''have to'' be Gaussian.<ref name=lin94book/> The conditions that specify this uniqueness are similar to the [[scale-space axioms]] that are used for deriving the uniqueness of the Gaussian kernel for a regular Gaussian [[scale space]] of image intensities.

There are different ways of handling the two-parameter scale variations in this family of image descriptors. If we keep the local scale parameter <math>t</math> fixed and apply increasingly broadened versions of the window function by increasing the integration scale parameter <math>s</math> only, then we obtain a ''true formal [[scale space representation]] of the directional data computed at the given local scale'' <math>t</math>.<ref name=lin94book/> If we couple the local scale and integration scale by a ''relative integration scale'' <math>r \geq 1</math>, such that <math>s = r t</math>, then for any fixed value of <math>r</math> we obtain a reduced self-similar one-parameter variation, which is frequently used to simplify computational algorithms, for example in [[corner detection]], [[interest point detection]], [[texture analysis]] and [[image registration|image matching]].
By varying the relative integration scale <math>r \geq 1</math> in such a self-similar scale variation, we obtain another alternative way of parameterizing the multi-scale nature of directional data obtained by increasing the integration scale.

A conceptually similar construction can be performed for discrete signals, with the convolution integral replaced by a convolution sum and with the continuous Gaussian kernel <math>g(x; t)</math> replaced by the [[discrete Gaussian kernel]] <math>T(n; t)</math>:
:<math>
\mu(x; t, s) =
\sum_{n \in \mathbb{Z}^k}
(\nabla I)(x-n; t) \, (\nabla I)^T(x-n; t) \,
w(n; s)
</math>
When quantizing the scale parameters <math>t</math> and <math>s</math> in an actual implementation, a finite geometric progression <math>\alpha^i</math> is usually used, with ''i'' ranging from 0 to some maximum scale index ''m''. Thus, the discrete scale levels will bear certain similarities to an [[pyramid (image processing)|image pyramid]], although spatial subsampling may not necessarily be used, in order to preserve more accurate data for subsequent processing stages.
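
A rough NumPy sketch of this two-parameter computation is shown below; the sampled and renormalized Gaussian is used here only as a simple stand-in for the discrete Gaussian kernel <math>T(n; t)</math>, and the kernel truncation and boundary handling are implementation choices:

```python
import numpy as np

def gauss_kernel(t, truncate=4.0):
    """Sampled, normalized 1D Gaussian with variance t (a simple
    approximation to the discrete Gaussian kernel T(n; t))."""
    m = int(np.ceil(truncate * np.sqrt(t))) + 1
    n = np.arange(-m, m + 1)
    g = np.exp(-n ** 2 / (2.0 * t))
    return g / g.sum()

def smooth_2d(A, t):
    """Separable Gaussian smoothing of a 2D array, reflect boundary."""
    g = gauss_kernel(t)
    m = len(g) // 2
    for axis in (0, 1):
        pad = [(m, m) if ax == axis else (0, 0) for ax in (0, 1)]
        Ap = np.pad(A, pad, mode='reflect')
        A = sum(g[k] * np.take(Ap, np.arange(A.shape[axis]) + k, axis=axis)
                for k in range(2 * m + 1))
    return A

def multiscale_structure_tensor(I, t, s):
    """mu(x; t, s): pre-smooth at local scale t, differentiate,
    then integrate the outer-product components at scale s."""
    L = smooth_2d(np.asarray(I, dtype=float), t)   # scale-space representation
    Ly, Lx = np.gradient(L)                        # gradient at local scale t
    return (smooth_2d(Lx * Lx, s),                 # window at integration scale s
            smooth_2d(Lx * Ly, s),
            smooth_2d(Ly * Ly, s))
```

Scale levels for <math>t</math> and <math>s</math> would then typically be drawn from a geometric progression such as 1, 2, 4, 8, as described above.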

==Applications==
The eigenvalues of the structure tensor play a significant role in many image processing algorithms, for problems like [[corner detection]], [[interest point detection]], and [[feature tracking]].<ref name="Medioni">
{{cite conference|author=M. Nicolescu and G. Medioni|title=Motion Segmentation with Accurate Boundaries — A Tensor Voting Approach|booktitle=Proc. IEEE Computer Vision and Pattern Recognition|volume=1|pages=382–389|year=2003}}
</ref><ref>
{{cite journal|author=W. Förstner|title=A Feature Based Correspondence Algorithm for Image Processing|journal=International Archives of Photogrammetry and Remote Sensing|volume=26|pages=150–166|year=1986}}
</ref><ref>
{{cite conference|author=C. Harris and M. Stephens|title=A Combined Corner and Edge Detector|booktitle=Proc. of the 4th ALVEY Vision Conference|pages=147–151|year=1988}}
</ref><ref>
{{cite journal|author=K. Rohr|title=On 3D Differential Operators for Detecting Point Landmarks|journal=Image and Vision Computing|volume=15|issue=3|pages=219–233|year=1997}}
</ref><ref>
{{cite conference|author=I. Laptev and T. Lindeberg|title=Space-time interest points|booktitle=International Conference on Computer Vision ICCV'03|url=ftp://ftp.nada.kth.se/CVAP/reports/LapLin03-ICCV.pdf|doi=10.1109/ICCV.2003.1238378|pages=432–439|volume=I|year=2003}}
</ref><ref>
{{cite conference|author=B. Triggs|title=Detecting Keypoints with Stable Position, Orientation, and Scale under Illumination Changes|booktitle=Proc. European Conference on Computer Vision|volume=4|pages=100–113|year=2004}}
</ref><ref>
{{cite conference|author=C. Kenney, M. Zuliani and B. Manjunath|title=An Axiomatic Approach to Corner Detection|booktitle=Proc. IEEE Computer Vision and Pattern Recognition|pages=191–197|year=2005}}
</ref> The structure tensor also plays a central role in the [[Lucas–Kanade Optical Flow Method|Lucas–Kanade optical flow algorithm]] and in its extensions to estimate [[affine shape adaptation]],<ref name=lingar97/> where the magnitude of <math>\lambda_2</math> is an indicator of the reliability of the computed result. The tensor has also been used for [[scale space]] analysis,<ref name=lin94book/> estimation of local surface orientation from monocular or binocular cues,<ref name=garlin96/> non-linear [[fingerprint enhancement]],<ref>
A. Almansa and T. Lindeberg (2000), ''[http://www.nada.kth.se/cvap/abstracts/cvap226.html Enhancement of fingerprint images using shape-adapted scale-space operators]''. IEEE Transactions on Image Processing, volume 9, number 12, pages 2027–2042.
</ref> [[diffusion-based image processing]],<ref>[http://www.mia.uni-saarland.de/weickert/book.html J. Weickert (1998), Anisotropic diffusion in image processing, Teubner Verlag, Stuttgart.]</ref><ref>
{{cite journal|author=D. Tschumperle and R. Deriche|title=Diffusion PDE's on Vector-Valued Images|journal=IEEE Signal Processing Magazine|pages=16–25|date=September 2002}}
</ref><ref>
{{cite conference|author=S. Arseneau and J. Cooperstock|title=An Asymmetrical Diffusion Framework for Junction Analysis|booktitle=British Machine Vision Conference|volume=2|pages=689–698|date=September 2006}}
</ref><ref>
{{cite conference|author=S. Arseneau and J. Cooperstock|title=An Improved Representation of Junctions through Asymmetric Tensor Diffusion|booktitle=International Symposium on Visual Computing|date=November 2006}}
</ref> and several other image processing problems.

===Processing spatio-temporal video data with the structure tensor===
The three-dimensional structure tensor has been used to analyze three-dimensional video data (viewed as a function of ''x'', ''y'', and time ''t'').<ref name="Jahne1993" />
If one in this context aims at image descriptors that are ''invariant'' under Galilean transformations, to make it possible to compare image measurements that have been obtained under variations of a priori unknown image velocities <math>v = (v_x, v_y)^T</math>
:<math> \begin{bmatrix} x' \\ y' \\ t' \end{bmatrix} = G \begin{bmatrix} x \\ y \\ t \end{bmatrix} = \begin{bmatrix} x - v_x \, t \\ y - v_y \, t \\ t \end{bmatrix}, </math>
it is, however, from a computational viewpoint preferable to parameterize the components in the structure tensor/second-moment matrix <math>S</math> using the notion of ''Galilean diagonalization''<ref name=lin04icpr>
{{cite conference|author=T. Lindeberg, A. Akbarzadeh and I. Laptev|title=Galilean-corrected spatio-temporal interest operators|booktitle=International Conference on Pattern Recognition ICPR'04|url=ftp://ftp.nada.kth.se/CVAP/reports/LinAkhLap04-ICPR.pdf|doi=10.1109/ICPR.2004.1334004|date=August 2004|volume=I|pages=57–62}}
</ref>
:<math> S' = R_{space}^{-T} \, G^{-T} \, S \, G^{-1} \, R_{space}^{-1} = \begin{bmatrix} \nu_1 & & \\ & \nu_2 & \\ & & \nu_3 \end{bmatrix} </math>
where <math>G</math> denotes a Galilean transformation of space-time and <math>R_{space}</math> a two-dimensional rotation over the spatial domain, compared to the above-mentioned use of eigenvalues of a 3-D structure tensor, which corresponds to an eigenvalue decomposition and a (non-physical) three-dimensional rotation of space-time
:<math> S'' = R_{space-time}^{-T} \, S \, R_{space-time}^{-1} = \begin{bmatrix} \lambda_1 & & \\ & \lambda_2 & \\ & & \lambda_3 \end{bmatrix}. </math>
To obtain true Galilean invariance, however, the shape of the spatio-temporal window function also needs to be adapted,<ref name=lin04icpr/><ref>
{{cite conference|author=I. Laptev and T. Lindeberg|title=Velocity adaptation of space-time interest points|booktitle=International Conference on Pattern Recognition ICPR'04|url=http://www.csc.kth.se/cvap/abstracts/LapLin04-ICPR.html|doi=10.1109/ICPR.2004.971|date=August 2004|volume=I|pages=52–56}}
</ref> corresponding to the transfer of [[affine shape adaptation]]<ref name=lingar97/> from spatial to spatio-temporal image data.
In combination with local spatio-temporal histogram descriptors,<ref>
{{cite conference|author=I. Laptev and T. Lindeberg|title=Local descriptors for spatio-temporal recognition|booktitle=ECCV'04 Workshop on Spatial Coherence for Visual Motion Analysis (Prague, Czech Republic), Springer Lecture Notes in Computer Science|url=http://www.csc.kth.se/cvap/abstracts/LapLin04-ECCVWS.html|doi=10.1007/11676959|date=May 2004|volume=3667|pages=91–103}}
</ref> these concepts together allow for Galilean invariant recognition of spatio-temporal events.<ref>
{{cite conference|author=I. Laptev, B. Caputo, C. Schuldt and T. Lindeberg|title=Local velocity-adapted motion events for spatio-temporal recognition|booktitle=Computer Vision and Image Understanding|url=http://www.csc.kth.se/cvap/abstracts/LapCapSchLin07-CVIU.html|doi=10.1016/j.cviu.2006.11.023|year=2007|volume=108|pages=207–229}}</ref>

==See also==
*[[Tensor]]
*[[Directional derivative]]
*[[Gaussian]]
*[[Corner detection]]
*[[Edge detection]]
*[[Lucas Kanade method|Lucas–Kanade method]]
*[[Affine shape adaptation]]

==References==
<references/>

==Resources==
*[http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=12362&objectType=FILE Download MATLAB Source]
*[http://www.cs.cmu.edu/~sarsen/structureTensorTutorial/ Structure Tensor Tutorial (Original)]

{{DEFAULTSORT:Structure Tensor}}
[[Category:Tensors]]
[[Category:Feature detection]]