Centering matrix


In mathematics and multivariate statistics, the centering matrix[1] is a symmetric and idempotent matrix which, when multiplied with a vector, has the same effect as subtracting the mean of the vector's components from every component.

Definition

The centering matrix of size n is defined as the n-by-n matrix

$$C_n = I_n - \frac{1}{n}\mathbb{O}$$

where $I_n$ is the identity matrix of size n and $\mathbb{O}$ is an n-by-n matrix of all ones. This can also be written as:

$$C_n = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$$

where $\mathbf{1}$ is the column vector of n ones and $^\top$ denotes matrix transpose.

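For example,

$$C_1 = \begin{bmatrix} 0 \end{bmatrix},$$

$$C_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix},$$

$$C_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \frac{1}{3}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix}.$$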
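The small cases above can be reproduced with a short sketch. It uses only the Python standard library and exact fractions; the helper name `centering_matrix` is an illustrative assumption, not something defined in the article.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) * (all-ones matrix) as a list of rows of exact Fractions."""
    return [
        [Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
        for i in range(n)
    ]

C1 = centering_matrix(1)
C2 = centering_matrix(2)
C3 = centering_matrix(3)

# Print C_3 row by row in readable fraction form.
for row in C3:
    print([str(entry) for entry in row])
```

Exact rational arithmetic is used here so the entries match the fractions in the worked examples without floating-point rounding.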

Properties

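Given a column vector $\mathbf{v}$ of size n, the centering property of $C_n$ can be expressed as

$$C_n \mathbf{v} = \mathbf{v} - \left(\frac{1}{n}\mathbf{1}^\top \mathbf{v}\right)\mathbf{1}$$

where $\frac{1}{n}\mathbf{1}^\top \mathbf{v}$ is the mean of the components of $\mathbf{v}$.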
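As a quick illustration of the centering property, the following sketch compares multiplication by the centering matrix with direct subtraction of the mean. The function names and the sample vector are hypothetical choices made for this example.

```python
from fractions import Fraction

def center_by_matrix(v):
    """Compute C_n v by building C_n explicitly and multiplying."""
    n = len(v)
    C = [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]
    return [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]

def center_directly(v):
    """Subtract the mean of the components from every component."""
    mean = Fraction(sum(v), len(v))
    return [x - mean for x in v]

v = [2, 4, 9]                      # sample vector with mean 5
centered = center_by_matrix(v)
print(centered == center_directly(v))
```

Both routes give the same vector, and its components sum to zero.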

$C_n$ is symmetric positive semi-definite.

$C_n$ is idempotent, so that $C_n^k = C_n$ for $k = 1, 2, \ldots$. Once the mean has been removed, it is zero and removing it again has no effect.

$C_n$ is singular. The effects of applying the transformation $C_n \mathbf{v}$ cannot be reversed.

$C_n$ has the eigenvalue 1 of multiplicity n − 1 and eigenvalue 0 of multiplicity 1.

$C_n$ has a nullspace of dimension 1, along the vector $\mathbf{1}$.

$C_n$ is a projection matrix. That is, $C_n \mathbf{v}$ is a projection of $\mathbf{v}$ onto the (n − 1)-dimensional subspace that is orthogonal to the nullspace spanned by $\mathbf{1}$. (This is the subspace of all n-vectors whose components sum to zero.)
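Several of these properties can be checked numerically for a particular n. The sketch below does so for n = 4 with exact fractions; the helpers `centering_matrix` and `matmul` are assumptions introduced for the example, and a check for one value of n is an illustration rather than a proof.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) * (all-ones matrix) with exact Fraction entries."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

n = 4
C = centering_matrix(n)

# Symmetry: C equals its transpose.
is_symmetric = all(C[i][j] == C[j][i] for i in range(n) for j in range(n))

# Idempotence: applying C twice is the same as applying it once.
is_idempotent = matmul(C, C) == C

# Nullspace: C times the all-ones vector is the row sums of C, which should vanish.
ones_image = [sum(row) for row in C]

# The trace equals the sum of the eigenvalues: (n - 1) ones and a single zero.
trace = sum(C[i][i] for i in range(n))

print(is_symmetric, is_idempotent, ones_image, trace)
```

A zero image of the all-ones vector also confirms that the matrix is singular.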

Application

Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it forms an analytical tool that conveniently and succinctly expresses mean removal. It can be used not only to remove the mean of a single vector, but also of multiple vectors stored in the rows or columns of a matrix. For an m-by-n matrix $X$, the multiplication $C_m X$ removes the means from each of the n columns, while $X C_n$ removes the means from each of the m rows.
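The difference between left and right multiplication can be seen on a small matrix. This is a minimal sketch with a made-up 2-by-3 matrix; the helper names are assumptions for illustration.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) * (all-ones matrix) with exact Fraction entries."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

X = [[1, 2, 3],
     [4, 6, 11]]          # m = 2 rows, n = 3 columns
m, n = len(X), len(X[0])

col_centered = matmul(centering_matrix(m), X)   # C_m X: each column now has mean zero
row_centered = matmul(X, centering_matrix(n))   # X C_n: each row now has mean zero

print(col_centered)
print(row_centered)
```

In practice one would subtract the column or row means directly; the matrix form is shown only because it mirrors the algebraic statement.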

The centering matrix provides in particular a succinct way to express the scatter matrix, $S = (X - \mu\mathbf{1}^\top)(X - \mu\mathbf{1}^\top)^\top$, of a data sample $X$, where $\mu = \frac{1}{n} X \mathbf{1}$ is the sample mean. The centering matrix allows us to express the scatter matrix more compactly as

$$S = X C_n (X C_n)^\top = X C_n C_n X^\top = X C_n X^\top.$$
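The two expressions for the scatter matrix can be compared on a small example. The sketch assumes the n observations are stored as the columns of X, consistent with the sample mean being X times the all-ones vector divided by n; the data and helper names are hypothetical.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) * (all-ones matrix) with exact Fraction entries."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

X = [[1, 2, 3],
     [4, 6, 11]]          # 2 variables (rows), n = 3 observations (columns)
n = len(X[0])

# Direct form: subtract the sample mean from every observation, then multiply.
mu = [Fraction(sum(row), n) for row in X]
D = [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]
S_direct = matmul(D, transpose(D))

# Compact form: S = X C_n X^T.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print(S_direct == S_compact)
```

The compact form needs only one centering multiplication because the centering matrix is symmetric and idempotent.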

$C_n$ is the covariance matrix of the multinomial distribution, in the special case where the parameters of that distribution are $k = n$ and $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$.
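This can be checked against the standard multinomial covariance, whose diagonal entries are k p_i (1 − p_i) and whose off-diagonal entries are −k p_i p_j. The sketch below assumes that formula and uses a hypothetical helper `multinomial_covariance`; it verifies the claim for n = 3 only.

```python
from fractions import Fraction

def multinomial_covariance(k, p):
    """Covariance of a multinomial with k trials and probabilities p: k * (diag(p) - p p^T)."""
    size = len(p)
    return [
        [k * (p[i] * int(i == j) - p[i] * p[j]) for j in range(size)]
        for i in range(size)
    ]

n = 3
p = [Fraction(1, n)] * n                  # equal probabilities 1/n
cov = multinomial_covariance(n, p)        # number of trials k = n

# Centering matrix C_n for comparison.
C = [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)] for i in range(n)]

print(cov == C)
```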

References

  1. John I. Marden, Analyzing and Modeling Rank Data, Chapman & Hall, 1995, ISBN 0-412-99521-2, page 59.