Chapter 25 Spectral Decompostion

Spectral decomposition (a.k.a., eigen decomposition) is used primarily in principal components analysis (PCA). This method decomposes a square matrix, A, into the product of three matrices:

\[ \underset{n\times n}{\mathbf{A}} = \underset{n\times n}{\mathbf{P}}~ \underset{n\times n}{\mathbf{D}}~ \underset{n\times n}{\mathbf{P}^{\intercal}} \]

where, P is a n-dimensional square matrix whose ith column is the ith eigenvector of A, and D is a n-dimensional diagonal matrix whose diagonal elements are composed of the eigenvalues of A. That is, the spectral decomposition is based on the eigenstructure of A.


25.0.1 An Example

Recall that in a previous chapter we used the following \(2 \times 2\) matrix as an example:

\[ \mathbf{A} = \begin{bmatrix} -3 & 5 \\ 4 & -2 \\ \end{bmatrix} \]

The eigenstructure for this matrix was:

\[ \begin{split} \lambda_1 &= -7 \qquad &\mathbf{e}_1 = \begin{bmatrix}\frac{5}{\sqrt{41}} \\ -\frac{4}{\sqrt{41}}\end{bmatrix}\\[2ex] \lambda_2 &= 2 \qquad &\mathbf{e}_2 = \begin{bmatrix}\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}\end{bmatrix} \\[2ex] \end{split}\]

The P and D matrices of the spectral decomposition are composed of the eigenvectors and eigenvalues, respectively.

\[ \begin{split} \mathbf{P} &= \begin{bmatrix}\frac{5}{\sqrt{41}} & \frac{1}{\sqrt{2}} \\ -\frac{4}{\sqrt{41}} & \frac{1}{\sqrt{2}}\end{bmatrix} \\[2ex] \mathbf{D} &= \begin{bmatrix}7 & 0 \\ 0 & -2\end{bmatrix} \end{split} \]

Recall also that the eigen() function provided the eigenvalues and eigenvectors for an inputted square matrix. The eigenvectors were outputted as columns in a matrix, so, the $vector output from the function is, in fact, outputting the matrix P. The eigen() function is actually carrying out the spectral decomposition! We can use this output to verify the decomposition by computing whether \(\mathbf{PDP}^{-1}=\mathbf{A}\).

# Create A
A = matrix(
  data = c(-3, 4, 5, -2), 
  nrow = 2
  )

# Compute eigenvalues and eigenvectors
spec_decomp = eigen(A)

# Create P
P = spec_decomp$vectors

# Create D
D = diag(spec_decomp$values)

# Verify the decomposition
P %*% D %*% solve(P)
     [,1] [,2]
[1,]   -3    5
[2,]    4   -2

Of note, when A is symmetric, then the P matrix will be orthogonal; \(\mathbf{P}^{-1}=\mathbf{P}^\intercal\). We can illustrate this by an example:

# Create symmetric matrix A
A = matrix(
  data = c(1, 4, 2, 4, 1, 3, 2, 3, 1), 
  nrow = 3
  )

# Compute eigenvalues and eigenvectors
spec_decomp = eigen(A)

# Create P
P = spec_decomp$vectors

# Inverse of P
solve(P)
        [,1]    [,2]    [,3]
[1,] -0.5844 -0.6346 -0.5058
[2,] -0.5449 -0.1550  0.8240
[3,]  0.6013 -0.7572  0.2552
# Transpose of P
t(P)
        [,1]    [,2]    [,3]
[1,] -0.5844 -0.6346 -0.5058
[2,] -0.5449 -0.1550  0.8240
[3,]  0.6013 -0.7572  0.2552

This is a useful property since it means that the inverse of P is easy to compute.


25.1 Solving Systems of Equations with Spectral Decomposition

We can use spectral decomposition to more easily solve systems of equations. For example, in OLS estimation, our goal is to solve the following for b.

\[ (\mathbf{X}^{\intercal}\mathbf{X})\mathbf{b} = \mathbf{X}^{\intercal}\mathbf{y} \]

Since \((\mathbf{X}^{\intercal}\mathbf{X})\) is a square, symmetric matrix, we can decompose it into \(\mathbf{PDP}^\intercal\). Thus,

\[ \mathbf{PDP}^{\intercal}\mathbf{b} = \mathbf{X}^{\intercal}\mathbf{y} \] Solving for b, we find:

\[ \begin{split} \big(\mathbf{PDP}^{\intercal}\big)^{-1}\mathbf{PDP}^{\intercal}\mathbf{b} &= \big(\mathbf{PDP}^{\intercal}\big)^{-1} \mathbf{X}^{\intercal}\mathbf{y} \\[2ex] \mathbf{b} &= (\mathbf{P}^\intercal)^{-1}\mathbf{D}^{-1}\mathbf{P}^{-1}\mathbf{X}^{\intercal}\mathbf{y} \\[2ex] &= \mathbf{P} \mathbf{D}^{-1}\mathbf{P}^\intercal\mathbf{X}^{\intercal}\mathbf{y} \end{split} \]

The orthogonal P matrix makes this computationally easier to solve. Moreover, since D is a diagonal matrix, \(\mathbf{D}^{-1}\) is also easy to compute. Namely, \(\mathbf{D}^{-1}\) is also diagonal with elements on the diagonal equal to \(\frac{1}{\lambda_i}\).

Consider our ongoing simulated data:

# Number of cases
n = 50

# Create 50 x-values evenly spread b/w 1 and 500 
x = seq(from = 1, to = 500, len = n)

# Create X matrix
X = cbind(1, x, x^2, x^3)

# Create b matrix
b = matrix(
  data = c(1, 1, 1, 1), 
  nrow = 4
  )

# Create vector of y-values
set.seed(1)
y = X %*% b + rnorm(n, mean = 0, sd = 1)

We start by using spectral decomposition to decompose \(\mathbf{X}^\intercal\mathbf{X}\).

# Spectral decomposition of (X^T)X
spec_decomp = eigen(t(X) %*% X)

# Create P and D^{-1}
P = spec_decomp$vectors
D_inv = diag(1/spec_decomp$values)

Now we can carry out the matrix algebra to compute b.

# Compute b
P %*% D_inv %*% t(P) %*% t(X) %*% y
       [,1]
[1,] 0.9278
[2,] 1.0059
[3,] 1.0000
[4,] 1.0000