Notes to self

Eigenvectors are vectors that only scale (i.e., change in magnitude, not direction, apart from a possible flip in sign when the scaling factor is negative) when a given linear transformation (represented by a matrix) is applied to them. The scaling factor by which an eigenvector is multiplied under the transformation is called an eigenvalue.

Given a square matrix $A$, any vector $v$ is considered an eigenvector of $A$ if $v$ is not the zero vector and there is some scalar $\lambda$ such that applying $A$ to $v$ results in a scalar multiple of $v$, i.e., the direction of $v$ remains unchanged. In equation form, this is written as $A \cdot v = \lambda \cdot v$, where "$\cdot$" denotes the multiplication operation (either matrix multiplication or scalar multiplication, depending on context).

$\lambda$ is the eigenvalue corresponding to the eigenvector $v$ in the above equation. It represents the scalar multiple by which the eigenvector is stretched or compressed (if you can't recall linear transformations, you can refer to Khan Academy's Matrix Transformations lecture for a refresher).

To find the eigenvalues of a matrix $A$, we follow two steps. First we set up the characteristic equation, and then we solve for $\lambda$:

  • ☝️ Characteristic Equation: You set up the equation $\det(A - \lambda \cdot I) = 0$, where $\det$ represents the determinant of a matrix, and $I$ is the identity matrix of the same size as $A$. This equation is derived from the eigenvector equation $A \cdot v = \lambda \cdot v$: rewriting it as $(A - \lambda \cdot I) \cdot v = 0$, a non-zero $v$ can satisfy it only if $A - \lambda \cdot I$ is singular, which means its determinant must be zero.
  • ✌️ Solve for $\lambda$: Solving the characteristic equation will give you the eigenvalues $\lambda$ of the matrix $A$. (Both steps are sketched in code right after this list.)
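Here's what those two steps look like programmatically (a minimal sketch, assuming numpy; `np.poly` returns the coefficients of a square matrix's characteristic polynomial, and `np.roots` solves it):

```python
import numpy as np

# The 2x2 example matrix used throughout these notes.
A = np.array([[4, 1],
              [2, 3]])

# Step 1: coefficients of the characteristic polynomial det(A - λI).
# For this A: [1, -7, 10], i.e., λ² - 7λ + 10.
coeffs = np.poly(A)

# Step 2: the roots of that polynomial are the eigenvalues.
eigenvalues = np.roots(coeffs)
print(eigenvalues)  # approximately [5. 2.]
```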

Once the eigenvalues $\lambda$ are known, the eigenvectors can be found by:

  • 👉 Substitution: For each eigenvalue $\lambda$, you substitute $\lambda$ back into the equation $A \cdot v = \lambda \cdot v$ (which can be rewritten as $(A - \lambda \cdot I) \cdot v = 0$) and solve for $v$.
  • 👉 Solving the System: Typically, you'll get a system of linear equations for $v$, which you'll need to solve. Any non-zero vector that satisfies the system of equations is considered an eigenvector corresponding to the eigenvalue $\lambda$. (A programmatic version is sketched right after this list.)
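In code, "solving the system" amounts to finding the null space of $A - \lambda \cdot I$. A minimal sketch, assuming numpy (the helper name `eigenvector_for` is mine, and I'm using the SVD to extract the null space):

```python
import numpy as np

def eigenvector_for(A, lam, tol=1e-8):
    """Return one non-zero solution v of (A - lam*I) @ v = 0."""
    M = A - lam * np.eye(A.shape[0])
    # Rows of Vt whose singular values are (near) zero form a basis
    # of the null space of M, i.e., the eigenvectors for lam.
    _, s, Vt = np.linalg.svd(M)
    return Vt[s < tol][0]

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
print(eigenvector_for(A, 5.0))  # a unit vector along [1, 1] (sign may vary)
print(eigenvector_for(A, 2.0))  # a unit vector along [1, -2] (sign may vary)
```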

Let's consider a 2x2 matrix $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$

  1. Characteristic Equation: First, we find the determinant of $A - \lambda \cdot I$:

$$
\det(A - \lambda \cdot I) = \det\left(\begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} - \lambda \cdot \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right) = \det\begin{bmatrix} 4-\lambda & 1 \\ 2 & 3-\lambda \end{bmatrix} = (4-\lambda)(3-\lambda) - (2)(1) = \lambda^2 - 7\lambda + 10
$$
  2. Solving for $\lambda$: We solve $\lambda^2 - 7\lambda + 10 = 0$ to find the eigenvalues. The solutions to this quadratic equation are the eigenvalues of $A$, which are $\lambda = 5$ and $\lambda = 2$.
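Explicitly, via the quadratic formula:

$$
\lambda = \frac{7 \pm \sqrt{7^2 - 4 \cdot 10}}{2} = \frac{7 \pm 3}{2} \implies \lambda = 5 \ \text{or} \ \lambda = 2
$$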

Now comes the real magic. We can find the eigenvectors by plugging each eigenvalue into the equation $(A - \lambda \cdot I) \cdot v = 0$ and solving for $v$. For $\lambda = 5$:

$$
\begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
$$

The system simplifies to $-v_1 + v_2 = 0$, so one eigenvector could be $v = [1, 1]$ for $\lambda = 5$. For $\lambda = 2$:

$$
\begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix} \cdot \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
$$

Similarly, this system simplifies to $2v_1 + v_2 = 0$, so one eigenvector could be $v = [1, -2]$ for $\lambda = 2$.

This process reveals the eigenvalues $\lambda = 5$ and $\lambda = 2$, with corresponding eigenvectors $[1, 1]$ and $[1, -2]$, respectively. Each eigenvector is associated with one eigenvalue, and these vectors indicate the "directions" in which the linear transformation represented by matrix $A$ acts by stretching/compressing, without rotating.
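As a quick sanity check (a minimal sketch, assuming numpy), we can confirm that $A \cdot v = \lambda \cdot v$ holds for both pairs:

```python
import numpy as np

A = np.array([[4, 1],
              [2, 3]])

# Each hand-derived (eigenvalue, eigenvector) pair should satisfy A·v = λ·v.
for lam, v in [(5, np.array([1, 1])), (2, np.array([1, -2]))]:
    print(A @ v, "should equal", lam * v)
# [5 5] should equal [5 5]
# [ 2 -4] should equal [ 2 -4]
```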

Using numpy to find the eigenvalues and eigenvectors is a one-liner with `np.linalg.eig`, which returns the eigenvalues and a matrix whose columns are the corresponding unit-length eigenvectors (a minimal sketch):
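```python
import numpy as np

A = np.array([[4, 1],
              [2, 3]])

eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
print("Eigenvectors (as columns):")
print(eigenvectors)
# Eigenvalues: [5. 2.]
# Eigenvectors (as columns):
# [[ 0.70710678 -0.4472136 ]
#  [ 0.70710678  0.89442719]]
```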

In this output, the eigenvalues are 5 and 2, which match the mathematical solution I calculated. The eigenvectors in numpy are normalized (i.e., scaled to a "unit length" of 1 in Euclidean space), so they may look different from the ones I calculated by hand, but they are indeed pointing in the same directions. The first eigenvector is approximately $[0.707, 0.707]$, which points in the same direction as $[1, 1]$, and the second eigenvector is approximately $[-0.447, 0.894]$, which points in the same direction as $[1, -2]$ (just flipped in sign). The direction is the critical property of an eigenvector, not the magnitude.

We can verify this by normalizing the vector ourselves, which involves dividing each component of the vector by its length. For example, suppose the vector is $[1, -2]$.

First, we calculate the magnitude $m$ (the Euclidean norm): $m = \sqrt{(1)^2 + (-2)^2} = \sqrt{1 + 4} = \sqrt{5}$. Then, we divide each component of the original vector by this magnitude: $\text{normalized vector} = \left[\frac{1}{\sqrt{5}}, \frac{-2}{\sqrt{5}}\right] \approx [0.447, -0.894]$, which is numpy's second eigenvector up to a sign flip.
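In code, the same two steps look like this (a minimal sketch using only the standard library):

```python
import math

v = [1, -2]

# Magnitude (Euclidean norm): square root of the sum of squared components.
m = math.sqrt(sum(c ** 2 for c in v))  # √5 ≈ 2.236

# Divide each component by the magnitude.
normalized = [c / m for c in v]
print(normalized)  # [0.447..., -0.894...]
```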

Or just use numpy's `np.linalg.norm`:
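```python
import numpy as np

v = np.array([1, -2])

# np.linalg.norm computes the Euclidean length; dividing yields a unit vector.
v_normalized = v / np.linalg.norm(v)
print(v_normalized)  # [ 0.4472136 -0.89442719]
```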

Both approaches give us the same normalized vector, approximately $[0.447, -0.894]$.

Dassit 👋

Well, now what?

You can navigate to more writings from here. Connect with me on LinkedIn for a chat.