Hey there! You've probably encountered matrices if you've ever dabbled in mathematics or computer science. You'd know that we can add, subtract, and even multiply these wonderful rectangular arrays of numbers together. Matrix multiplication, in particular, has a wide range of applications, from computer graphics to machine learning. But did you know there are more efficient ways to multiply matrices than the method we usually learn?

That's where Strassen's algorithm comes in. Invented by Volker Strassen in 1969, this algorithm revolutionized how we think about matrix multiplication by showing us a faster way to do it. If you're like me, you might wonder: "How does it achieve this?" Well, that's what we're going to explore in this article.

Before diving into the deep end with Strassen's algorithm, let's dip our toes in the shallow end first: understanding basic matrix multiplication and its constraints.

Matrix Multiplication 101

Imagine you have two groups of people. The first group is handing out apples, and the second is counting them. Matrices are like that. In matrix multiplication, we distribute and count things based on specific rules. Each entry of the result comes from pairing a row of one matrix with a column of the other, multiplying corresponding numbers, and adding up the products. But, unlike in basic multiplication, the order matters. Multiplying matrix A by matrix B is not the same as multiplying matrix B by matrix A. If you've ever tried to put on your right shoe before your left shoe while standing up, you'll understand why order matters!

But, just like you can't start handing out apples before they're ripe, there are a few rules we have to follow before we can multiply matrices. The most important rule is that the number of columns in the first matrix must be equal to the number of rows in the second matrix. In other words, if we have a matrix A of dimensions $m \times n$ and another matrix B of dimensions $p \times q$, we can only multiply A and B if $n = p$, and the result has dimensions $m \times q$. For example, a $2 \times 3$ matrix times a $3 \times 4$ matrix gives a $2 \times 4$ result, but a $2 \times 3$ times another $2 \times 3$ is undefined. If this condition isn't met, it's like trying to fit a square peg in a round hole: it just won't work.

Let's consider an elementary example. Suppose we have two 2x2 matrices (matrices with 2 rows and 2 columns).

Let's call them A and B:

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$$

To multiply these matrices, we take the first row of the first matrix and multiply it element by element with the first column of the second matrix, and then add up the results. So, the first element of the resulting matrix would be $(1 \cdot 5) + (2 \cdot 7) = 5 + 14 = 19$.

We do this for each row of the first matrix and each column of the second matrix. If you follow these steps, the calculations will yield the resulting matrix C:

$$C = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$
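To make the row-times-column procedure concrete, here's a minimal sketch in Python. The helper name naive_matmul is my own invention for illustration; the logic is simply the three nested loops described above.

```python
def naive_matmul(A, B):
    """Textbook matrix multiplication: row i of A paired with column j of B."""
    rows_a, rows_b, cols_b = len(A), len(B), len(B[0])
    assert len(A[0]) == rows_b, "columns of A must equal rows of B"
    C = [[0] * cols_b for _ in range(rows_a)]
    for i in range(rows_a):
        for j in range(cols_b):
            for k in range(rows_b):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(naive_matmul(A, B))  # [[19, 22], [43, 50]]
```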

However, as the size of the matrices increases, the computation becomes more complex and time-consuming. This is because for two $n \times n$ matrices, we're doing $n^3$ multiplications (and nearly as many additions)! If you've ever tried to count the stars in the sky, you'll understand that this can take a while. And that's where our friend, Strassen's algorithm, comes in to speed things up!

Introducing Mr. Strassen 🧐

When it comes to matrix multiplication, you might be thinking, "Well, that's a lot of multiplication and addition. There must be a faster way." And you'd be right! That's exactly what Volker Strassen thought back in 1969 when he proposed his groundbreaking algorithm.

Strassen's algorithm is like a clever shortcut. Instead of doing the standard $n^3$ operations, it does just about $n^{\log_2 7}$, which is roughly $n^{2.81}$. Now, that might seem like a bunch of mathematical gibberish, but it means that the algorithm is significantly faster, especially for large matrices. It's like taking a high-speed train instead of a horse-drawn carriage!
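To get a feel for the gap, here's a quick back-of-the-envelope script (my own, not from Strassen's paper) counting the scalar multiplications each approach performs when $n = 2^k$: the standard recursive method does $8^k = n^3$ of them, while Strassen's does $7^k = n^{\log_2 7}$.

```python
for k in (4, 8, 10):   # n = 16, 256, 1024
    n = 2 ** k
    standard = 8 ** k  # eight recursive multiplications per level
    strassen = 7 ** k  # seven recursive multiplications per level
    print(f"n={n:5d}  standard={standard:>14,}  strassen={strassen:>14,}  "
          f"speedup={standard / strassen:.2f}x")
```

For $n = 1024$ that's roughly 1.07 billion multiplications versus about 282 million, almost a 4x reduction.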

But how does it do this? Well, Strassen's algorithm uses a strategy called "divide and conquer." Here's how it works:

Step 1

First, we divide each matrix into four equal parts. (For this to work cleanly, the matrices should be square with an even number of rows and columns; in practice, odd-sized matrices are padded with zeros.) For our example, let's imagine we're working with 4x4 matrices. These matrices will be divided into four 2x2 sub-matrices.

After dividing the matrices, we'll have four blocks for each matrix, which we'll denote as A11, A12, A21, A22 for matrix A, and B11, B12, B21, B22 for matrix B. Each of these blocks represents one of the 2x2 sub-matrices.

A11 is the block sitting in A's first block-row and first block-column, in other words its top-left sub-matrix. Similarly, B11 is B's top-left sub-matrix, and so on.
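Here's one way the splitting step might look in Python, continuing with plain nested lists. The helper split_into_quadrants is just an illustration under the even-size assumption above, not anything prescribed by the algorithm itself.

```python
def split_into_quadrants(M):
    """Split an even-sized square matrix into its four quadrant blocks."""
    half = len(M) // 2
    top_left = [row[:half] for row in M[:half]]
    top_right = [row[half:] for row in M[:half]]
    bottom_left = [row[:half] for row in M[half:]]
    bottom_right = [row[half:] for row in M[half:]]
    return top_left, top_right, bottom_left, bottom_right

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
A11, A12, A21, A22 = split_into_quadrants(A)
print(A11)  # [[1, 2], [5, 6]] -- the top-left 2x2 block
```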

Step 2

Then, we perform seven multiplications and several additions and subtractions to compute the product. This is different from the standard divide-and-conquer approach, which performs eight block multiplications. One fewer multiplication might not seem like much, but remember, we're dealing with matrices, not single numbers, and because the trick is applied recursively, the saving compounds at every level. So, this one fewer multiplication can save a lot of time!
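To spell out why the saving compounds, here is the standard recurrence argument (a well-known analysis I'm summarizing here, not text from Strassen's paper). Multiplying two $n \times n$ matrices costs seven half-sized multiplications plus some block additions:

$$T(n) = 7\,T(n/2) + O(n^2) \;\implies\; T(n) = O\!\left(n^{\log_2 7}\right) \approx O\!\left(n^{2.81}\right)$$

whereas the eight-multiplication block method satisfies $T(n) = 8\,T(n/2) + O(n^2) = O(n^3)$.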

Now, we're ready to perform the seven multiplications, which are the heart of Strassen's algorithm. We'll define seven products, M1 through M7, each formed by adding or subtracting certain blocks and then multiplying the results. We'll go through these one by one:

  • $M1 = (A11 + A22) \cdot (B11 + B22)$
  • $M2 = (A21 + A22) \cdot B11$
  • $M3 = A11 \cdot (B12 - B22)$
  • $M4 = A22 \cdot (B21 - B11)$
  • $M5 = (A11 + A12) \cdot B22$
  • $M6 = (A21 - A11) \cdot (B11 + B12)$
  • $M7 = (A12 - A22) \cdot (B21 + B22)$

If you're wondering how Strassen came up with these, don't worry! It's the result of some very clever mathematical insights that are beyond our scope today. Just know that these seven products are carefully chosen so that the four blocks of the result C can be assembled using only additions and subtractions:

  • $C11 = M1 + M4 - M5 + M7$
  • $C12 = M3 + M5$
  • $C21 = M2 + M4$
  • $C22 = M1 - M2 + M3 + M6$

Seven multiplications instead of eight, and the product still comes out exactly right.
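Putting all the pieces together, here's a compact recursive sketch in Python. It assumes square matrices whose size is a power of two (real implementations pad the input and switch to the naive method below some cutoff size); the helper names are mine, not part of the original article.

```python
def madd(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def msub(X, Y):
    """Entry-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    """Multiply square matrices whose size is a power of two."""
    n = len(A)
    if n == 1:  # base case: a single scalar multiplication
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split both matrices into quadrant blocks.
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # The seven Strassen products.
    M1 = strassen(madd(A11, A22), madd(B11, B22))
    M2 = strassen(madd(A21, A22), B11)
    M3 = strassen(A11, msub(B12, B22))
    M4 = strassen(A22, msub(B21, B11))
    M5 = strassen(madd(A11, A12), B22)
    M6 = strassen(msub(A21, A11), madd(B11, B12))
    M7 = strassen(msub(A12, A22), madd(B21, B22))
    # Reassemble the four blocks of the product.
    C11 = madd(msub(madd(M1, M4), M5), M7)
    C12 = madd(M3, M5)
    C21 = madd(M2, M4)
    C22 = madd(msub(madd(M1, M3), M2), M6)
    # Stitch the quadrants back into a single matrix.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```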

And here's the magical part: each of these operations can be performed independently! This means that if we're using a computer, we can calculate each of these products in parallel, which can significantly speed up the process.
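As a rough sketch of that independence (purely illustrative; at this tiny size the overhead of spawning workers would swamp any gain), the seven products could be handed to a process pool:

```python
from concurrent.futures import ProcessPoolExecutor

def matmul(pair):
    """Multiply one (left, right) pair of blocks with the naive method."""
    X, Y = pair
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

if __name__ == "__main__":
    # Operand pairs for M1..M7, using the 1x1 blocks of our 2x2 example.
    pairs = [
        ([[1 + 4]], [[5 + 8]]),  # M1 = (A11 + A22)(B11 + B22)
        ([[3 + 4]], [[5]]),      # M2 = (A21 + A22) B11
        ([[1]], [[6 - 8]]),      # M3 = A11 (B12 - B22)
        ([[4]], [[7 - 5]]),      # M4 = A22 (B21 - B11)
        ([[1 + 2]], [[8]]),      # M5 = (A11 + A12) B22
        ([[3 - 1]], [[5 + 6]]),  # M6 = (A21 - A11)(B11 + B12)
        ([[2 - 4]], [[7 + 8]]),  # M7 = (A12 - A22)(B21 + B22)
    ]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(matmul, pairs))
    print(results)  # [[[65]], [[35]], [[-2]], [[8]], [[24]], [[22]], [[-30]]]
```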

Well, now what?

You can navigate to more of my writing from here. Connect with me on LinkedIn for a chat.