# Strassen's Algorithm

Hey there! You've probably encountered matrices if you've ever dabbled in mathematics or computer science. You'd know that we can *add*, *subtract*, and even *multiply* these wonderful rectangular arrays of numbers together. Matrix multiplication, in particular, has a wide range of applications, from computer graphics to machine learning. But did you *know* there are more efficient ways to multiply matrices than the method we're usually taught?

That's where Strassen's algorithm comes in. Invented by Volker Strassen in 1969, this algorithm revolutionized how we think about matrix multiplication by showing us a faster way to do it. If you're like me, you might wonder: **"How does it achieve this?"** Well, that's what we're going to explore in this article.

Before diving into the deep end with Strassen's algorithm, let's dip our toes in the shallow end first—*understanding basic matrix multiplication and its constraints*.

# Matrix Multiplication 101

Imagine you have *two* groups of people. The first group is handing out apples, and the second is counting them. Matrices are like that. In matrix multiplication, we *distribute* and count things based on specific **rules**: each entry of the result comes from pairing a row of the first matrix with a column of the second, multiplying the paired numbers, and adding up the products. But, unlike in basic multiplication, the order matters. Multiplying matrix $A$ by matrix $B$ is not the same as multiplying matrix $B$ by matrix $A$. If you've ever tried to put on your right shoe before your left shoe while standing up, you'll understand why order matters!

But, just like you can't start handing out apples before they're ripe, there are a few **rules** we have to follow before we can multiply matrices. The most important rule is that *the number of columns in the first matrix must be equal to the number of rows in the second matrix*. In other words, if we have a matrix $A$ of dimensions $m \times n$ and another matrix $B$ of dimensions $p \times q$, we can only multiply $A$ and $B$ if $n = p$, and the result will have dimensions $m \times q$. If this condition isn't met, it's like trying to fit a square peg in a round hole—it just won't work.
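That compatibility rule is easy to sketch in plain Python (the function name `can_multiply` is just for illustration):

```python
def can_multiply(a_shape, b_shape):
    """Check whether a matrix of shape a_shape (rows, cols)
    can be multiplied by a matrix of shape b_shape."""
    m, n = a_shape
    p, q = b_shape
    return n == p  # columns of A must equal rows of B

# A 2x3 matrix times a 3x4 matrix works (giving a 2x4 result);
# the reverse order does not.
print(can_multiply((2, 3), (3, 4)))  # True
print(can_multiply((3, 4), (2, 3)))  # False
```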

Let's consider an elementary example. Suppose we have two `2x2` matrices (a matrix with 2 rows and 2 columns). Let's call them $A$ and $B$:

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix}$$

To multiply these matrices, we take the first row of the first matrix and multiply it element by element with the first column of the second matrix, and then add up the results. So, the first element of the resulting matrix would be $(1 \cdot 5) + (2 \cdot 7) = 5 + 14 = 19$.

We do this for *each row* of the first matrix and each column of the second matrix. If you follow these steps, the calculations will yield the resulting matrix $C$:

$$C = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$

However, as the sizes of the matrices increase, the computation becomes more complex and time-consuming. This is because for an $n \times n$ matrix, we're doing $n^3$ multiplications: each of the $n^2$ entries of the result requires a dot product of length $n$. If you've ever tried to *count the stars* in the sky, you'll understand that this can take a while. And that's where our friend, Strassen's algorithm, comes in to speed things up!
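To make that $n^3$ concrete, here's a minimal sketch of the textbook algorithm in plain Python: three nested loops, each running $n$ times.

```python
def matmul_naive(A, B):
    """Multiply two n x n matrices given as lists of lists."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):          # each row of A
        for j in range(n):      # each column of B
            for k in range(n):  # dot product: n multiply-adds per entry
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_naive(A, B))  # [[19, 22], [43, 50]]
```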

# Introducing Mr. Strassen 🧐

When it comes to matrix multiplication, you might be thinking, "Well, that's a lot of multiplication and addition. There must be a faster way." And you'd be right! That's exactly what Volker Strassen thought back in 1969 when he proposed his groundbreaking algorithm.

Strassen's algorithm is like a clever shortcut. Instead of doing the standard $n^3$ operations, it does just about $n^{\log_2 7} \approx n^{2.807}$ operations. Now, that might seem like a bunch of mathematical gibberish, but it means that the algorithm is significantly *faster*, especially for large matrices. It's like taking a high-speed train instead of a horse-drawn carriage!
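To get a feel for the gap, we can compare the two growth rates directly. The numbers below ignore constant factors, so treat them as rough proportions rather than exact operation counts:

```python
import math

strassen_exp = math.log2(7)  # ≈ 2.807, vs. 3 for the textbook method

for n in (64, 256, 1024):
    naive = n ** 3
    strassen = n ** strassen_exp
    print(f"n={n:5d}  naive ~{naive:.2e}  strassen ~{strassen:.2e}  "
          f"ratio {naive / strassen:.1f}x")
```

The ratio grows as $n^{3 - \log_2 7}$, so the bigger the matrices, the bigger Strassen's advantage.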

But how does it do this? Well, Strassen's algorithm uses a strategy called **"divide and conquer."** Here's how it works:

## Step 1

First, we divide each matrix into four equal parts. For our example, let's imagine we're working with `4x4` matrices. These matrices will be divided into four `2x2` sub-matrices.

After dividing the matrices, we'll have blocks for each matrix, which we'll denote as $A_{11}, A_{12}, A_{21}, A_{22}$ for matrix $A$, and $B_{11}, B_{12}, B_{21}, B_{22}$ for matrix $B$. Each of these blocks represents one of the `2x2` sub-matrices.
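Here's one way to sketch that splitting step in Python (the helper name `split_blocks` is just for illustration):

```python
def split_blocks(M):
    """Split an n x n matrix (list of lists, n even) into four quadrants."""
    h = len(M) // 2
    top_left = [row[:h] for row in M[:h]]
    top_right = [row[h:] for row in M[:h]]
    bottom_left = [row[:h] for row in M[h:]]
    bottom_right = [row[h:] for row in M[h:]]
    return top_left, top_right, bottom_left, bottom_right

M = [[ 1,  2,  3,  4],
     [ 5,  6,  7,  8],
     [ 9, 10, 11, 12],
     [13, 14, 15, 16]]
A11, A12, A21, A22 = split_blocks(M)
print(A11)  # [[1, 2], [5, 6]]
print(A22)  # [[11, 12], [15, 16]]
```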

## Step 2

Then, we perform **seven** multiplications and several *additions* and *subtractions* to compute the product. This is different from the standard algorithm, which performs eight multiplications. One fewer multiplication might not seem like much, but remember, we're dealing with matrices, not single numbers, and the trick is applied *recursively*: each block multiplication is itself done with seven smaller multiplications, so the saving compounds at every level. That's exactly where the $n^{\log_2 7}$ running time comes from.

Now, we’re ready to perform the seven multiplications, which are the heart of Strassen’s algorithm. We'll define seven products, $M_1$ through $M_7$, that involve the addition or subtraction of certain blocks and their subsequent multiplication. We'll go through these one by one:

- $M_1 = (A_{11} + A_{22}) \cdot (B_{11} + B_{22})$
- $M_2 = (A_{21} + A_{22}) \cdot B_{11}$
- $M_3 = A_{11} \cdot (B_{12} - B_{22})$
- $M_4 = A_{22} \cdot (B_{21} - B_{11})$
- $M_5 = (A_{11} + A_{12}) \cdot B_{22}$
- $M_6 = (A_{21} - A_{11}) \cdot (B_{11} + B_{12})$
- $M_7 = (A_{12} - A_{22}) \cdot (B_{21} + B_{22})$

Note the order of the factors in $M_3$ and $M_4$: since matrix multiplication isn't commutative, the $A$ block must stay on the left.

If you're wondering how Strassen came up with these, don't worry! It's the result of some very *clever* mathematical insights that are beyond our scope today. Just know that these operations are carefully chosen so that the four blocks of the product $C$ can be assembled from them using only additions and subtractions:

- $C_{11} = M_1 + M_4 - M_5 + M_7$
- $C_{12} = M_3 + M_5$
- $C_{21} = M_2 + M_4$
- $C_{22} = M_1 - M_2 + M_3 + M_6$

And here's the *magical* part: each of these operations can be performed independently! This means that if we're using a computer, we can calculate each of these products in parallel, which can significantly speed up the process.
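Putting Steps 1 and 2 together, here's a sketch of one level of Strassen's algorithm in plain Python. The helper names are mine, and to keep the example short the seven block products use ordinary multiplication instead of recursing; the recombination into $C$'s four blocks uses the standard formulas. The final assertion checks the result against the textbook method.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mul(X, Y):
    """Ordinary multiplication, used here for the block products."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def split(M):
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def strassen_one_level(A, B):
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # The seven products from the article.
    M1 = mul(add(A11, A22), add(B11, B22))
    M2 = mul(add(A21, A22), B11)
    M3 = mul(A11, sub(B12, B22))
    M4 = mul(A22, sub(B21, B11))
    M5 = mul(add(A11, A12), B22)
    M6 = mul(sub(A21, A11), add(B11, B12))
    M7 = mul(sub(A12, A22), add(B21, B22))

    # Standard recombination into the four blocks of C.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)

    h = len(A) // 2
    return [C11[i] + C12[i] for i in range(h)] + \
           [C21[i] + C22[i] for i in range(h)]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
assert strassen_one_level(A, B) == mul(A, B)  # matches the textbook result
print(strassen_one_level(A, B))
```

In a full implementation, the calls to `mul` inside `strassen_one_level` would be recursive calls to the same function, with the recursion bottoming out at small sizes where the textbook method is faster in practice.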
