Hey there! You've probably encountered matrices if you've ever dabbled in mathematics or computer science. You'd know that we can add, subtract, and even multiply these wonderful rectangular arrays of numbers together. Matrix multiplication, in particular, has a wide range of applications, from computer graphics to machine learning. But did you know there are more efficient ways than how we usually learn to multiply matrices?

That's where Strassen's algorithm comes in. Invented by Volker Strassen in 1969, this algorithm revolutionized how we think about matrix multiplication by showing us a faster way to do it. If you're like me, you might wonder: "How does it achieve this?" Well, that's what we're going to explore in this article.

Before diving into the deep end with Strassen's algorithm, let's dip our toes in the shallow end first: understanding basic matrix multiplication and its constraints.

Matrix Multiplication 101

Imagine you have two groups of people. The first group is handing out apples, and the second is counting them. Matrices are like that. In matrix multiplication, we distribute and count things based on specific rules. Each number in one matrix corresponds with a number in the other matrix, and we're adding up the results of their multiplication. But, unlike in basic multiplication, the order matters. Multiplying matrix A by matrix B is not the same as multiplying matrix B by matrix A. If you've ever tried to put on your right shoe before your left shoe while standing up, you'll understand why order matters!

But, just like you can't start handing out apples before they're ripe, there are a few rules we have to follow before we can multiply matrices. The most important rule is that the number of columns in the first matrix must be equal to the number of rows in the second matrix. In other words, if we have a matrix A of dimensions m × n and another matrix B of dimensions p × q, we can only multiply A and B if n = p. If this condition isn't met, it's like trying to fit a square peg in a round hole—it just won't work.

Let's consider an elementary example. Suppose we have two 2x2 matrices (matrices with 2 rows and 2 columns).

Let's call them A and B:

    A = | 1  2 |        B = | 5  6 |
        | 3  4 |            | 7  8 |

To multiply these matrices, we take the first row of the first matrix and multiply it element by element with the first column of the second matrix, and then add up the results. So, the first element of the resulting matrix would be (1 · 5) + (2 · 7) = 5 + 14 = 19.

We do this for each row of the first matrix and each column of the second matrix. If you follow these steps, the calculations will yield the resulting matrix C:

    C = | 19  22 |
        | 43  50 |
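The row-by-column procedure above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration (the function name `matmul` is just for this example):

```python
def matmul(A, B):
    """Schoolbook matrix multiplication: row of A times column of B."""
    n, m, p = len(A), len(B), len(B[0])
    assert len(A[0]) == m, "columns of A must equal rows of B"
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # for each row of A...
        for j in range(p):      # ...and each column of B...
            for k in range(m):  # ...multiply element by element and sum
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Note the assertion enforces the rule from earlier: the number of columns in the first matrix must match the number of rows in the second.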

However, as the size of the matrices increases, the computation becomes more complex and time-consuming. This is because for an n × n matrix, we're doing n³ multiplications! If you've ever tried to count the stars in the sky, you'll understand that this can take a while. And that's where our friend, Strassen's algorithm, comes in to speed things up!

Introducing Mr. Strassen

When it comes to matrix multiplication, you might be thinking, "Well, that's a lot of multiplication and addition. There must be a faster way." And you'd be right! That's exactly what Volker Strassen thought back in 1969 when he proposed his groundbreaking algorithm.

Strassen's algorithm is like a clever shortcut. Instead of doing the standard n³ operations, it does just about n^(log₂ 7) ≈ n^2.81 operations. Now, that might seem like a bunch of mathematical gibberish, but it means that the algorithm is significantly faster, especially for large matrices. It's like taking a high-speed train instead of a horse-drawn carriage!
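To get a feel for the gap, here's a rough back-of-envelope comparison of the two operation counts for a 1024×1024 matrix. This is an asymptotic estimate only: constants and the extra additions are ignored, so treat it as an intuition pump, not a benchmark:

```python
import math

n = 1024
naive_ops = n ** 3                      # standard algorithm: n^3
strassen_ops = n ** math.log2(7)        # Strassen's exponent: log2(7) ≈ 2.807
print(f"{naive_ops:.3e} vs {strassen_ops:.3e}")  # ≈ 1.074e+09 vs 2.825e+08
```

Roughly a 3.8× reduction at this size, and the gap keeps widening as n grows.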

But how does it do this? Well, Strassen's algorithm uses a strategy called "divide and conquer." Here's how it works:

Step 1

First, we divide each matrix into four equal parts. For our example, let's imagine we're working with 4x4 matrices. These matrices will be divided into four 2x2 sub-matrices.

After dividing the matrices, we'll have blocks for each matrix, which we'll denote as A11, A12, A21, A22 for matrix A, and B11, B12, B21, B22 for matrix B. Each of these blocks represents one of the 2x2 sub-matrices.

A11 essentially means the block in matrix A's first block row and first block column. Similarly, B11 means the block in matrix B's first block row and first block column, and so on.
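As a sketch, here's how the four blocks could be carved out in Python. The helper name `split` is my own, and it assumes an even-sized square matrix:

```python
def split(M):
    """Split an even-sized square matrix into four equal quadrants."""
    n = len(M) // 2
    M11 = [row[:n] for row in M[:n]]  # top-left block
    M12 = [row[n:] for row in M[:n]]  # top-right block
    M21 = [row[:n] for row in M[n:]]  # bottom-left block
    M22 = [row[n:] for row in M[n:]]  # bottom-right block
    return M11, M12, M21, M22

A = [[ 1,  2,  3,  4],
     [ 5,  6,  7,  8],
     [ 9, 10, 11, 12],
     [13, 14, 15, 16]]
A11, A12, A21, A22 = split(A)
print(A11)  # [[1, 2], [5, 6]]
print(A22)  # [[11, 12], [15, 16]]
```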

Step 2

Then, we perform seven multiplications and several additions and subtractions to compute the product. This is different from the standard algorithm, which performs eight multiplications. One fewer multiplication might not seem like much, but remember, we're dealing with matrices, not single numbers. Better yet, the trick is applied recursively: each of the seven sub-products is itself computed with only seven multiplications, and so on. Compounded across the recursion, that single saved multiplication is exactly what brings the cost down from n³ to n^(log₂ 7).

Now, we’re ready to perform the seven multiplications, which are the heart of Strassen’s algorithm. We'll define seven products, M1 through M7, that involve the addition or subtraction of certain blocks and their subsequent multiplication. We'll go through these one by one:

  • M1 = (A11 + A22) · (B11 + B22)
  • M2 = (A21 + A22) · B11
  • M3 = A11 · (B12 − B22)
  • M4 = A22 · (B21 − B11)
  • M5 = (A11 + A12) · B22
  • M6 = (A21 − A11) · (B11 + B12)
  • M7 = (A12 − A22) · (B21 + B22)

If you're wondering how Strassen came up with these, don't worry! It's the result of some very clever mathematical insights that are beyond our scope today. Just know that these operations are carefully chosen to allow us to compute the final product using fewer multiplications.

And here's the magical part: each of these operations can be performed independently! This means that if we're using a computer, we can calculate each of these products in parallel, which can significantly speed up the process.
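Putting the pieces together, here's a compact (and deliberately unoptimized) Python sketch of the whole algorithm. The recombination formulas for the quadrants of C (e.g. C11 = M1 + M4 − M5 + M7) are the standard ones from Strassen's construction; in practice you would fall back to the schoolbook method below some size threshold rather than recursing all the way down to 1×1:

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])

def strassen(A, B):
    """Strassen multiplication for square matrices whose size is a power of 2."""
    n = len(A)
    if n == 1:  # base case: plain scalar multiplication
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # The seven products (each an independent recursive call)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Recombine the seven products into the four quadrants of C
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    # Stitch the quadrants back into a single matrix
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Notice that a 2×2 input reproduces the example matrix C from earlier — seven scalar multiplications instead of eight.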

Well, now what?

You can navigate to more writings from here. Connect with me on LinkedIn for a chat.