November 21, 2021

We want to write a function that takes a non-empty array of distinct integers and an integer representing a target sum. If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order. If no two numbers sum up to the target sum, the function should return an empty array.

First Approach

My first approach is to go through the array of integers by brute force. Suppose we have an array of numbers [ 1, 2, 3 ]. We need to figure out all the two-element combinations it can have. If we think about it, we would probably end up with something like this:

Figure 1 — Array combination yields

$$\{ (1,2), (1,3), (2,3) \}$$

Conceptually, in this approach we walk through a shrinking set of two-number combinations and do some calculation with each one. If we align this approach with our challenge statement, we can write a brute force algorithm like the following (where $n$ is the length of the array):

An outer loop over index $x$ that goes from $0$ to $n - 2$
An inner loop over index $y$ that goes from $x + 1$ to $n - 1$
A condition that checks whether the two elements sum to the target

Program Input — Say we have an array [ -1, 5, -4, 8, 7, 1, 3, 11 ] and a target sum 14. Now let's transform the above steps into pseudocode.
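The three steps above can be sketched in JavaScript roughly as follows. This is a minimal sketch rather than the original listing, and the function name `twoSumBruteForce` is an assumption made for illustration.

```javascript
// Brute force: check every pair (x, y) with x < y.
// `twoSumBruteForce` is a hypothetical name chosen for this sketch.
function twoSumBruteForce(array, targetSum) {
  // Outer loop: every index except the last one.
  for (let x = 0; x < array.length - 1; x++) {
    // Inner loop: every index to the right of x.
    for (let y = x + 1; y < array.length; y++) {
      if (array[x] + array[y] === targetSum) {
        return [array[x], array[y]];
      }
    }
  }
  // No two numbers sum up to the target.
  return [];
}

const bruteForcePair = twoSumBruteForce([-1, 5, -4, 8, 7, 1, 3, 11], 14);
// bruteForcePair is [3, 11]
```

Returning as soon as a match is found is what lets the function stop early when the matching pair sits near the front of the array.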

If we were to execute this algorithm, what are the different combinations we go through? Let's write down the iterations and their respective combinations manually.

Iteration	Checked Combinations
1	(-1,5), (-1,-4), (-1,8), (-1,7), (-1,1), (-1,3), (-1,11)
2	(5,-4), (5,8), (5,7), (5,1), (5,3), (5,11)
3	(-4,8), (-4,7), (-4,1), (-4,3), (-4,11)
4	(8,7), (8,1), (8,3), (8,11)
5	(7,1), (7,3), (7,11)
6	(1,3), (1,11)
7	(3,11)

Did you see that? In this worst-case scenario we had to evaluate all $28$ pairs, and only in the $7^{th}$ iteration did we find our matching pair $(3, 11)$. That is not always the case, though. The count depends on where the matching numbers sit in the array, and if we break out of the loops after a successful match, we usually check fewer combinations.

Now let's do a quick analysis of our first solution.

Problem

While this works pretty well for smaller arrays, it won't scale nicely for larger arrays. The algorithm we just wrote is pretty slow 🐌.

Time Complexity — To reiterate, we have an outer loop that visits every element except the last one and an inner loop that visits every element to the right of the outer one, checking each combination of two numbers. For an array of $n$ elements, this means we check up to $\frac{n(n - 1)}{2}$ pairs, which grows on the order of $n \times n$.

$$\large{{n \choose 2} = \sum_{x = 0}^{n - 2} \ \Biggl( \sum_{y = x+1}^{n - 1} 1 \Biggr) = \frac{n(n - 1)}{2}}$$

In the worst-case scenario, we do ${n \choose 2}$ iterations just to check whether we have a matching pair.

This algorithm has polynomial time complexity: its running time grows proportionally to the square of the input size $n$. Unfortunately, our algorithm takes $O(n^2)$ time to complete.
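To make the quadratic growth concrete, here is a small sketch that counts how many pairs the nested loops visit when no pair matches, and compares that count with $\frac{n(n - 1)}{2}$. The helper name `countPairsChecked` is an assumption used only for this illustration.

```javascript
// Count how many pairs the brute force loops visit when nothing matches.
// `countPairsChecked` is a hypothetical helper for this illustration.
function countPairsChecked(n) {
  let checked = 0;
  for (let x = 0; x < n - 1; x++) {
    for (let y = x + 1; y < n; y++) {
      checked++;
    }
  }
  return checked;
}

for (const n of [8, 100, 1000]) {
  // Prints n, the counted pairs, and the closed form n(n - 1) / 2.
  console.log(n, countPairsChecked(n), (n * (n - 1)) / 2);
}
// 8 28 28
// 100 4950 4950
// 1000 499500 499500
```

Growing the array from 8 to 1000 elements (125 times larger) pushes the worst-case work from 28 to 499500 checks, which is the scaling problem described above.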

Space Complexity — We only use two for loops with constant-size index variables $x$ and $y$, and we don't allocate any extra space. So, our algorithm's space complexity will always be $O(1)$.

We know that the first approach is bad 😕, but can we improve the algorithm and make it a bit faster? What happens if we sort the array first 🤔?

Second Approach

There's a second way of solving this problem. And it's slightly better than the first one. Initially, in the challenge statement, I didn't mention whether the array is sorted or not. So, what if we sort the array first in ascending order and then figure out a way to solve this?

Program Input — Say we are given a new array [ -4, 13, 1, 3, 5, 6, -1, 11 ] and a target sum of 10. Let's use these inputs for our second approach.

1st Operation

First, we have to sort the array in ascending order. The algorithm only works on a sorted array, so this must be done before we continue.
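Since this post works in JavaScript, one detail of the sorting step deserves a quick sketch: calling `sort()` without a comparator compares elements as strings, so integers need a numeric comparator. The variable names below are assumptions for illustration.

```javascript
const input = [-4, 13, 1, 3, 5, 6, -1, 11];

// Default sort() compares string representations, which misorders integers.
const lexicographic = [...input].sort();
console.log(lexicographic.join(", ")); // -1, -4, 1, 11, 13, 3, 5, 6

// A numeric comparator gives the ascending order the algorithm needs.
// Spreading into a new array leaves the caller's input untouched.
const sorted = [...input].sort((a, b) => a - b);
console.log(sorted.join(", ")); // -4, -1, 1, 3, 5, 6, 11, 13
```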

2nd Operation

Then we place two pointers, one at the left end and one at the right end, and walk them toward each other. Together they take at most $n - 1$ steps, and at each step we operate on the two numbers they point to.

Figure 3 — Set $x$ to index $0$ and $y$ to the last index $n - 1$

This way we can solve the problem more efficiently than with two nested for loops. With a reasonable sorting algorithm like mergesort or quicksort, we can sort the array in $O(n \log n)$ time (on average, in quicksort's case). But remember, we still have to walk the pointers up to $n - 1$ times, which is $O(n)$.

3rd Operation (doing the summation)

So far, we know the array must be sorted first and that we need two pointers to compare. The core logic of this approach is to drive the algorithm's state using three predicates. We need to check the sum $A + B$ of the two numbers the pointers are on:

Is it equal to the target sum?
Is it less than the target sum?
Is it greater than the target sum?

Let's try to write down the algorithm. Remember that, up to this point, we assume that we have already sorted the array and allocated the two pointers. Now it is time to evaluate the above conditions against the current pair in every iteration.

Our loop starts from $x = 0$ and $y = 7$. At this point the element at $x$ is $-4$ and the element at $y$ is $13$ (see figure 4). If we add those two numbers together, we get a total of $9$, which is less than our target sum $10$. In this case we move the $x$ pointer to the right, that is, we increment $x$ by $1$. Because the array is sorted and its elements are distinct, the next sum is guaranteed to be $\gt 9$.

Alright, in the last iteration we moved $x$ by $1$, so now we are at $x = 1$ and $y = 7$ (see figure 5). Once again, if we sum up $-1$ and $13$ we get a total of $12$. This time the sum is larger than our target sum. In this case we move the $y$ pointer to the left, that is, we decrement $y$ by $1$.

Got the point? We do this iteratively until we match the target sum or until $x$ and $y$ meet at the same index. Here, the very next step lands on $x = 1$ and $y = 6$, where $-1 + 11 = 10$ matches our target sum.

Well, would you look at that? We reduced the number of iterations we have to go through! We have made significant progress in making this algorithm a bit faster. Now that we have found our number pair, we can finally return the result and halt the algorithm. Let's write the JavaScript code for this algorithm now.
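One way the three operations could look in JavaScript is sketched below. It is not the original listing; the function name `twoSumTwoPointers` is an assumption, and the sketch sorts a copy so that the caller's array stays untouched.

```javascript
// Two-pointer search over a sorted copy of the input.
// `twoSumTwoPointers` is a hypothetical name chosen for this sketch.
function twoSumTwoPointers(array, targetSum) {
  // 1st operation: sort numerically (default sort() compares strings).
  const sorted = [...array].sort((a, b) => a - b);

  // 2nd operation: one pointer at each end.
  let x = 0;
  let y = sorted.length - 1;

  // 3rd operation: compare the sum against the target until the pointers meet.
  while (x < y) {
    const sum = sorted[x] + sorted[y];
    if (sum === targetSum) {
      return [sorted[x], sorted[y]];
    } else if (sum < targetSum) {
      x++; // sum too small: move the left pointer right
    } else {
      y--; // sum too large: move the right pointer left
    }
  }
  return [];
}

const twoPointerPair = twoSumTwoPointers([-4, 13, 1, 3, 5, 6, -1, 11], 10);
// twoPointerPair is [-1, 11]
```

For the example input the loop inspects only three pairs, $(-4, 13)$, $(-1, 13)$ and $(-1, 11)$, before returning.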

While this approach is slightly better than the first, we are back to square one. Why? Well, it's the same reason as before: it does not scale well enough for larger arrays. Let me show you the problem.

The algorithm we wrote runs in linearithmic time, $O(n \log n)$, because the sorting step dominates the linear pointer walk. Its running time grows proportionally to the input size multiplied by a logarithmic factor. What can we do about it? Can we solve it in linear time?

Dynamic Approach

Up until now, none of the approaches we have taken is optimal from a time standpoint. Fortunately, there's one other way of solving this problem in a much cooler way. You might have thought about this already from the previous approach. But first, let's list the things we already know:

We know what our target sum is (let it be $z$).
We already know one of our addends, the element we are currently looking at (let it be $x$).

So, we have two known values at hand before doing any operations. We can write an equation like $\large{x + y = z}$ to represent this (where $y$ is unknown) and then isolate the unknown variable: $\normalsize{(x + y) = z}$ $\normalsize{\iff}$ $\normalsize{y = (z - x)}$

Now we can find the unknown variable $y$ without any combinations or two pointers. The only caveat is that we need a way to memorize this calculated value as we go through the array.
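As a quick worked instance, take the first example again: the array [ -1, 5, -4, 8, 7, 1, 3, 11 ] with target sum $z = 14$, and suppose the element we are currently looking at is $x = 3$:

$$z = 14, \quad x = 3 \quad \Longrightarrow \quad y = z - x = 14 - 3 = 11$$

Since $11$ is also in the array, $(3, 11)$ is the pair we want, and we found its second half with a single subtraction.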

Solution

Using some extra space is okay as long as its size grows at a reasonable rate. Now what do you think? Should we use a hashmap for our solution? What about a set?

You will see that many examples of this approach on the internet use a hashmap as auxiliary space. But we really do not need key-value pairs for our solution. Instead, we can simply use a set of numbers to track the inversion results.

Formal Proof

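$$\large{R = \{ z - A_j \mid A_j \in A \text{ visited so far} \}}$$

$$\large{\forall A_i \in A, \ \ P(i): \ (A_i \in R \ \lor A_i \notin R)}$$

Each element $A_i$ either is or is not a member of $R$. If it is, its counterpart was visited earlier and we have found our pair. If it is not, we add $z - A_i$ to $R$ and move on.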

Elaboration

We need a loop that goes through each element of the array, starting from index $0$ and ending at index $n - 1$. We build the inverse set inside the loop, so we start with an empty set $R = \emptyset$. For each element $A_x$ we calculate $y = z - A_x$, and then a predicate checks whether $A_x$ exists in the inverse set: $P(x): (A_x \in R)$. If $P(x)$ is true, we return $\{ A_x, y \}$. Otherwise, when $\neg P(x)$, we union the inverse set with the calculated $y$ value, $R \gets R \ \cup \{ y \}$, and keep looping until index $n - 1$.

Switching to New Inputs — For this approach let's use the array [ -7, -5, -3, -1, 0, 1, 3, 5, 7 ] and the target sum -5.

Figure 8 — 1st iteration

As illustrated, in the first iteration we start off with an empty set named $R$. Our loop starts from index $0$, where $i$ is the index variable. In the first iteration $R$ doesn't have any elements, so $-7 \notin R$. Therefore, we immediately add the calculated value $2 = -5 - (-7)$ to the set $R$ and move on to the next element.

Figure 9 — 2nd iteration

In the second iteration, we first check whether the element at index $1$ is an element of $R$. We can see that $-5 \notin R$, so we add our inverse calculation $0 = -5 - (-5)$ to the set $R$ and continue.

Figure 10 — 3rd iteration

In the third iteration, we again check whether the element at index $2$ is an element of $R$. We can see that $-3 \notin R$, so we do our inverse calculation $-2 = -5 - (-3)$, add it to the set $R$ and continue.

Figure 11 — 4th iteration

Woah! Fourth iteration already? Again we check whether the element at index $3$ is an element of $R$. We can see that $-1 \notin R$, so we do our inverse calculation $-4 = -5 - (-1)$, add it to the set $R$ and continue.

Figure 12 — 5th iteration

We are in the fifth iteration, and would you look at that! The element at index $4$ is $0$, and we just found $0$ in our set $R$. This means our inverse got a match! Now we can return these two elements as $\{ A_x, -5 - A_x \}$, where $A_x$ is $0$, which gives us the pair $\{ 0, -5 \}$.

Woohoo! Now that we have an idea of how it works, let's write the pseudocode.
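The walkthrough above could be expressed in JavaScript along these lines. This is a hedged sketch rather than the original listing: the function name `twoSumWithSet` and the variable `inverses` (standing in for the set $R$) are assumptions made for illustration.

```javascript
// Single pass with a Set of inverse values (targetSum - element).
// `twoSumWithSet` is a hypothetical name chosen for this sketch.
function twoSumWithSet(array, targetSum) {
  const inverses = new Set(); // plays the role of R, initially empty

  for (const current of array) {
    const y = targetSum - current; // y = z - A_x
    if (inverses.has(current)) {
      // An earlier element's inverse equals the current element,
      // so current and y are both in the array and sum to the target.
      return [current, y];
    }
    inverses.add(y); // R <- R ∪ { y }
  }
  return [];
}

const setPair = twoSumWithSet([-7, -5, -3, -1, 0, 1, 3, 5, 7], -5);
// setPair is [0, -5]
```

Checking membership before adding the inverse is what keeps an element from being paired with itself, for example a lone $5$ with a target sum of $10$.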

Time & Space Complexity

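In this approach we rely solely on remembering previously calculated values, an idea borrowed from dynamic programming. With it, we were able to solve the problem in $O(n)$ time and $O(n)$ space, assuming set lookups and insertions take constant time on average. In terms of time complexity, this is the optimal way of solving this problem.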

Summary

Overall, I think that even though two summation is a very easy challenge, we can learn a lot from it, such as how simple algebraic equations can help us solve complex problems more elegantly.

Until next time. Thanks for reading!
