Understanding Kadane’s solution to the maximum subarray problem
9 May 2017 · 524 words

There is something called the maximum subarray problem, and it goes something like this.

You are given an array of numbers: something like 3,5,-2,-1,5,3,1,-4,-6,5.  Your job is to come up with the subarray with the highest sum.

For this example it would be the subarray [3,5,-2,-1,5,3,1], which has a sum of 14. No other subarray gives a higher sum.

Let’s look at a few more:

• The sequence -3,-3,-2,-4,-2,-5,-4 has the highest sum being -2, given by the subarray [-2].
• The sequence 1,2,-1,-2,1  has the highest sum being 3, given by the subarray [1,2].
• The sequence 3,2,1,4 has the highest sum being 10, given by the subarray [3,2,1,4].

It’s an interesting problem with many applications. How do we solve it?

## Solving the problem

While there are many ways to solve this problem, the best solution is Kadane’s Algorithm. It’s a deceptively simple algorithm, yet also deceptively difficult to understand.

Here’s the Python algorithm given on Wikipedia:

``````def max_subarray(A):
max_ending_here = max_so_far = A
for x in A[1:]:
max_ending_here = max(x, max_ending_here + x)
max_so_far = max(max_so_far, max_ending_here)
return max_so_far``````

Here’s an R implementation, with a trace added to help figure out what’s going on.

``````find_max_subarray = function(arr){
max_ending_here = arr
max_so_far = arr
for(i in 2:length(arr)){
max_ending_here = max(arr[i], max_ending_here + arr[i])
max_so_far = max(max_so_far, max_ending_here)

print(paste( max_ending_here,max_so_far,arr[i])) #for tracing
}
return(max_so_far)
}``````

Try running the algorithm yourself. Feed it different subsequences and look at the trace to see if you can figure out how it works.

Understanding this algorithm is not a simple task. Here’s how I understand it.

For each number in the sequence, we do this: Each number “decides” if it wants to use the previous subarray sum, or if it wants to start a new subarray from scratch. At every comparison, it also checks to see if its subarray sum is higher than any previous one.

Let’s look at an example – the sequence 1,-2,3,-1,2.

We iterate through the numbers, starting at the second number in the sequence (-2).

At this point, the maximum subarray sum seen so far is 1 - the sum of the first element in the sequence. This is also the previous subarray sum. The new subarray sum of -1 is passed to the next number. This is the sum of the current subarray [1,-2].

The next number (3) gets the previous subarray sum of-1: We start a new sequence at this point. The current subarray is  and the current subarray sum is 3. The maximum subarray sum is also updated, since 3 is higher than everything we had before.

The next number (-1) is passed the previous subarray sum of 3: Now the subarray is [3,-1] and the subarray sum is 2. Do you see how this goes now?

The last number is passed the subarray sum of 2: The subarray at this point is [3,-1,2] and the subarray sum is 4. The max_so_far variable is updated to read 4.

At this point the sequence has ended and max_so_far is returned - which is 4. This is the highest subarray sum of the sequence, and the subarray was [3,-1,2].

And that’s it!