back · main · about · writing · notes · d3 · contact


Understanding Kadane’s solution to the maximum subarray problem
9 May 2017 · 524 words

There is something called the maximum subarray problem, and it goes something like this.

You are given an array of numbers: something like 3,5,-2,-1,5,3,1,-4,-6,5.  Your job is to come up with the subarray with the highest sum.

For this example it would be the subarray [3,5,-2,-1,5,3,1], which has a sum of 14. No other subarray gives a higher sum.

Let’s look at a few more:

It’s an interesting problem with many applications. How do we solve it?

Solving the problem

While there are many ways to solve this problem, the best solution is Kadane’s Algorithm. It’s a deceptively simple algorithm, yet also deceptively difficult to understand.

Here’s the Python algorithm given on Wikipedia:

def max_subarray(A):
    max_ending_here = max_so_far = A[0]
    for x in A[1:]:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far

Here’s an R implementation, with a trace added to help figure out what’s going on.

find_max_subarray = function(arr){
    max_ending_here = arr[1]
    max_so_far = arr[1]
    for(i in 2:length(arr)){
        max_ending_here = max(arr[i], max_ending_here + arr[i])
        max_so_far = max(max_so_far, max_ending_here)
        
        print(paste( max_ending_here,max_so_far,arr[i])) #for tracing 
    }
    return(max_so_far)
}

Try running the algorithm yourself. Feed it different subsequences and look at the trace to see if you can figure out how it works.

Understanding this algorithm is not a simple task. Here’s how I understand it.

Understanding Kadane’s Algorithm

For each number in the sequence, we do this:

Each number “decides” if it wants to use the previous subarray sum, or if it wants to start a new subarray from scratch. At every comparison, it also checks to see if its subarray sum is higher than any previous one.

Let’s look at an example – the sequence 1,-2,3,-1,2.

We iterate through the numbers, starting at the second number in the sequence (-2).

At this point, the maximum subarray sum seen so far is 1 - the sum of the first element in the sequence. This is also the previous subarray sum.

The new subarray sum of -1 is passed to the next number. This is the sum of the current subarray [1,-2].

The next number (3) gets the previous subarray sum of-1:

We start a new sequence at this point. The current subarray is [3] and the current subarray sum is 3. The maximum subarray sum is also updated, since 3 is higher than everything we had before.

The next number (-1) is passed the previous subarray sum of 3:

Now the subarray is [3,-1] and the subarray sum is 2. Do you see how this goes now?

The last number is passed the subarray sum of 2:

The subarray at this point is [3,-1,2] and the subarray sum is 4. The max_so_far variable is updated to read 4.

At this point the sequence has ended and max_so_far is returned - which is 4. This is the highest subarray sum of the sequence, and the subarray was [3,-1,2].

And that’s it!


back · main · about · writing · notes · d3 · contact