There is something called the maximum subarray problem, and it goes something like this.

You are given an array of numbers: something like **3,5,-2,-1,5,3,1,-4,-6,5**. Your job is to come up with the subarray with the highest sum.

For this example it would be the subarray **[3,5,-2,-1,5,3,1]**, which has a sum of 14. No other subarray gives a higher sum.

Let’s look at a few more:

- The sequence
**-3,-3,-2,-4,-2,-5,-4**has the highest sum being**-2**, given by the subarray**[-2]**. - The sequence
**1,2,-1,-2,1**has the highest sum being**3**, given by the subarray**[1,2]**. - The sequence
**3,2,1,4**has the highest sum being**10**, given by the subarray**[3,2,1,4]**.

It’s an interesting problem with many applications. How do we solve it?

## Solving the problem

While there are many ways to solve this problem, the best solution is Kadane’s Algorithm. It’s a deceptively simple algorithm, yet also deceptively difficult to understand.

Here’s the Python algorithm given on Wikipedia:

```
def max_subarray(A):
max_ending_here = max_so_far = A[0]
for x in A[1:]:
max_ending_here = max(x, max_ending_here + x)
max_so_far = max(max_so_far, max_ending_here)
return max_so_far
```

Here’s an R implementation, with a trace added to help figure out what’s going on.

```
find_max_subarray = function(arr){
max_ending_here = arr[1]
max_so_far = arr[1]
for(i in 2:length(arr)){
max_ending_here = max(arr[i], max_ending_here + arr[i])
max_so_far = max(max_so_far, max_ending_here)
print(paste( max_ending_here,max_so_far,arr[i])) #for tracing
}
return(max_so_far)
}
```

Try running the algorithm yourself. Feed it different subsequences and look at the trace to see if you can figure out how it works.

Understanding this algorithm is not a simple task. Here’s how I understand it.

## Understanding Kadane’s Algorithm

For each number in the sequence, we do this:

Each number “decides” if it wants to use the previous subarray sum, or if it wants to start a new subarray from scratch. At every comparison, it also checks to see if its subarray sum is higher than any previous one.

Let’s look at an example – the sequence **1,-2,3,-1,2**.

We iterate through the numbers, starting at the second number in the sequence (-2).

At this point, the maximum subarray sum seen so far is 1 - the sum of the first element in the sequence. This is also the previous subarray sum.

The new subarray sum of -1 is passed to the next number. This is the sum of the current subarray [1,-2].

The next number (3) gets the previous subarray sum of-1:

We start a new sequence at this point. The current subarray is [3] and the current subarray sum is 3. The maximum subarray sum is also updated, since 3 is higher than everything we had before.

The next number (-1) is passed the previous subarray sum of 3:

Now the subarray is [3,-1] and the subarray sum is 2. Do you see how this goes now?

The last number is passed the subarray sum of 2:

The subarray at this point is [3,-1,2] and the subarray sum is 4. The max_so_far variable is updated to read 4.

At this point the sequence has ended and max_so_far is returned - which is 4. This is the highest subarray sum of the sequence, and the subarray was [3,-1,2].

And that’s it!