Solving the Maximum Sum Subarray problem in Python
By John Lekberg on September 12, 2020.
This week's post is about solving the "Maximum Sum Subarray" problem. You will learn:
- How to create a brute force solution.
- How to use Kadane's algorithm to create a faster solution.
Problem Statement
You are given an array of numbers. E.g.
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
Your goal is to find the subarray (a chunk of the array) that has the largest sum, e.g.
arr[3:7]
[4, -1, 2, 1]
sum(arr[3:7])
6
How I represent the data in the problem
I represent the array as a list. E.g.
arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
I represent the subarray as a pair of indices (i,j) that slice the original array. E.g.
i, j = 3, 7 arr[i:j]
[4, -1, 2, 1]
Creating a brute force solution
A simple brute force solution will
- Generate every potential solution (i, j).
- Find the solution that maximizes the sum.
Here's a Python function, max_sum_bf
, that does this:
from itertools import combinations def max_sum_bf(arr): """Solve the 'Maximum Sum Subarray' problem using brute force. Return the slice indices (i, j) that generate the solution, arr[i:j]. arr -- a sequence of numbers. E.g. [1, 2, -3, 5]. """ n = len(arr) candidates = combinations(range(n + 1), 2) def arr_sum(x): """Return the sum of a subarray of arr. x -- slice indices (i, j). """ i, j = x return sum(arr[i:j]) return max(candidates, key=arr_sum) max_sum_bf([-2, 1, -3, 4, -1, 2, 1, -5, 4])
(3, 7)
This uses itertools.combinations to generate all potential indices (i,j).
What's the time complexity of this solution? For an array of n numbers,
- There are O(n2) candidates. (There are C(n+1, 2) candidates.)
- Computing
arr_sum
for each candidate takes O(n) operations.
As a result, the overall time complexity of this algorithm is
O(n3)
Creating a faster solution using Kadane's algorithm
Joseph Kadane created an algorithm, called "Kadane's algorithm", that solves the "Maximum Sum Subarray" problem in O(n). Here's a Python implementation of Kadane's algorithm:
def kadanes_algorithm(*, a): """Run Kadane's algorithm on an array to solve the maximum sum subarray problem. Returns the indices (k, l+1) that generate the maximum subarray a[k:l+1] of a[0:n]. (Where n = len(a).) a -- A sequence of numbers. E.g. [1, 2, -3, 5]. This algorithm is copied (with modifications) from Tadao Takaoka. "Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication". 2002. https://doi.org/10.1016/S1571-0661(04)00313-5 """ n = len(a) (k, l) = (0, 0) s = None t = 0 j = 0 for i in range(n): t = t + a[i] if s is None or t > s: (k, l) = (j, i) s = t if t < 0: t = 0 j = i + 1 return (k, l + 1) kadanes_algorithm(a=[-2, 1, -3, 4, -1, 2, 1, -5, 4])
(3, 7)
For more information on Kadane's algorithm, read:
- "Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication" by Tadao Takaoka. (See section 3, "Maximum subarray problem".)
- "Maximum subarray problem" via Wikipedia
So, a faster solution to the "Maximum Sum Subarray" problem can simply use
kadanes_algorithm
:
def max_sum_fast(arr): """Solve the 'Maximum Sum Subarray' problem quickly, using Kadane's algorithm. Return the slice indices (i, j) that generate the solution, arr[i:j]. arr -- a sequence of numbers. E.g. [1, 2, -3, 5]. """ return kadanes_algorithm(a=arr) max_sum_fast([-2, 1, -3, 4, -1, 2, 1, -5, 4])
(3, 7)
The time complexity of this algorithm is O(n), because the time complexity of Kadane's algorithm is O(n).
In conclusion...
In this week's post, you learned how to solve the Maximum Sum Subarray problem. You created a brute force solution, and you used Kadane's algorithm to create a faster solution.
My challenge to you:
The Maximum Sum Subarray problem can be generalized to two-dimensional arrays:
"We give a two-dimensional array a[1..m,1..n] as input data. The maximum subarray problem is to maximize the array portion a[k..i,l..j], that is, to obtain such indices (k,l) and (i,j). We suppose the upper-left corner has co-ordinate (1,1)."
-- Tadao Takaoka, "Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication"
I challenge you to:
- Implement a brute force solution to this problem.
- Implement the algorithm that Tadao Takaoka develops in his paper (link above).
If you enjoyed this week's post, share it with your friends and stay tuned for next week's post. See you then!
(If you spot any errors or typos on this post, contact me via my contact page.)