Solving the Permutation Rank problem using Python

By John Lekberg on March 04, 2020.

This week's post is about solving an interview problem: the "Permutation Rank" problem. You will learn:

How to solve this problem using a brute force algorithm.
How to analyze the time complexity of the brute force algorithm.
How to use combinatorics to develop an efficient algorithm.

Problem statement

Consider all permutations of the string "ERDOS". (E.g. "DESRO" is a permutation.) Order the permutations lexicographically, beginning with "DEORS" and ending with "SROED". What is the position of "ERDOS"? ("DEORS" is at position 1.)

A brute force solution

A brute force solution of the "Permutation Rank" problem will:

Generate all permutations of "ERDOS".
Order the permutations lexicographically.
Find the position of "ERDOS".

"ERDOS" has 5 characters, so there 120 (5!) permutations. 120 permutations is a small enough state space that the brute force algorithm is feasible. Here's a Python implementation:

import itertools

def permutation_position_brute_force(word):
    ordered_perms = itertools.permutations(sorted(word))
    goal_perm = tuple(word)
    for position, perm in enumerate(ordered_perms, start=1):
        if perm == goal_perm:
            return position

permutation_position_brute_force("ERDOS")

This code uses the built-in function sorted to generate the first permutation in the lexicographic ordering:

sorted("ERDOS")

['D', 'E', 'O', 'R', 'S']

(Don't worry that sorted returns a list and not a string. This won't be a problem for the algorithm.)

Then, itertools.permutations generates all permutations:

list(itertools.permutations(sorted("ERDOS")))

[('D', 'E', 'O', 'R', 'S'),
 ('D', 'E', 'O', 'S', 'R'),
 ('D', 'E', 'R', 'O', 'S'),
 ('D', 'E', 'R', 'S', 'O'),
 ...
 ('S', 'R', 'E', 'O', 'D'),
 ('S', 'R', 'O', 'D', 'E'),
 ('S', 'R', 'O', 'E', 'D')]

If the input is sorted, then itertools.permutations will generate the permutation tuples in lexicographic order. (Read the documentation of itertools.permutations for more information.)

itertools.permutations generates tuples like ('D', 'E', 'O', 'R', 'S') instead of strings like 'DEORS'. I have two ways to deal with this:

I can examine each permutation tuple and use "".join to turn the tuple into a string. This is one operation for each permutation: n! operations.
I can turn the goal string "ERDOS" into a goal tuple ("E", "R", "D", "O", "S"). This is one operation.

I choose to turn the goal string, word, into a goal tuple, goal_perm, because that uses fewer operations.

I use enumerate to associate a position with each permutation, starting at position 1.

How well does the brute force solution scale to larger strings with unique characters?

For an n-character word, there are n! permutations.

So the time complexity of permutation_position_brute_force is

O(n!)

This means that if running permutation_position_brute_force on a 10-character word took 1 second, then running the algorithm on a 20-character word would take over 21,000 years! There's got to be a quicker way to solve this.

Counting the number of permutations before the goal

The brute force algorithms counts how many permutations appear before the goal permutation by counting each permutation, one by one. But there is a way to count how many permutations appear before the goal permutation without counting permutations one-by-one.

Here are patterns of the permutations that come before ERDOS:

D****. Strings like DEORS and DRESO. There are 4 unpicked characters in the pattern D****, so there are 24 (4!) permutations of ERDOS that look like D****.
ED***. Strings like EDORS and EDSOR. There are 3 unpicked characters in the pattern ED***, so there are 6 (3!) permutations of ERDOS that look like ED***.
EO***. Strings like EODSR and EOSDR. There are 3 unpicked characters in the pattern EO***, so there are 6 (3!) permutations of ERDOS that look like EO***.

Because all the patterns are mutually exclusive, I can sum their counts to get the total number of permutations before ERDOS:

24 + 6 + 6 = 36

There are 36 permutations before ERDOS, so ERDOS is at position 37.

How do I know that the three patterns D****, ED***, and EO*** are the only three patterns of permutations before ERDOS?

I look at the lexicographic ordering of the characters of ERDOS. The ordering is DEORS.
I look at the goal permutation, ERDOS.
I start with the first character, E. In the lexicographic ordering, D comes before E, so I know D**** is a pattern that only counts some permutations that come before ERDOS. (D**** definitely doesn't match a permutation that comes after ERDOS.)
I continue to the second character, R. In the lexicographic ordering, D, E, and O come before R. But E is already chosen as the first character, so only ED*** and EO*** are patterns that count.
I continue to the third letter, D. In the lexicographic ordering, nothing comes before D, so I continue to the fourth character.
I keep creating patterns of permutations that come before ERDOS by examining the fourth character, and then examining the fifth character.
I end up with the patterns D****, ED***, and EO***.

Here's a Python function, permutation_position_quick, that implements the above procedure:

import math

def permutation_position_quick(word):
    N = len(word)
    
    first = sorted(word)
    
    chars_before = {
        first[i]: first[:i]
        for i in range(N)
    }
    
    count = 0
    for i in range(N):
        possible_chars = set(chars_before[word[i]])
        possible_chars.difference_update(word[:i])
        n_wildcards = N - (i + 1)
           
        count += (
          len(possible_chars) * math.factorial(n_wildcards)
        )
    
    return 1 + count

permutation_position_quick("ERDOS")

permutation_position_quick uses set objects to easily remove patterns that would reuse already chosen characters (e.g. invalid patterns like EE***).

I confirm that permutation_position_quick produces the same results as permutation_position_brute_force:

p1 = permutation_position_quick
p2 = permutation_position_brute_force
all(
    p1(word) == p2(word)
    for word in itertools.permutations("ERDOS")
)

True

What is the time complexity of permutation_position_quick?

The algorithm loops a number of times linear in the number of characters in the string. There are O(n) loop iterations.
Each loop iteration creates a set of O(n) characters and removes a subset of O(n) characters. The time complexity of set.difference_update in this case is O(n).

Because there are O(n) loop iterations and each iteration takes O(n) operations, the overall time complexity of permutation_position_quick is

O(n²)

This means that if running permutation_position_quick on a 10-character word took 1 second, then running the algorithm on a 20-character word would take 4 seconds. (Compare this with permutation_position_brute_force taking over 21,000 years.)

In conclusion...

In this post you learned how to use combinatorics to efficiently solve the "Permutation Rank" problem. The key idea is that you don't need to count every permutation one-by-one. To learn more about combinatorics, check out these documents:

My challenge to you:

This post covered solving the "Permutation Rank" problem for words with unique characters, like "ERDOS". How would you solve the "Permutation Rank" problem for words with duplicated characters, like "TERRENCE"?

If you enjoyed this post, let me know. Share this with your friends and stay tuned for next week's post. See you then!

(If you spot any errors or typos on this post, contact me via my contact page.)