# Solving the Permutation Rank problem using Python

By John Lekberg on March 04, 2020.

This week's post is about solving an interview problem: the "Permutation Rank" problem. You will learn:

- How to solve this problem using a brute force algorithm.
- How to analyze the time complexity of the brute force algorithm.
- How to use combinatorics to develop an efficient algorithm.

# Problem statement

Consider all permutations of the string "ERDOS". (E.g. "DESRO" is a permutation.) Order the permutations lexicographically, beginning with "DEORS" and ending with "SROED". What is the position of "ERDOS"? ("DEORS" is at position 1.)

# A brute force solution

A brute force solution of the "Permutation Rank" problem will:

- Generate all permutations of "ERDOS".
- Order the permutations lexicographically.
- Find the position of "ERDOS".

"ERDOS" has 5 characters, so there 120 (5!) permutations. 120 permutations is a small enough state space that the brute force algorithm is feasible. Here's a Python implementation:

`import itertools def permutation_position_brute_force(word): ordered_perms = itertools.permutations(sorted(word)) goal_perm = tuple(word) for position, perm in enumerate(ordered_perms, start=1): if perm == goal_perm: return position permutation_position_brute_force("ERDOS")`

```
37
```

This code uses the built-in function sorted to generate the first permutation in the lexicographic ordering:

`sorted("ERDOS")`

```
['D', 'E', 'O', 'R', 'S']
```

(Don't worry that `sorted`

returns a list and not a string.
This won't be a problem for the algorithm.)

Then, itertools.permutations generates all permutations:

`list(itertools.permutations(sorted("ERDOS")))`

```
[('D', 'E', 'O', 'R', 'S'),
('D', 'E', 'O', 'S', 'R'),
('D', 'E', 'R', 'O', 'S'),
('D', 'E', 'R', 'S', 'O'),
...
('S', 'R', 'E', 'O', 'D'),
('S', 'R', 'O', 'D', 'E'),
('S', 'R', 'O', 'E', 'D')]
```

If the input is sorted, then `itertools.permutations`

will generate the
permutation tuples in lexicographic order.
(Read the documentation of `itertools.permutations`

for more information.)

`itertools.permutations`

generates tuples like `('D', 'E', 'O', 'R', 'S')`

instead of strings like `'DEORS'`

.
I have two ways to deal with this:

- I can examine each permutation tuple and use
`"".join`

to turn the tuple into a string. This is one operation for each permutation:`n`! operations. - I can turn the goal string
`"ERDOS"`

into a goal tuple`("E", "R", "D", "O", "S")`

. This is one operation.

I choose to turn the goal string, `word`

, into a goal tuple, `goal_perm`

,
because that uses fewer operations.

I use enumerate to associate a position with each
permutation, starting at position `1`

.

How well does the brute force solution scale to larger strings with unique characters?

- For an
`n`-character word, there are`n`! permutations.

So the time complexity of `permutation_position_brute_force`

is

O(

n!)

This means that if running `permutation_position_brute_force`

on a 10-character
word took 1 second, then running the algorithm on a 20-character word would
take over 21,000 years!
There's got to be a quicker way to solve this.

# Counting the number of permutations before the goal

The brute force algorithms counts how many permutations appear before the goal permutation by counting each permutation, one by one. But there is a way to count how many permutations appear before the goal permutation without counting permutations one-by-one.

Here are patterns of the permutations that come before `ERDOS`

:

`D****`

. Strings like`DEORS`

and`DRESO`

. There are 4 unpicked characters in the pattern`D****`

, so there are 24 (4!) permutations of`ERDOS`

that look like`D****`

.`ED***`

. Strings like`EDORS`

and`EDSOR`

. There are 3 unpicked characters in the pattern`ED***`

, so there are 6 (3!) permutations of`ERDOS`

that look like`ED***`

.`EO***`

. Strings like`EODSR`

and`EOSDR`

. There are 3 unpicked characters in the pattern`EO***`

, so there are 6 (3!) permutations of`ERDOS`

that look like`EO***`

.

Because all the patterns are mutually exclusive, I
can sum their counts to get the total number of permutations before `ERDOS`

:

24 + 6 + 6 = 36

There are 36 permutations before `ERDOS`

, so `ERDOS`

is at position 37.

How do I know that the three patterns `D****`

, `ED***`

, and `EO***`

are the
only three patterns of permutations before `ERDOS`

?

- I look at the lexicographic ordering of the characters of
`ERDOS`

. The ordering is`DEORS`

. - I look at the goal permutation,
`ERDOS`

. - I start with the first character,
`E`

. In the lexicographic ordering,`D`

comes before`E`

, so I know`D****`

is a pattern that only counts some permutations that come before`ERDOS`

. (`D****`

definitely doesn't match a permutation that comes after`ERDOS`

.) - I continue to the second character,
`R`

. In the lexicographic ordering,`D`

,`E`

, and`O`

come before`R`

. But`E`

is already chosen as the first character, so only`ED***`

and`EO***`

are patterns that count. - I continue to the third letter,
`D`

. In the lexicographic ordering, nothing comes before`D`

, so I continue to the fourth character. - I keep creating patterns of permutations that come before
`ERDOS`

by examining the fourth character, and then examining the fifth character. - I end up with the patterns
`D****`

,`ED***`

, and`EO***`

.

Here's a Python function, `permutation_position_quick`

, that implements the
above procedure:

`import math def permutation_position_quick(word): N = len(word) first = sorted(word) chars_before = { first[i]: first[:i] for i in range(N) } count = 0 for i in range(N): possible_chars = set(chars_before[word[i]]) possible_chars.difference_update(word[:i]) n_wildcards = N - (i + 1) count += ( len(possible_chars) * math.factorial(n_wildcards) ) return 1 + count permutation_position_quick("ERDOS")`

```
37
```

`permutation_position_quick`

uses set objects to easily
remove patterns that would reuse already chosen characters (e.g. invalid
patterns like `EE***`

).

I confirm that `permutation_position_quick`

produces the same results as `permutation_position_brute_force`

:

`p1 = permutation_position_quick p2 = permutation_position_brute_force all( p1(word) == p2(word) for word in itertools.permutations("ERDOS") )`

```
True
```

What is the time complexity of `permutation_position_quick`

?

- The algorithm loops a number of times linear in the number of characters in the string.
There are O(
`n`) loop iterations. - Each loop iteration creates a set of O(
`n`) characters and removes a subset of O(`n`) characters. The time complexity of set.difference_update in this case is O(`n`).

Because there are O(`n`) loop iterations and each iteration takes
O(`n`) operations, the overall time complexity of
`permutation_position_quick`

is

O(

n^{2})

This means that if running `permutation_position_quick`

on a 10-character word
took 1 second, then running the algorithm on a 20-character word would take 4
seconds.
(Compare this with `permutation_position_brute_force`

taking over 21,000
years.)

# In conclusion...

In this post you learned how to use combinatorics to efficiently solve the "Permutation Rank" problem. The key idea is that you don't need to count every permutation one-by-one. To learn more about combinatorics, check out these documents:

- "The Mathematics of 2048: Counting States with Combinatorics" by John Lees-Miller.
- "Combinatorics and Probabilities" by GĂ©rard P. Michon, Ph.D.

My challenge to you:

This post covered solving the "Permutation Rank" problem for words with unique characters, like "ERDOS". How would you solve the "Permutation Rank" problem for words with duplicated characters, like "TERRENCE"?

If you enjoyed this post, let me know. Share this with your friends and stay tuned for next week's post. See you then!