29 June 2021 · Problems Maths

Riddler Solitaire (25/06/21 Classic)

Original puzzle

Strategy

There are $N+1$ cards with face values from \(1\) to \(N\) plus a joker in my solitaire deck. An optimal strategy should be to keep flipping if the expected score from flipping the next card is greater than the expected loss.

Suppose we have already flipped \(f\) cards with an accumulated score of \(s\), with \(N+1-f\) cards remaining. If I flip the next card, and it is the joker with probability \(\frac{1}{N+1-f}\), then I will lose \(s\) (for a net loss of \(\frac{s}{N+1-f}\)) if I flip the joker.

If I don't flip the joker, with probability \(\frac{N-f}{N+1-f}\), then I stand to gain \(\frac{1}{N-f}\cdot\Big(\frac{N(N+1)}{2}-s\Big)\)
, which is the average face value remaining. This gives a net expected gain of \(\frac{1}{N+1-f}\cdot\Big(\frac{N(N+1)}{2}-s\Big)\)

Setting the expected gain to be at least the expected loss, we find that we should flip if
$$\begin{align*}\frac{s}{N+1-f}&\leq\frac{1}{N+1-f}\cdot\Big(\frac{N(N+1)}{2}-s\Big)\\\therefore 2s &\leq \frac{N(N+1)}{2}\end{align*}$$which means we should keep flipping if our accumulated score is at most \(\frac{N(N+1)}{4}\). In the case of \(N=10\) of this problem, this threshold works out to be \(27.5\).

Expected score

Later on I show that there is a remarkably accurate (and cute!) approximation for the expected score:\[\color{red}{\boxed{\mathbb{E}(\text{score})\approx\frac{1}{8}N^2 + \frac{7}{24}N}}\]but before that, we need to do some programming in order to obtain exact values, via exhaustive searching.

Attempt 1: \(O(N!)\) time via dumb bashing

An algorithm to obtain an exact value for the expected score is as follows. Take some arbitrary ordering of the value-cards (ignoring for now the placement of the joker). We let \(m\) be the minimum number of cards we need to flip to exceed the threshold value. For example, if the ordering is \((2,5,3,7,8,10,1,4,9,6)\), then picking the first \(m=6\) cards is sufficient to exceed \(27.5\) (with a score of \(35\)). For later purposes, define \(M\) to be the ordered set of the first \(m\) cards.

Now, we consider where to insert the joker. If the joker is inserted after the first value-card, or the second value-card, up to the \((m-1)\)th value-card, then the optimal strategy will end up flipping the joker for a total score of \(0\). However, if the joker is placed after the \(m\)th value-card all the way up to the \(N\)th value-card, then the total score is simply \(\text{sum}(M)\) as you'll never flip the joker. There are \((N+1-m)\) cases that fit this case.

Hence, for a given ordering of value-cards, if \(\text{sum}(M)\) is the sum of the first \(m\) cards, then the expected score is \(\text{sum}(M)\cdot(N+1-m)\). We simply add up this number over all permutations of value-cards and obtain our expected score.

Here is some code that can do this. For the problem of \(N=10\), the expected score is exactly \[\color{red}{\boxed{\mathbb{E}(\text{score})=\frac{616 838 400}{11!}=\frac{10709}{693}\approx 15.453}}\]

import itertools as it
import math

def runN(N):
  scoreSum = 0
  thresh = N*(N+1)/4
  x = it.permutations([i for i in range(1,N+1)])
  for order in list(x):
    m = 0
    curSum = 0
    while curSum <= thresh:
      curSum += order[m]
      m += 1
    scoreSum += sum(order[0:m]) * (N+1-m)

  return scoreSum

for N in range(1,11):
  result = runN(N)
  print(f'N={N} value-cards; {result} total points from {N+1}! permutations; average score {result/math.factorial(N+1)}.')

Output:

N=1 value-cards; 1 total points from 2! permutations; average score 0.5.
N=2 value-cards; 7 total points from 3! permutations; average score 1.1666666666666667.
N=3 value-cards; 48 total points from 4! permutations; average score 2.0.
N=4 value-cards; 384 total points from 5! permutations; average score 3.2.
N=5 value-cards; 3336 total points from 6! permutations; average score 4.633333333333334.
N=6 value-cards; 31704 total points from 7! permutations; average score 6.29047619047619.
N=7 value-cards; 330624 total points from 8! permutations; average score 8.2.
N=8 value-cards; 3761856 total points from 9! permutations; average score 10.366666666666667.
N=9 value-cards; 46402560 total points from 10! permutations; average score 12.787301587301588.
N=10 value-cards; 616838400 total points from 11! permutations; average score 15.453102453102453.

This above solution works OK for low \(N\), but it needs to iterate over \(N!\) permutations, which is not ideal. Optimising it to prune out duplicate permutations is still wildly inefficient. We can do better.

Attempt 2: Pseudo-polynomial (?) time via the subset-sum algorithm

Take an arbitrary ordering of cards, once again ignoring the joker. Recall that \(M\) is the ordered set of cards that just exceeds the threshold; that is, removing the \(m\)th (and final) card will cause the sum of \(M\) to drop below the threshold. This problem then basically reduces to one in which we just need to account for every possible \(M\).

Suppose the last element of \(M\) is \(M_m\). Let \(L\) be the subset of \(M\) without the final card \(M_m\). Hence \(M\) is \(L\cup {M_m}\). Importantly, we note that for the purposes of scoring our strategy, it does not matter how \(L\) is ordered! For example, for the example ordering I gave earlier, we have the sets \(M={2,5,3,7,8,10}\) and \(L={2,5,3,7,8}\) with \(M_m=10\). \(L\) can be rearranged however, but we will still end up with the same \(\text{sum}(M)=35\).

Therefore, this problem becomes a subset-sum problem. Namely, we want to count the number of different \(L\)'s that sum to a specific target number. That is, given a truncated list of integers \([1,N]\backslash {M_m}\), how many subsets of that truncated list add to the target? The motivation is that if we set, say, \(M_m=10\), we know that the possible values for \(\text{sum}(L)\) must be between \(18\) and \(27\). So, we just solve the subset-sum problem for the range \([1,9]\) for each of those target sums in \([18,27]\). Then, we iterate through each of the different possible values for \(M_m\), giving us a complete accounting of \(M\).

Once this is done, we can use the same joker-insertion logic from Attempt 1 to translate the collection of \(M\)'s to a final expected score. Can this be considered an "analytic" solution? I don't think so as I can't seem to find an exact closed-form expression for the expected score.

I think this runs in pseudopolynomial (\(O(N^5)\) or \(O(N^6)\)-ish) time, stemming from the subset-sum algorithm. This code lets us get up to \(N=25\) with only about an hour or so of runtim

import math

# These functions are for setting up the subset sum algorithm

def generateTable(target, workingSet):
  workingSet = sorted(workingSet)
  table = [[0 for i in range(target+1)] for j in range(len(workingSet) + 1)]
  table[0][0] = 1
  for row in range(1, len(table)):
    for subtarget in range(1, target+1):
      if workingSet[row-1] == subtarget:
        table[row][subtarget] = 1
      elif workingSet[row-1] < subtarget:
        table[row][subtarget] = int(0<sum([table[j][subtarget-workingSet[row-1]] for j in range(row)]))
  return table

def generateSubsets(generatedTable, workingSet):
  table = [[None for i in generatedTable[0]] for j in generatedTable]
  table[0][0] = [[]]
  for row in range(1,len(workingSet) + 1):
    for column in range(1,len(generatedTable[0])):
      if generatedTable[row][column]:
        table[row][column] = []
        for preRow in range(row):
          if generatedTable[preRow][column - workingSet[row-1]]:
            for j in table[preRow][column - workingSet[row-1]]:
              table[row][column] += [j + [workingSet[row-1]]]
  return table

def generateMDict(generatedSubsets, target):
  mDict = {i:0 for i in range(N+1)}
  for row in generatedSubsets:
    if row[target] is not None:
      for i in row[target]:
        if i is not None:
          mDict[len(i)] += 1
  return mDict
  
# This function runs the main search

def runN(N):
  thresh = N*(N+1)/4
  totalScore = 0
  for Mm in range(1, N+1):
    workingSet = [i for i in range(1,N+1) if i != Mm]
    maxTarget = int(thresh)

    for target in range(max(int(thresh)+1-Mm, 0), maxTarget+1):
      mDict = generateMDict(generateSubsets(generateTable(maxTarget, workingSet), workingSet), target)
      for m in mDict:
        totalScore += math.factorial(m)*mDict[m]*(target + Mm)*math.factorial(N-m)
  return totalScore

for N in range(1,26):
  result = runN(N)
  print(f'N={N}; {result} total points; average score {result/math.factorial(N+1)}.')

Output:

N=1; 1 total points; average score 0.5.
N=2; 7 total points; average score 1.1666666666666667.
N=3; 48 total points; average score 2.0.
N=4; 384 total points; average score 3.2.
N=5; 3336 total points; average score 4.633333333333334.
N=6; 31704 total points; average score 6.29047619047619.
N=7; 330624 total points; average score 8.2.
N=8; 3761856 total points; average score 10.366666666666667.
N=9; 46402560 total points; average score 12.787301587301588.
N=10; 616838400 total points; average score 15.453102453102453.
N=11; 8797386240 total points; average score 18.366089466089466.
N=12; 134081533440 total points; average score 21.53221223221223.
N=13; 2175204810240 total points; average score 24.951221001221.
N=14; 37422413683200 total points; average score 28.617532467532467.
N=15; 680655583641600 total points; average score 32.53177933177933.
N=16; 13053127797964800 total points; average score 36.69831083948731.
N=17; 263242313297510400 total points; average score 41.11636174283233.
N=18; 5569244331923865600 total points; average score 45.782726252076095.
N=19; 123343025775943680000 total points; average score 50.69790125595079.
N=20; 2854167364105150464000 total points; average score 55.8644496026849.
N=21; 68881035148383338496000 total points; average score 61.282020061122225.
N=22; 1730753555261889134592000 total points; average score 66.94849275177044.
N=23; 45208431687862641623040000 total points; average score 72.86412788166052.
N=24; 1225861780409464256593920000 total points; average score 79.03069953826852.
N=25; 34460451676855139372236800000 total points; average score 85.44800720696871.

Regression

Running these results through a quadratic regression yields the following coefficients:\[\mathbb{E}(\text{score})\approx0.1251\cdot N^2 + 0.2879\cdot N + 0.061\]with \(R^2=1\). Regressions with higher-order polynomials yielded nonsignificant coefficients for terms greater than \(N^2\). Thus we can fairly confidently claim that there truly is an underlying quadratic relationship.

Limiting behaviour and analytic attempt

How does the expected score behave as \(N\) grows, and can the above regression be backed by theory?

We first consider, once again, a joker-less deck of \(N\) value cards. If we adopt our strategy of picking just enough cards to exceed the threshold of \(\frac{N(N+1)}{4}\) points, what is our expected score (ignoring the joker)? A naive answer, but still asymptotically accurate, is simply this threshold value. However, we can do better, because our score should always exceed the threshold value. We attempt to calculate this additional term by using some symmetry.

Consider an arbitrary ordering of \(N\) cards, of which the subset \(M\) is the first \(m\) cards that just exceeds the threshold. We can reverse the ordering, obtaining another valid ordering. In this reversed order, the first \(N-m+1\) cards will therefore be our chosen set. In both cases, we have chosen sets that include the terminal member \(M_m\). To explicitly illustrate, suppose we have an ordering \[(\color{red}{2, 5, 3, 7, 8, 10}\color{black},1,4,9,6)\] where the chosen set \(M\) is in red. Now, we observe its reverse \[(\color{red}{6,9,4,1,10}\color{black},8,7,3,5,2)\]Clearly, the two chosen sets share the value \(M_m=10\), but are otherwise disjoint. Furthermore, by symmetry, their union is the full range of \([1,N]\). This allows us to deduce that the expected score over all orderings should be given by \[\frac{1}{2}\Big(\frac{N(N+1)}{2}+\mathbb{E}(M_m)\Big)\]where the stuff inside the large bracket is the sum of the red numbers in the above two orderings.

Working out \(\mathbb{E}(M_m)\) exactly is tricky. Heuristically, I estimate that the probability mass function of \(M_m\) should be triangular; that is, the probability of a certain \(M_m\) is proportional to \(M_m\) itself. This is because the number of cases that permit a certain integer to be \(M_m\) is proportional to its value. For instance, in our \(N=10\) case, the ace (Card \(1\)) can only be \(M_m\) if my pre-threshold card sum was \(\text{sum}(L)=27\), increasing to \(28\) if I flip the ace. However, for the Card \(2\), it is able to be \(M_m\) if my pre-threshold card sum is either \(\text{sum}(L)=26\) or \(27\). Of course, this is not entirely accurate, but I hope in the large-\(N\) limit, this is a reasonably close estimate.

A triangular distribution like this has an expected value located at \(\mathbb{E}(M_m)=\frac{2}{3}N\). Subbing in this value into the expression above, we obtain: \[\frac{1}{2}\Big(\frac{N(N+1)}{2}+\mathbb{E}(M_m)\Big)=\frac{1}{4}N^2 + \frac{7}{12}N\]

We're not done yet! We still need to account for the Joker. Heuristically, I claim that in the \(N\to\infty\) limit, the joker should be picked around half the time. To justify this, I appeal once again to symmetry. Consider again the two paired orderings \((\color{red}{2, 5, 3, 7, 8, 10}\color{black},1,4,9,6)\) and its reverse \((\color{red}{6,9,4,1,10}\color{black},8,7,3,5,2)\). Suppose you slip the jokers into both orderings such that they remain reverses of each other (for example, slip it in between 7 and 3 in both). In one case, the Joker would sink your efforts, but in the other case, you are safe from the joker. Thus, for all such pairs, the joker is guaranteed to ruin your day in exactly one of the pair.

So, our expected score is half of the above expression, giving a final estimate of \[\color{red}{\boxed{\begin{align*}\mathbb{E}(\text{score})&\approx\frac{1}{8}N^2 + \frac{7}{24}N\\&\approx 0.125\cdot N^2 + 0.2917\cdot N\end{align*}}}\]

Fairly close to the regression! In fact, we can tabulate the relative error between estimate and actual result.

\(N\) Actual score Estimated score Relative error
1 0.5 0.41667 -16.67%
2 1.16667 1.08333 -7.14%
3 2 2 0%
4 3.2 3.16667 -1.04%
5 4.63333 4.58333 -1.08%
6 6.29048 6.25 -0.64%
7 8.2 8.16667 -0.41%
8 10.36667 10.33333 -0.32%
9 12.7873 12.75 -0.29%
10 15.4531 15.41667 -0.24%
11 18.36609 18.33333 -0.18%
12 21.53221 21.5 -0.15%
13 24.95122 24.91667 -0.14%
14 28.61753 28.58333 -0.12%
15 32.53178 32.5 -0.1%
16 36.69831 36.66667 -0.09%
17 41.11636 41.08333 -0.08%
18 45.78273 45.75 -0.07%
19 50.6979 50.66667 -0.06%
20 55.86445 55.83333 -0.06%
21 61.28202 61.25 -0.05%
22 66.94849 66.91667 -0.05%
23 72.86413 72.83333 -0.04%
24 79.0307 79 -0.04%
25 85.44801 85.41667 -0.04%

Looks good; the theoretical estimated score does seem to converge (slowly) onto the actual score. I don't think this is a particularly rigourous analytic attempt, but it'll do for now.