17 August 2021

Original puzzle

It's very obvious that the optimal strategy is as follows. Suppose we have $N$ dice, which we toss. Then:

1. Freeze the highest-scoring die
2. For the remaining $N-1$ dice, we must determine which ones to rethrow and which to freeze. We check every possible freezing pattern (obviously freezing from large to small), and rethrow the number of dice that maximises the expected score.

This strategy is fairly straightforward to implement, if not particularly elegant. Running an exhaustive search of all possibilities returns the following expected score $S_N$ for some low $N$.

$N$ $S_N$
$1$ $\frac{7}{2}=3.5$
$2$ $\frac{593}{72}\approx 8.2361$
$3$ $\frac{13049}{972}\approx 13.4249$
$4$ $\frac{989065}{52488}\approx 18.8436$
$5$ $\frac{1108166095}{45349632}\approx 24.4361$
$6$ $\frac{332273594663}{11019960576}\approx 30.1520$
$7$ $\frac{4621176159903031}{128536820158464}\approx 35.9522$

Interestingly, analysing the specific playthrough for each starting dice configuration, we note that $1$'s are always rethrown (if possible), while $6$'s are never rethrown. We may even be tempted to write this down as true for all $N$. After all, it makes perfect sense, right? The former is easy to understand; there is never a situation in which we should settle for a $1$.

The latter, however, is a little more difficult to intuitively justify. It it really true that we should freeze all $6$'s? Heuristically, we observe that the more dice there are to play with, the higher our expected score, because we have more turns to "optimise" our score. So, perhaps it is worth sacrificing a guaranteed 6 in order to buy more turns for the other dice, hoping that they'll increase in value.

### Freezing all $6$'s

Let's adopt a strategy in which we freeze all $6$'s. For simplicity, let's also choose $N$ large enough such that we rethrow all $5$'s or lower if we can (this in itself is difficult to justify). Two different scenarios can occur for each configuration of $N$ dice.

1. There are some $6$'s. Under our strategy, we freeze all of them and nothing else
2. There are no $6$'s. In this case the clear strategy is to freeze only the highest dice (since we want to preserve as much free dice as possible)
###### Scenario 1

Suppose within $N$ dice, there are $k$ $6$'s. Our strategy tells us to freeze all of them, leaving $N-k$ dice to rethrow. This yields an expect score of $S_{N-k}+6k$. Given that there are ${N\choose k}5^{N-k}$ configurations with $k$ $6$'s, we can simply sum this up over $k$:

\begin{align*}&\;\;\;\;\;\sum_{k=1}^N{N\choose k}5^{N-k}(S_{N-k}+6k)\\&=6^N N + \sum_{k=1}^N{N\choose k}5^{N-k}S_{N-k}\end{align*}

##### Scenario 2

Supppose the highest die in a certain configuration of $N$ dice is a $1$. There are $1^N$ configurations that satisfy this. The expected score, after freezing one of these $1$'s and rerolling the rest, will thus be $1^N\cdot(S_{N-1}+1)$.

Now suppose the highest die is a $2$. There are $(2^N - 1^N)$ configurations that satisfy this. The expected score, after freezing one of these $2$'s and rerolling the rest, will thus be $(2^N-1^N)\cdot (S_{N-1}+2)$.

Continuing in this way, suppose the highest die is a $5$, with $(5^N-4^N)$ configurations. The expected score, after freezing one of these $5$'s and rerolling the rest, will thus be $(5^N-4^N)\cdot (S_{N-1}+5)$.

The sum of all these expected scores turns out to be \begin{align*}&\;\;\;\;\;5^NS_{N-1}+5^{N+1}-4^N-3^N-2^N-1\\&=5^N(S_{N-1}+5)-\sum_{k=1}^4 k^N\end{align*}

##### A surprising twist

So, combining the two scenarios, and weighting each dice configuration with $6^{-N}$, we arrive at a recursive expression for $S_N$ that is valid if our strategy holds:$$S_N=N + 6^{-N}\Big[\sum_{k=1}^N{N\choose k}5^{N-k}S_{N-k}+5^N(S_{N-1}+5)-\sum_{k=1}^4 k^N\Big]$$

Here, we ask whether there exists $N$ such that $S_{N+1}> S_N+6$. If this were the case, then we will have frozen too many $6$'s, because letting one roam free yields a better score than freezing it...

... and indeed, the differences between successive terms of $S_N$ hover very close to $6$, until at $N=124$, we have $S_{124}=737.4020240410657$ and $S_{125} = 743.4020240410659$. This difference is $6 + 2\times 10^{-13}$.

Now, whether we can trust this is questionable, as my code that executes the above recursive expression probably runs into floating point errors. But, what this means is that it is very possible that our simplistic strategy of freezing all $6$'s and nothing else is not the optimal strategy.

It's probably possible to write better code to execute the above (namely, using rational numbers), which I will try to do. Hopefully that'll eliminate any pesky floating point errors.

##### Underperformance

A strategy that freezes all and only $6$'s seems to underperform the actual real (boring) strategy as outlined at the start. We use Goh Pi Han's list of calculated $S_N$, which we use to populate our $S_N$ up to $N=15$, we see that the freezing-6 strategy ever-so-slightly underperforms.

$N$ Actual $S_N$ Freezing-6 $S_N$
$16$ $89.4486902517831$ $89.4486902487524$
$17$ $95.43827743250172$ $95.43827743124307$
$18$ $101.43026786811494$ $101.43026786705988$
$19$ $107.42408670844682$ $107.42408670725361$
$20$ $113.41930274677618$ $113.41930274549294$
##### Approximation

Let's ignore the uncomfortable reality that freezing all 6's might not be the best strategy. But, if it is a good approximation, then the expected score for large $N$ may well be something like $$S_N\approx 6N - 6.597976$$