22 February 2022

Fiscal fibrillations (18/02/22 Riddler Classic)

Original problem

If we start with $n$ coins, the first toss will reduce the number to $k$ which is distributed binomially. If $p_n$ is the probability that there exists a lucky coin if we start with $n$ coins, then a recurrence relation exists:$$p_n=2^{-n}\sum_{k=0}^n{n \choose k}p_k$$where $p_0=0$ and $p_1=1$ are the base cases. Isolating for $p_n$ gives $$p_n=\frac{1}{2^n-1}\sum_{k=0}^{n-1} {n \choose k}p_k$$

Plotting $p_n$ up to $n=10$ generates the following graph which seems to imply $p_n$ converges.

However, visual inspection of $p_n$ values up to $n=1000$ shows a clear oscillatory tendency

which can also be plotted on a linear-log scale:

The periodic nature on a log-scale makes sense. Given a large enough $n$, we expect the first toss to almost-exactly halve the number of coins, and so $p_{2n}\approx p_{n}$, with the approximation being better as $n$ increases by the law of large numbers. However, questions remain:

At the moment, I can only answer the first dot point.

$p_n$ oscillates around $\frac{1}{2\ln(2)}\approx0.721348$

To arrive at this value analytically, we want to reframe how we express $p_n$.

Suppose we modify the game slightly such that we toss all $n$ starting coins with every toss, even coins that came up tails; we just make those coins ineligible to be the lucky coin.

Now suppose a lucky coin is crowned after exactly $(t+1)$ tosses. This happens if and only if both of the following occur:

Hence the probability that a lucky coin is crowned on the $(t+1)$th toss is the product of the above two probabilites, summed across all $k\in[2,n]$.

$$\begin{align*}\sum_{k=2}^n{n\choose k}(2^{-t})^k(1-2^{-t})^{n-k}2^{-k}k&=\sum_{k=2}^n{n\choose k}(2^{-(t+1)})^k(1-2^{-t})^{n-k}k\\&=\sum_{k=0}^n{n\choose k}(2^{-(t+1)})^k(1-2^{-t})^{n-k}k\;\;\;-n2^{-(t+1)} (1-2^{-t})^{n-1}\\&=n2^{-(t+1)} (1-2^{-(t+1)})^{n-1}-n2^{-(t+1)} (1-2^{-t})^{n-1}\\&=\frac{n}{2}2^{-t}\Big[(1-{2^{-(t+1)}})^{n-1}-(1-2^{-t})^{n-1}\Big]\end{align*}$$

Note that in the third line, we used the identity $\sum_{k=0}^n {n\choose k}x^ky^{n-k}k=nx(x+y)^{n-1}$. Now, summing the above over all possible values of $t\in[0,\infty)$ gives $p_n$

$$\begin{align*}p_n&=\sum_{t=0}^\infty\frac{n}{2}2^{-t}\Big[(1-{2^{-(t+1)}})^{n-1}-(1-2^{-t})^{n-1}\Big]\\&=\frac{n}{2}\sum_{t=0}^\infty 2^{-t}(1-2^{-(t+1)})^{n-1}-\frac{n}{2}\sum_{t=0}^\infty 2^{-t}(1-2^{-t})^{n-1}\\&=\frac{n}{2}\sum_{t=1}^\infty 2^{-(t-1)}(1-2^{-t})^{n-1}-\frac{n}{2}\sum_{t=0}^\infty 2^{-t}(1-2^{-t})^{n-1}\\&=\frac{n}{2}\sum_{t=0}^\infty  2^{-t}(1-2^{-t})^{n-1}\end{align*}$$

We can approximate this sum by an integral, which luckily for us is actually extremely easy to perform.

$$\begin{align*}p_n&\approx \frac{n}{2}\int_0^\infty 2^{-t}(1-2^{-t})^{n-1}dt\\&=\frac{n}{2}\int_0^\infty e^{-t\ln(2)}(1-e^{-t\ln(2)})^{n-1}dt\\&=\frac{1}{2\ln(2)}\Big[(1-e^{-t\ln(2)})^n\Big]^\infty_0\\&=\color{red}{\boxed{\frac{1}{2\ln(2)}}}\\&\approx 0.7213475\end{align*}$$

This doesn't actually tell us if the sequence $\{p_n\}$ actually converges, however; it is merely an approximation for any individual $p_n$. The sinusoidal behaviour also isn't particularly apparent. Without knowing the answers to these, I'm reluctant to make a call on what $p_{10^6}$ might be.