In October of 2013, Titeuf24 posted a wonderful blog post calculating the odds of obtaining the nth comic cover in m boxes, updating the old probabilities in light of the release of the new cover progress bar. In the release notes, the devs stated:
“You have a chance to receive duplicate covers until you have received a number of duplicates equal to the number of unique covers you currently own … However, each duplicate you receive REDUCES the chance you will get a duplicate on your next lockbox opening.”
I was curious about the size of the resulting increase in the chance of a new cover. Several others on the wiki have suggested that one logical mechanism for this would be to allow at most one duplicate per new cover. It has been shown repeatedly that this is not the case. However, an optimistic model might posit that the odds of receiving a new cover are the same as if it were. To test this model against Titeuf24’s pessimistic model, I have constructed probability curves for both models and compared them against the publicly available data on the 120 vs 10 experiment page. Here are my results.
First, here are the probability curves for each model:
As you can see, the optimistic model suggests you need about 20 fewer boxes on average, and the distribution is much narrower. The available experimental data is inconclusive:
Blue is the experimental data, red is the optimistic model, and yellow is the pessimistic model. The sums of squared errors against the two models are, for statistical purposes, identical (48.7 for the optimistic model, 50.3 for the pessimistic model). The results are inconclusive mainly because 1) there are only 34 participants and 2) the probabilities for the early boxes are mostly insensitive to the choice of model, and data is only available for the first 120 boxes. The most dramatic difference between the two models can be seen in obtaining the 8th cover:
Note that in the pessimistic model, the chance of obtaining the final cover on the 80th lockbox after the 7th cover is nearly 40%, while the optimistic model suggests an even distribution across all sets of 10 boxes opened. Data from even 34 people would easily be sufficient to determine which of these two models is more likely, or whether the real probability is somewhere in between. So to determine the real probabilities, I’ve set up a survey, and I’d appreciate you all filling it out for me. It is a very fast survey. It asks for both the total number of lockboxes required to obtain the hero and the number of boxes required to obtain each cover. The cover information is much more valuable, but if you don’t have it, I can still use the total number of boxes.
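For anyone who wants to check this kind of model comparison themselves, here is a minimal Python sketch of a sum-of-squared-errors calculation like the one mentioned above. The function name and all of the numbers are made up for illustration. They are not the experiment's data or the actual model curves, and this is not necessarily how the 48.7 and 50.3 figures were computed.

```python
def sum_squared_error(observed, predicted):
    """Return the sum of squared differences between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical numbers for illustration only (NOT the experiment's data):
# how many of 10 players obtained a given cover within each block of boxes,
# next to the counts each model would predict for those same blocks.
observed = [3, 4, 2, 1]
optimistic = [3.5, 3.0, 2.0, 1.5]
pessimistic = [2.0, 3.0, 3.0, 2.0]

sse_optimistic = sum_squared_error(observed, optimistic)
sse_pessimistic = sum_squared_error(observed, pessimistic)

print(f"optimistic:  {sse_optimistic:.2f}")
print(f"pessimistic: {sse_pessimistic:.2f}")
```

A lower total means the model's curve sits closer to the data. With a small sample, totals that are close to each other (like 48.7 and 50.3) do not really separate the models.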
Thanks for your help/interest.