**Somewhere in the South Pacific in the Age of Sail…**

A ship full of blue-eyed sailors wrecks on an island inhabited only by brown-eyed (but friendly) natives. Our sailors settle down in the community, intermarry and have children. Generations later a scientific survey team visits the island and finds that out of 1000 natives, 23 have blue eyes. From the oral history of the islanders and ancient relics from the ship itself, the scientists determine that approximately twenty generations have passed since the ship wrecked on the island. Additionally, the population of the island has held steady at approximately 1000 during this time period. How many sailors were on the ship?

**Somewhere in Indiana in the Information Age…**

A mathematician (that’s me!) ponders the problem and comes to some preliminary conclusions; being a mathematician, he also makes several simplifying assumptions to make the problem more tractable. First, there is no ‘right’ answer. It is impossible to say with complete certainty how many sailors were on the ship, but we can give an answer which is ‘most likely’. In the process we can probably estimate an upper and lower limit; that is, with high confidence we can say that the number of sailors is between n_min and n_max. Second, we need to carefully translate the problem into something suitable for analysis, which brings us to the next section. And finally, before going any further, I should mention that there are genetics principles (such as the Hardy-Weinberg Principle) which touch on this problem; I will not explore such perspectives and will instead employ a simulation approach, which is also somewhat more flexible.

**Dynamics of the Problem**

Assume eye color (blue or brown) is controlled by the factors A and B for blue and brown respectively. Three combinations are possible: AA, AB, and BB. Only AA will have blue eyes; all others have brown. A person receives one copy of the factor from each parent, thereby producing a pair. So a BB dad and an AB mom can produce children which are either BB or AB, but not AA. We assume the factor from each parent is determined randomly (uniformly), so an (AB,BB) couple is equally likely to have an AB or a BB child. Note however that an (AB,AB) couple has a 50% chance of producing an AB child and only a 25% chance each of producing an AA child or a BB child. All sailors are AA type and all islanders are initially BB type.
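The inheritance rule is small enough to sketch directly. What follows is a minimal Python illustration, not the original code; the names `child_genotype` and `offspring_distribution` are hypothetical. Each person is a two-letter string, and a child takes one randomly chosen letter from each parent.

```python
import random
from collections import Counter
from itertools import product


def child_genotype(mom, dad, rng=random):
    """Draw one factor uniformly from each parent; return 'AA', 'AB' or 'BB'."""
    a, b = rng.choice(mom), rng.choice(dad)
    # Store the pair in a fixed order so that 'BA' and 'AB' are the same type.
    return a + b if a <= b else b + a


def offspring_distribution(mom, dad):
    """Exact child probabilities, from the four equally likely factor draws."""
    counts = Counter(min(a, b) + max(a, b) for a, b in product(mom, dad))
    return {genotype: n / 4 for genotype, n in counts.items()}


if __name__ == "__main__":
    for couple in [("AA", "BB"), ("AB", "BB"), ("AB", "AB")]:
        print(couple, offspring_distribution(*couple))
```

Enumerating the four draws for each couple reproduces the probabilities quoted in the paragraph above, and `child_genotype` is the random version a simulation would call.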

The above dynamics are essential for the problem, but to fix ideas we introduce a few other simplifications. First, as mentioned above, the island population is assumed to hold constant at 1000; each generation will pair off into exactly 500 couples, each of which has exactly 2 children. Apparently the parents perish immediately after producing twins, but such simplifications are typical when we seek to tackle a problem. These simplifications can be eliminated once we have a grip on the essentials.

**Preliminary Analysis**

Even with the assumptions above the problem remains quite challenging. Each sailor (we’re assuming they’re all men) will take a brown-eyed bride and live happily ever after in a tropical paradise – at least until they have twins. The first generation of children will all be AB since this is the only possible offspring of an (AA,BB) couple. So all their children will be brown-eyed but, unlike the islanders prior to the shipwreck, they carry the ‘A’ factor and can produce blue-eyed children of their own.

But after that first generation the situation becomes messy. Each child will marry another islander at random and produce two children. If it’s an (AB,BB) couple then only AB or BB children are possible, but if two AB children marry then any outcome is possible, albeit not with the same probability. Now if we fix the initial number of sailors, say twenty, then there will be 40 AB kids and it is possible to compute probabilities regarding the second generation (the sailors’ grandchildren). But while possible, this is quite difficult to do exactly.

As an example, what is the probability of BLUE=40 (that is, 40 blue-eyed natives in the second generation)? By conditional probability, P(BLUE=40) = P(BLUE=40|ALL MATCH)P(ALL MATCH) + P(BLUE=40|NOT ALL MATCH)P(NOT ALL MATCH). Here P(ALL MATCH) is the probability that all 40 AB kids intermarry within this same group, producing 20 (AB,AB) couples; BLUE=40 then requires each of these couples to have two AA kids. The second term in the equation above is zero since we can’t have BLUE=40 unless all match, so it becomes:

P(BLUE=40) = P(BLUE=40|ALL MATCH)P(ALL MATCH)

Now we can do this for BLUE = 0, 1, 2, …, 40 but this is clearly a tedious and complicated business.
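To give a sense of scale for the BLUE=40 case, here is a small sketch that evaluates it exactly. It rests on assumptions of mine that are not spelled out above: the first generation consists of 40 AB and 960 BB people, and the 500 couples are formed uniformly at random from all 1000 people without regard to sex. Under that model the first AB person draws an AB partner with probability 39/999, the next unmatched AB person with probability 37/997, and so on; each of the 40 children of the resulting 20 (AB,AB) couples is then AA with probability 1/4. The function name `prob_all_match` is hypothetical.

```python
from fractions import Fraction

N_PEOPLE = 1000  # island population
N_AB = 40        # AB children of twenty sailors


def prob_all_match(n_ab=N_AB, n_people=N_PEOPLE):
    """P(every AB person is paired with another AB person), random pairing assumed."""
    p = Fraction(1)
    for k in range(n_ab // 2):
        # The next unmatched AB person has (n_ab - 1 - 2k) AB candidates
        # among the (n_people - 1 - 2k) people who are still unmatched.
        p *= Fraction(n_ab - 1 - 2 * k, n_people - 1 - 2 * k)
    return p


p_all_match = prob_all_match()
p_blue_given_match = Fraction(1, 4) ** N_AB  # all 40 children must be AA
p_blue_40 = p_blue_given_match * p_all_match

print(f"P(ALL MATCH) ~ {float(p_all_match):.3e}")
print(f"P(BLUE=40)   ~ {float(p_blue_40):.3e}")
```

Both numbers come out astronomically small, and every other value of BLUE needs its own sum over the possible pairings, which is why the exact route gets unpleasant so quickly.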

**Simulation**

So instead, we simulate the process using Python. I actually start at the first generation (the AB kids) since this captures all the information in the problem and we avoid having to write special code to prevent an (AA,AA) pairing among the sailors themselves. I first write the code governing the dynamics of the factors’ inheritance, that is, the probabilistic rules regarding the outcomes of various couple pairings. Next, I write a function to create the next generation of islanders from the previous one. It does this by randomly pairing couples and producing two kids per couple according to the dynamics of inheritance, which of course depend on the couple.
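A minimal sketch of the generation step just described might look like the following. It is not the original code: the names `child_genotype` and `next_generation` are hypothetical, and I am assuming couples are formed uniformly at random without regard to sex.

```python
import random
from collections import Counter


def child_genotype(mom, dad, rng):
    """One factor drawn uniformly from each parent, stored in a fixed order."""
    a, b = rng.choice(mom), rng.choice(dad)
    return a + b if a <= b else b + a


def next_generation(counts, rng=random):
    """Build the next generation's type counts from the current ones."""
    # Expand the counts into one entry per islander.
    people = [g for g, n in counts.items() for _ in range(n)]
    # Sampling the whole list without replacement is a random shuffle;
    # neighbours in the shuffled list become couples.
    shuffled = rng.sample(people, len(people))
    kids = Counter({"AA": 0, "AB": 0, "BB": 0})
    for mom, dad in zip(shuffled[::2], shuffled[1::2]):
        for _ in range(2):  # exactly two children per couple
            kids[child_genotype(mom, dad, rng)] += 1
    return dict(kids)


if __name__ == "__main__":
    first_generation = {"AA": 0, "AB": 40, "BB": 960}  # children of 20 sailors
    print(next_generation(first_generation, random.Random(1)))
```

Because every couple has exactly two children, the population size is preserved automatically from one call to the next.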

I should mention for those of you new to Python that a good way to do this is not to store a long list of ‘AA’, ‘AB’ and ‘BB’ strings, each representing an islander. It’s better to encapsulate the information in a simple dictionary storing each type and the number of islanders of that type: dic = {'AA': 17, 'AB': 187, 'BB': 796}. One can always recover a list from this with the one-liner [d for d in dic.keys() for k in range(dic[d])] and use the list for the random pairings, since random.sample takes a list as an argument.
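Concretely, the round trip between the two representations might look like this. It is only a sketch; using `collections.Counter` to rebuild the dictionary is my choice and is not mentioned above.

```python
import random
from collections import Counter

dic = {"AA": 17, "AB": 187, "BB": 796}

# One list entry per islander, using the one-liner from the text.
people = [d for d in dic.keys() for k in range(dic[d])]

# random.sample draws without replacement, so asking for all 1000
# entries returns the islanders in a random order.
shuffled = random.sample(people, len(people))

# Neighbours in the shuffled list form the 500 couples.
couples = list(zip(shuffled[::2], shuffled[1::2]))

# Counting the list recovers the compact dictionary.
recovered = dict(Counter(shuffled))
print(len(people), len(couples), recovered == dic)
```

The dictionary is what gets stored between generations; the list only exists for the moment the couples are drawn.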

After this basic setup we can run this 20-generation simulation 100 or so times and adjust the starting value of ‘AB’ to find a value which produces the 23 observed blue-eyed islanders in the future. In my simulation this happens for about 300 ‘AB’ people. Since each sailor has exactly two AB children, this would mean that there were 150 sailors originally.
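The experiment can be sketched as follows. This is a self-contained illustration rather than the code behind the numbers above: the function names, the seed, the number of runs, the candidate starting values and the choice of 20 generation steps after the AB kids are all assumptions of mine. For brevity it keeps the population as a plain list.

```python
import random
from statistics import mean

POPULATION = 1000
GENERATIONS = 20  # assumed: 20 steps starting from the AB children


def step(people, rng):
    """One generation: shuffle, pair off neighbours, two children per couple."""
    rng.shuffle(people)  # shuffles the caller's list in place
    kids = []
    for mom, dad in zip(people[::2], people[1::2]):
        for _ in range(2):
            a, b = rng.choice(mom), rng.choice(dad)
            kids.append(a + b if a <= b else b + a)
    return kids


def blue_after(n_ab, rng, generations=GENERATIONS):
    """Blue-eyed (AA) count after the given number of steps from n_ab AB kids."""
    people = ["AB"] * n_ab + ["BB"] * (POPULATION - n_ab)
    for _ in range(generations):
        people = step(people, rng)
    return people.count("AA")


def average_blue(n_ab, runs=100, seed=0):
    """Average final blue-eyed count over several independent runs."""
    rng = random.Random(seed)
    return mean(blue_after(n_ab, rng) for _ in range(runs))


if __name__ == "__main__":
    for n_ab in (200, 300, 400):
        print(n_ab, "AB kids ->", average_blue(n_ab, runs=20), "blue-eyed on average")
```

As a rough cross-check on the result, 300 AB people in a population of 1000 means the A factor makes up 300 of the 2000 factors, or 15%. Under random pairing a child receives A from both parents with probability 0.15 × 0.15 = 0.0225, which is about 22.5 blue-eyed islanders per 1000 and in line with the observed 23.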

Many other interesting questions can be posed in this context. For example, consider the possibility of an extinction of the blue-eyed trait. This is likely if the initial number of sailors is too small. How many sailors are needed to ensure that 50% of the time the trait does not become extinct in 20 generations?
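One way to explore the extinction question is sketched below, under my own assumptions: ‘extinct’ is taken to mean that no A factor remains anywhere in the population (so no blue-eyed child can ever be born again), each sailor contributes two AB children, and the run counts and candidate sailor numbers are arbitrary. All function names are hypothetical.

```python
import random

POPULATION = 1000
GENERATIONS = 20  # assumed: 20 steps starting from the AB children


def step(people, rng):
    """One generation: shuffle, pair off neighbours, two children per couple."""
    rng.shuffle(people)
    kids = []
    for mom, dad in zip(people[::2], people[1::2]):
        if mom == dad == "BB":
            # Two BB parents can only have BB children; no random draw needed.
            kids += ["BB", "BB"]
            continue
        for _ in range(2):
            a, b = rng.choice(mom), rng.choice(dad)
            kids.append(a + b if a <= b else b + a)
    return kids


def a_factor_lost(n_sailors, rng, generations=GENERATIONS):
    """True if the A factor disappears within the given number of generations."""
    n_ab = 2 * n_sailors
    people = ["AB"] * n_ab + ["BB"] * (POPULATION - n_ab)
    for _ in range(generations):
        people = step(people, rng)
        if people.count("BB") == len(people):
            return True  # once A is gone it can never come back
    return False


def extinction_probability(n_sailors, runs=100, seed=0):
    """Fraction of simulated histories in which the A factor is lost."""
    rng = random.Random(seed)
    return sum(a_factor_lost(n_sailors, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    for n_sailors in (1, 2, 3):
        print(n_sailors, "sailors ->", extinction_probability(n_sailors, runs=50))
```

Scanning `n_sailors` upward until the estimate drops below 0.5 gives a simulated answer to the question; more runs tighten the estimate at the cost of running time.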

The code can also be modified easily to accommodate a non-constant population, one trait having larger average family sizes, pairings which favor same eye colors to some degree, etc… Well, that’s all for now, thanks for reading!