Thanks for responding to me again on Zhao et al (2000). I think our discussion of this paper continues to hold value because it is helping us both to engage directly with data. For the time being, therefore, I would prefer not to move on to other papers (nor other topics, for that matter). I would remind you that when you introduced this paper there was no mention that it was a “weaker” source of evidence for your view. Indeed, it appeared to be one of the strongest contenders for an appropriate reference for your statement about allele counting methods in Adam and the Genome. The fact that you continue to think that their dataset and coalescent analysis do support your case is very helpful, because it allows us to get down to a detailed understanding of what evidence you think supports your case. It seems that you have an intuition that three successive mutations of an ancestral haplotype preclude a bottleneck of two in the human lineage. If this were so, then I can see why you would conclude that a bottleneck of two is impossible (with a high degree of certainty). Thus our discussion of this paper is helping me to understand your thinking better.
You are misreading my posts about the Zhao et al paper if you think I have “allowed myself” a time frame. The time frames I am pointing out are those that arise from their coalescent analysis, and from thinking through how a bottleneck followed by a population expansion would affect it.
Please let me repeat my argument (already outlined above), which is based on Zhao et al’s own analysis. In that analysis, all the mutations in the 10kb sequence have occurred within the last 712,000 to 2,112,000 years. The different haplotypes currently found in human populations all coalesce back to one haplotype within this time frame. As I have pointed out, it is well known that in a coalescent analysis the final coalescence events take the longest time. In other words, the coalescence from two ancestral haplotypes to one ancestral haplotype takes longer than the coalescence from three haplotypes to two haplotypes. And the coalescence from three to two takes longer than the coalescence from four to three. And so on. So within their own analysis, this 10kb sequence would be down to four haplotypes roughly 300,000-1,000,000 years before present. Thus in their analysis, three cumulative mutations have occurred in this space of time, and indeed more (remember that there are also mutations present in only one or two individuals, which were not relevant to us when trying to figure out what the ancestral haplotypes could have been).
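To illustrate why the last coalescence events dominate, here is a minimal Python sketch. It assumes the textbook neutral coalescent with a constant effective population size Ne, where the expected wait while k lineages remain is 4Ne/(k(k-1)) generations for a diploid population. That model and the function name `expected_wait` are my own choices for illustration; this is not a reproduction of Zhao et al's analysis.

```python
from fractions import Fraction


def expected_wait(k, n_e=1):
    """Expected generations spent with k lineages before two of them coalesce.

    Assumes the standard neutral coalescent for a diploid population of
    constant effective size n_e (an assumption of this sketch).
    With n_e=1 the result is expressed in units of Ne generations.
    """
    return Fraction(4 * n_e, k * (k - 1))


# Expected waiting times for the last three coalescence events.
waits = {k: expected_wait(k) for k in (4, 3, 2)}

for k, wait in waits.items():
    print(f"{k} -> {k - 1} haplotypes: {float(wait):.3f} x Ne generations")
```

Under these assumptions the expected wait grows at every step back in time: the step from two haplotypes to one takes three times as long as the step from three to two, and six times as long as the step from four to three.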
Their analysis is entirely reasonable. Let’s do a quick back-of-the-envelope calculation. If we say that there were four haplotypes 500,000 years ago, and call this 20,000 generations ago (25 years per generation), in a 10,000bp region with a mutation rate of 1.1 x 10^-9 mutations per bp per generation, with an effective population size of 10,000 in each generation, then we would expect around 2,200 new mutations to occur in total over the 500,000 years. You will recall that the total number of variants that they found in the population was 78. So a great many mutations can be lost via drift, and we would still see the number of variants that they do.
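For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation. All the figures come from the paragraph above; the variable names are mine, and the sketch counts one copy of the region per individual, which is the simplification behind the figure of roughly 2,200.

```python
# Back-of-the-envelope estimate of new mutations in the 10kb region.
# Figures are those quoted in the post; variable names are illustrative.
region_bp = 10_000        # length of the sequenced region in base pairs
mutation_rate = 1.1e-9    # mutations per bp per generation
effective_size = 10_000   # effective population size in each generation
generations = 20_000      # 500,000 years at 25 years per generation

# Assumption: one copy of the region per individual.
# Counting both copies in diploid individuals would double the result.
per_generation = region_bp * mutation_rate * effective_size
total_new_mutations = per_generation * generations

print(f"New mutations per generation: {per_generation:.2f}")
print(f"New mutations over {generations:,} generations: {total_new_mutations:.0f}")
```

On these numbers only a small fraction of the new mutations (78 out of roughly 2,200) needs to survive drift to account for the variants observed.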
Now, their analysis assumes a constant effective population size of 10,000. A bottleneck of two, followed by a population expansion to 7 billion individuals, will obviously look rather different. The question therefore is: will a bottleneck followed by a rapid expansion increase or decrease the time from a coalescence of four haplotypes to the present? A bottleneck increases the rate of coalescence, as you know, which is why I have said that a bottleneck will decrease the likely time from the present back to the coalescence to four haplotypes. I don’t make this point because I am restricting myself to a certain time frame; I make it because it is a simple fact about coalescent analyses. In other words: if there was a bottleneck in our past, all haplotypes in present human populations will (on average) coalesce to four ancestral haplotypes in a shorter length of time than they would if the human population had had a constant effective population size throughout its history.
I think you agree with this point. However, your counter-argument is that low effective population size after the bottleneck will reduce the number of mutations that can happen.
Yes, in a smaller population, fewer new mutations are possible in terms of absolute numbers. But we also have to take two things into account:
(1) a rapid expansion causes a higher proportion of new mutations to be preserved in a population than would be possible in a population of constant size. By virtue of the rapid increase of the population as a whole, new mutations will be held by higher and higher numbers of offspring. If the population expansion is accompanied by a geographical expansion, there is also an effect sometimes called “allele surfing” (reviewed here) which can push new alleles up to high frequencies in newly colonised areas.
(2) the low population size will last only a few generations: a rapidly expanding population will soon reach sizes of well over 10,000 individuals. For example, if the population doubles every generation, within 14 generations we will have 16,384 individuals. Thus, over the course of human history, the small size of the population in the first few generations after the bottleneck will have little impact on the total number of mutations that are possible from the time of the bottleneck until now.
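The doubling example in point (2) can be checked with a short Python sketch. The function name `doublings_to_exceed` and the idealised model (a founding pair, with exact doubling every generation) are assumptions for illustration only.

```python
def doublings_to_exceed(threshold, founders=2, growth=2):
    """Count how many generations of growth take the population past threshold.

    Assumes an idealised population that multiplies by `growth` every
    generation, starting from `founders` individuals.
    Returns (number of doublings, population size reached).
    """
    size = founders
    doublings = 0
    while size <= threshold:
        size *= growth
        doublings += 1
    return doublings, size


doublings, size = doublings_to_exceed(10_000)
print(f"After {doublings} doublings the population is {size:,}")
```

Starting from two individuals, 13 doublings give 16,384, so counting the founding pair as generation 1, the 14th generation already exceeds the effective population size of 10,000 assumed in Zhao et al's analysis.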
Therefore, it seems to me that your intuition that three cumulative mutations would be impossible (i.e. very, very unlikely) after a bottleneck of two early in the human lineage is mistaken. If your intuition were correct, then I would have to agree with you that a bottleneck was more or less an impossibility. But as far as I can see, your intuition is wrong, and Zhao et al’s own analyses show this.