Hi Joshua,
This is just going to be a rather brief holding response as I have only got part way through my first read of the ARGweaver paper, and a series of mini-crises in my lab are taking up much of my time now, so it may be a while before I can give you a fully considered response. I don’t want to leave you waiting for too long, so here is a quick reply, with all the shortcomings that this necessitates.
I agree with you that the ARGweaver results, given the assumptions and simplifications behind its analyses, do appear, on your further analyses of its graphs, to give a reasonable bound for a bottleneck of 420 kya +/- 100 kya. I have not worked through all of the analyses a second time; I say this only because what you have described seems reasonable to me. To be perfectly honest, I am quite surprised at how low this figure is. If you had asked me to guess beforehand I would probably have suggested a higher figure.
Having said that, on my rather shallow reading of the work so far, I would be slightly cautious, in that just having 4 lineages left in a population does not mean that those 4 lineages are all found in just two individuals, as I think you have already pointed out. Demonstrating that only four alleles are left in a population is a necessary prerequisite of a bottleneck of two, but is not in itself evidence that a bottleneck of two actually occurred. For the four alleles to coalesce to the point of being in the same human bodies may take quite some while and could add a bit to the timing (this depends on effective population size, of course, as has been frequently noted in the discussion above). I think I would therefore have more confidence in your lower bound than your upper bound.
My point, "I do think that the coalescent models used in a test of the bottleneck hypothesis would need to include the effective population size decreasing down to two as we go back in time", was a reiteration of a point that I have made several times in the discussion above when discussing the coalescent analyses in the Zhao et al. paper.
In the ARGweaver paper, in a footnote to Table 1, the authors write “Model allows for a separate Ni for each time interval l but all analyses in this paper assume a constant N across time intervals.” It sounds to me as if they use a constant Ne. I have to admit that I find the paper rather confusing on the point of effective population sizes, but you have spent longer than I have working out exactly what they did, so I look to you for enlightenment.
[By the way, I was reflecting on my point “I do think that the coalescent models used in a test of the bottleneck hypothesis would need to include the effective population size decreasing down to two as we go back in time” after I had posted it, and I have a caveat about this. No method of estimating Ne based on genetic diversity (that I am aware of) is capable of identifying a short, sharp bottleneck of two as an Ne of two. That is because every method (that I know of) would need the population size to remain constant for at least a few generations before it could estimate an Ne of 2. Thus, I think we can safely say that when effective population size is defined by an equation based on genetic data, a single generation of census size two does not have an Ne of 2, but of a higher number (exactly what I don’t know - I guess it would depend on the size of the pre-bottleneck population and the rate of population expansion afterwards). I have - in effect - made this point before, but have never quite formulated it in my mind in these terms, so thought it might be worth sharing for discussion/correction.]
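To make that caveat a little more concrete, here is a minimal sketch. It assumes the textbook approximation that long-term Ne under fluctuating population size is the harmonic mean of the per-generation sizes. That approximation, the function names and all the numbers are my own illustrative assumptions; none of it is taken from the ARGweaver paper, and it is not how ARGweaver itself estimates anything.

```python
def harmonic_mean_ne(sizes):
    """Approximate long-term Ne as the harmonic mean of per-generation sizes.

    Assumption: the standard textbook approximation for fluctuating
    population size, not a method used in the ARGweaver paper.
    """
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


def single_generation_bottleneck(pre_size, gens_before, gens_after, growth_factor):
    """Hypothetical trajectory: constant size, one generation of two, then regrowth.

    After the bottleneck the population multiplies by growth_factor each
    generation until it is capped at pre_size.
    """
    sizes = [pre_size] * gens_before + [2]
    current = 2
    for _ in range(gens_after):
        current = min(pre_size, current * growth_factor)
        sizes.append(current)
    return sizes


if __name__ == "__main__":
    # Illustrative values only: 10,000 before, doubling afterwards.
    trajectory = single_generation_bottleneck(
        pre_size=10_000, gens_before=10, gens_after=10, growth_factor=2
    )
    print("Ne over the whole window:", harmonic_mean_ne(trajectory))
    print("Ne if the size stayed at two:", harmonic_mean_ne([2, 2, 2]))
```

Under this approximation the estimate over a window containing a single generation of two is dragged down a long way, but it only reaches exactly 2 if the size stays at two throughout the window. How far above 2 it sits depends on the pre-bottleneck size, the speed of regrowth and the length of the window, which is the dependence I was guessing at above.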
I think the most important take-home message for me from your ARGweaver analyses is that (as far as I can see) you have shown nicely that genome-wide allele counts do not provide evidence that a bottleneck of two has not happened in the human lineage. That is a real step forward in our understanding of this area. Thank you!