Dennis, it is fascinating to see how your use of this paper is evolving as this discussion continues. You first proposed it to me as a citation to back up the piece in your book about allele counting. Then you described it to @tallen_1 as follows:
Now that I have pointed out to you that neither of the parts you highlighted is good evidence against the bottleneck-of-two hypothesis, you are focusing on the final part of the paper, which is not mentioned in the part of the abstract that you copied and pasted for @tallen_1. Now you are saying:
You are quite wrong to think that 75 variants within a 10,000 base pair region could not pass through a bottleneck of two. The 75 variants are all different sites, each of which has two alleles. All 75 of those variants could, in theory, be contained within a single individual. That individual would be heterozygous at the 75 sites. Thus all 75 variants could easily pass through a bottleneck of two. Indeed, a thousand could.
In your comments above, you are writing as if only one allele could pass through a bottleneck of two. This would be true if a bottleneck of two were maintained for many generations, as the extreme inbreeding would eventually eliminate all genetic variation. Thus all heterozygosity would be lost. But, as we have covered before in our dialogue, only 25% of heterozygosity is lost in a short sharp bottleneck of two followed by a rapid expansion. We seem to be right back at the beginning of our dialogue. This is the first thing I pointed out to you, right back in the spring.
In reality, in the case covered in this paper, not all 75 variants would need to be carried through the bottleneck. 24 of them were mutations that were found in only one sequence in the sample, 22 were found in two sequences, and 32 were found in more than two sequences. The 46 variants that were found in two or fewer sequences are likely to be fairly recent mutations, leaving only 32 variants that could perhaps have come through the bottleneck. It could have been fewer, of course, depending on how long ago the bottleneck was - that is not something I am taking a strong position on.
I don't understand how you could have thought that if you understood what Steve was modelling. I challenge you to explain in your own words (without help from Steve) what Steve was modelling, and how it shows that a bottleneck of two is unlikely to have happened.
Richard - are you aware how closely linked those variants are? They're at most 10,000 bases apart. Are you seriously suggesting that they passed through a bottleneck en masse in two individuals and then recombined to the forms we see now?
Hey, why don't you take a break for a while? All I see you doing is running the scientists ragged, hashing through their work... all to have them defend their work from a dozen different angles - on a conclusion that you will probably never accept anyway.
I think most everyone would like to see you at least do some mathematical scenarios... Excel is quite capable of representing generational change, where each row of cells can represent a new generation from a single mating pair in your scenario.
Do some benchmarking... make the numbers work the way your mind thinks they should work.
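For anyone who prefers code to spreadsheets, the suggested exercise can be sketched in a few lines of Python (a toy model; the growth rate, population cap, and random seed are arbitrary choices for illustration, not anyone's published parameters):

```python
import random

def drift_from_pair(growth=2.0, cap=10_000, generations=50, seed=42):
    """Track the frequency of one allele at a single site, starting from
    a founding pair in which both parents are heterozygous (p = 0.5).
    The population multiplies by `growth` each generation up to `cap`
    (all numbers here are assumptions, for illustration only)."""
    rng = random.Random(seed)
    n, p = 2, 0.5
    trajectory = [p]
    for _ in range(generations):
        n = min(int(n * growth), cap)
        k = sum(rng.random() < p for _ in range(2 * n))  # binomial draw of 2n alleles
        p = k / (2 * n)
        trajectory.append(p)
        if p in (0.0, 1.0):  # the allele has been lost or fixed
            break
    return trajectory

traj = drift_from_pair()
print(f"{len(traj) - 1} generations simulated, final frequency {traj[-1]:.3f}")
```

Each loop iteration plays the role of one spreadsheet row: the new generation's allele frequency is a binomial draw from the previous one.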
I really appreciate everyone's efforts to work through this topic. However, it's certainly difficult at times for a non-specialist to follow along. For instance, phrases like "Tajima's D" are technical terms familiar, I'd presume, only to insiders in population genetics, not to the public at large. And points such as yours about why 75 allelic variants being at most 10,000 bases apart has relevance to a bottleneck would fly over most people's heads (it certainly flew over mine). The rest of us don't have the background knowledge to determine which arguments are well supported and well reasoned, and which aren't. So someone like me, who is inclined to put more weight on the expert consensus, will default to thinking you've made the better argument, while those within the ID community who feel the expert consensus is based on faulty presuppositions or some amount of groupthink will feel Richard has the better argument. I'd like to see everyone avoid that, since we are dealing with objective data here and not subjective intuitions. So is there a way to make this more digestible to the non-specialists on this thread? Thanks!
This is why I have asked Richard a single question, the answer to which will explain to me whether or not this discussion is one I need to be following, or whether it's basically irrelevant to what I believe.
I wanted to put a shot across Richard's bow, as it were, that his idea of packing a large number of variants into two individuals just doesn't fly, in terms another biologist would understand. I didn't have time then, nor do I now, to unpack that for everyone else. But I will, later today. I was hoping Richard might give some attempt at a justification, but as it is really off the map, I don't see how that is possible. More later.
If anyone is feeling a little bad because the conversation is going over their head, don't. I have a PhD in biology and much of this is going over my head. This is what I can summarize, though:
Scientists who do this for a living have concluded that it is highly unlikely that the human population was ever as low as 2 (or 8, as related to Noah).
@DennisVenema had some strong words (and possibly too strong) in his book about this observation.
@RichardBuggs and others objected to the strength of the language, contending that it is not impossible to rule out an ancestral population of 2.
@glipsnort (a rather notable scientist in this field) was nice enough to take the time to run some time-consuming simulations that supported exactly the strong consensus of the many experts in this field.
The only thing I would add is that some of the papers I have directed Richard to, such as the Alu paper and a few others, have not, as of yet, been responded to by Richard. These papers also test Richardās hypothesis of a bottleneck to two, and reject it.
OK, linkage disequilibrium - here we go. This will probably steal some thunder from my full treatment later, but here's a sampling.
In a bottleneck to 2 people, four versions of any given chromosome pass through the bottleneck (two in each person). Each chromosome has variants on it in a particular pattern. These four chromosomes are now what are called haplotype blocks - groupings of alleles physically linked together.
After the bottleneck, recombination through chromosome breakage and rejoining ("crossing over") will be necessary to start mixing and matching the four sets into new patterns. The closer together two variants are, the less likely a crossing over event will occur between them.
The overall, average recombination (crossing over) rate in humans is around 1% per generation for every million base pairs. This rate can and does vary somewhat across the genome, but the variants we are discussing here are really, really close together - 1000 base pairs apart or less. Crossing over between two such alleles is thus really rare.
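Putting rough numbers on "really rare", using the ~1% per Mb per generation average quoted above (the generation counts below are chosen purely for illustration):

```python
RATE_PER_BP = 0.01 / 1_000_000  # ~1% crossover per Mb per generation, genome-wide average

def p_at_least_one_crossover(distance_bp, generations):
    """Chance that at least one crossover lands between two variants
    `distance_bp` apart at some point over `generations` generations."""
    r = RATE_PER_BP * distance_bp  # per-generation crossover probability
    return 1 - (1 - r) ** generations

for gens in (100, 1_000, 10_000):
    print(f"{gens:>6} generations: {p_at_least_one_crossover(1_000, gens):.4f}")
```

For two variants 1,000 base pairs apart, the per-generation crossover probability is about 1 in 100,000, so even after 10,000 generations the chance of any crossover between them is still under 10%.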
You can see the region that Richard and I are discussing here if you want to look at the raw data. (You'll have to play around with the default settings to show all the variants. Click on "tracks" and then "variation" to see the alleles.)
If we were to pack a significant number of alleles into two people, such that the alleles would survive the bottleneck, that would place those alleles into four sets of very tightly linked variants, or haplotype blocks. As the population expands exponentially after the bottleneck - required in order to save as much variation as possible in this scenario - the vast, vast majority of offspring would inherit one of the four blocks without any crossing over. This would continue for generations, with only extremely rare crossing over events eventually breaking up the haplotype blocks.
Once those new, rare recombinant offspring arise, their new haplotype blocks will have to drift up to the intermediate frequencies we see in some cases for such blocks. All of the issues facing drift for individual alleles (as Steve has modelled for us) also apply to new blocks.
The net effect is that the resulting population would be heavily biased towards the starting four haplotype blocks, with all the variants packed together, and there would be fewer haplotype blocks that arose through crossing over.
When we look at this region we don't see what a bottleneck to four would predict. We don't see all the variants grouped together into four different haplotype blocks. We see the variants dispersed in different combinations. What we see just doesn't fit a two-person bottleneck model.
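A toy Wright-Fisher sketch can illustrate why the founder blocks would be expected to persist (every parameter here - population size, generation count, founder patterns, seed - is an arbitrary assumption for illustration, not an estimate from Zhao et al.): start from four founder haplotypes spanning a 10 kb region and see what fraction of chromosomes still carry an intact founder block after many generations.

```python
import random

def founder_block_fraction(n_sites=10, pop_size=500, generations=200,
                           r=1e-4, seed=0):
    """Fraction of chromosomes still matching one of four founder
    haplotypes after `generations` of random mating, where `r` is the
    per-generation chance of a crossover anywhere in the region
    (~1% per Mb over 10 kb gives r = 1e-4)."""
    rng = random.Random(seed)
    # four founder haplotypes with arbitrary 0/1 variant patterns
    founders = [tuple(rng.randint(0, 1) for _ in range(n_sites))
                for _ in range(4)]
    pool = [rng.choice(founders) for _ in range(2 * pop_size)]
    for _ in range(generations):
        new_pool = []
        for _ in range(2 * pop_size):
            a, b = rng.choice(pool), rng.choice(pool)
            if rng.random() < r:
                cut = rng.randrange(1, n_sites)  # rare crossover inside the region
                new_pool.append(a[:cut] + b[cut:])
            else:
                new_pool.append(a)  # the common case: an intact parental haplotype
        pool = new_pool
    founder_set = set(founders)
    return sum(h in founder_set for h in pool) / len(pool)

print(founder_block_fraction())  # typically well above 0.9: founder blocks dominate
```

In this sketch the population stays overwhelmingly composed of the four starting blocks, which is the pattern a two-person bottleneck would predict and which, per the argument above, is not what the data show.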
This sort of analysis was done at a massive, genome-wide scale by Tenesa et al, 2007, and they conclude that human population sizes have stayed in the several thousands over the last 200,000 years, as I discuss in the book. This group looks at millions of marker pairs, many of which are much further apart than the ones in this small region of the genome we are discussing here.
No amount of juggling of recombination rates will get these sorts of studies down to 2. To get down to two requires special pleading in the extreme.
What amazes me in all of this is that even with all the time and effort put into this conversation, there hasn't been a single acknowledgement from Richard Buggs that he was wrong, or even, for that matter, an acknowledgement of the evidence that has been presented to him.
He hasn't even issued a thank you for all the time that you and Steve have put into answering this.
I'm hoping he doesn't dig in his heels and prove to be immune to evidence, but I fear he has done just that.
I tend to see things a little differently, as there has been a good mutual exchange of ideas and interpretations. We are then capable of looking at the evidence and making our own judgements regarding their validity. In exchanges like this, we seldom see a situation where positions are changed remarkably, but it does help clarify the issues and provides an opportunity for all involved to express their thoughts to our benefit. My thanks to Dr. Buggs and Dr. Venema for their contributions.
He certainly has. He has thanked them repeatedly. Here's an example.
Here's another example.
Here's another one.
Two more times in this post.
More thanks here.
Here also.
Richard has been very courteous, grateful, and appreciative. The entire discussion has been complex and difficult, with occasional obvious frustration, but the level of decorum has been extremely high.
I've been going through my sofa, trying to pull together the purchase price. I'm up to $3.23, 2 safety pins and three "fun sized" Charleston Chews. I'll trade the pins, the candy, or both (!) for the remaining $56.77.
Hi all,
I have been away over the weekend visiting my parents with my wife and son, and have just got back to this discussion today. I see that @cwhenderson has made some comments designed as a summary of where things are at. For my part I would briefly say this.
I came into this discussion with two questions in my mind:
Does chapter three of Adam and the Genome make a convincing case that humans have never passed through a bottleneck of two?
Is there a better case that can be made?
I came to both of these questions with an open mind.

Regarding question (1), I had very serious concerns about Dennis's chapter, which I expressed to him privately in an email in the spring, and then later made public. I did think that Dennis might be able to defend his chapter well when challenged and come up with missing references. So far, however, I have been disappointed with his defence of his chapter (though grateful that he has made it), and I have growing certainty that even if his conclusion - that there never was such a bottleneck - is correct, it is correct for the wrong reasons.

Regarding question (2), Dennis has cited some papers that were not alluded to in his book, which I still need to read. Given my disappointment in the other citations that Dennis has pointed me to in the past, through his chapter and in this discussion, I am not optimistic that any of these will really support his case when examined closely. But I will read them in case they do prove his point. In addition, Steve Schaffner has given some evidence from allele frequency distributions, arguing that these make a case against a bottleneck of two. This is an intriguing argument, but I have questions that have not been fully dealt with regarding the role in the model of alleles that derive from before the bottleneck, and also the effect of population structure. I also note that because Steve has presented this analysis, which is not in the peer-reviewed literature, it seems probable that he does not find what is already in the peer-reviewed literature as convincing as Dennis does. I am also getting less optimistic that there is a strong case to be made against a bottleneck of two because I have not had any responses to my blog at Nature Eco Evo - which was aimed at research scientists and has been read many times - saying that I am clearly wrong and have missed a crucial piece of evidence that shows that a bottleneck of two is effectively disproven.
I maintain an open mind on this wider issue however.
The reason I stopped asking @glipsnort questions is that he was taking some time out over Thanksgiving, and he has not yet responded to the questions I posed just after he left the discussion.[quote="DennisVenema, post:119, topic:37039"]
The only thing I would add is that some of the papers I have directed Richard to, such as the Alu paper and a few others, have not, as of yet, been responded to by Richard. These papers also test Richardās hypothesis of a bottleneck to two, and reject it.
[/quote]
I will come on to those papers in due course, Dennis, as we have not yet completed our discussion of Zhao et al 2000, which is the first one of the batch of papers that you directed me to that I have addressed. As we have discussed, the parts of this paper that you highlighted did not test a hypothesis of a bottleneck of two and do not support your certainty against this hypothesis. I think you have agreed with that critique, and we are now in the midst of a discussion as to whether or not the final coalescent analysis of the paper is persuasive evidence against a population bottleneck of two.
I have just had a read through the comments that you made about this since I have been away for the weekend. I am glad to see that you have backed away from your claim that 75 variants could not make it through a bottleneck of two, and that you are now talking about how they are arranged in haplotypes over the 10Kb region. That is much more appropriate and to the point. However, you present no analysis of the haplotype structure of the region we are discussing, nor do the authors Zhao et al write about this in any detail. You are quite out on a limb here, making claims about the data that the authors do not make.
Please could you present some analysis of their data and show that there are not four major haplotypes for the higher frequency variants that could have come through a bottleneck? As you know, linkage disequilibrium in the human genome means that very few of the haplotypes that are possible from existing SNVs are present in human populations.
Please note that I have already commented on your use of the Tenesa et al. paper in my Nature Eco Evo blog, and I am looking forward to your response to my critique.
I would also note, Dennis, that as well as the outstanding task you have given me of reading through four more papers (which I plan to fulfil), I have also suggested a task for you that I believe remains outstanding:[quote="RichardBuggs, post:108, topic:37039"]
to explain in your own words (without help from Steve) what Steve was modelling, and how it shows that a bottleneck of two is unlikely to have happened.
[/quote]
I am sure that many of the readers following this discussion would welcome such a summary.
I'll let @glipsnort comment on his confidence re: the data as we have it, but don't forget he's also waiting for evidence of a turtle rather than a duck...
I'm also hoping that eventually you'll take this on yourself too - you're a biologist, after all, so it should be possible for you to work on this for yourself. The 1000 Genomes data set would be the logical place to start. How about a model that proposes and explains the linkage disequilibrium data coming from just 2 people? I think if you started working on that angle you would quickly see the problems. It's not without reason that both Steve and I have been asking you to do some modelling for yourself.
Eventually I hope to have a full reply to you done. I've given Part 2 to Brad and Jim, and I'm waiting for their feedback. Pending that I'll draft Part 3. I'm also unusually busy this week with a number of events, so don't hold your breath...
I also don't know why you keep saying things like I don't allude to the papers I've cited to you in the book. Do you really think I would write a book for a popular audience and just make a guess at what the data say? Or is it more likely that I read the evidence in some depth before writing the book?
I'm also surprised that you haven't already read those papers. If I were in your shoes, I would have familiarized myself with the field as a whole before mounting a public critique.