Thanks for your kind words, Ted. I’ve learned a lot from you over the past several years - I’m glad that I’ve been able to return the favour.
Thanks for this Dennis! The work you’re doing here is timely and invaluable. I’m glad to see you thoroughly engaging Dr. Buggs’ points, and I also appreciate the distinction between the matter of Adam’s existence and of being sole-progenitor. Many of us have, at one time or another, believed and/or put forward the idea that scripture says Adam is the sole-progenitor. I have the impression that scripture does not actually make that claim. (although, I think the meaning of Eve being the “mother of all the living” deserves some explanation…maybe that’s been addressed somewhere else on this site.) Maybe we would have double-checked our exegesis sooner if we had thought sooner about the sort of science being highlighted in this article. These implications from population genetics might not be obvious to many in my generation, but I imagine they might be obvious to many in the next generation…a generation that grows up in a world of CRISPR innovations and consumer genetic analysis like 23andMe. I really would hate to set them up for an unnecessary science vs faith crisis. For that reason, I’m especially thankful for your work.
I’m currently still leaning toward affirming, not only that Adam was a real person, but more specifically the “de novo” creation of him. That seems like it’s compatible with what you’ve said, assuming Adam and Eve’s children intermarried into the existing hominid population.
Quick question. There’s something I’m trying to wrap my head around and I’m not quite succeeding. Richard Buggs is a biologist who seems very well steeped in genetics. He has a number of publications in plant genetics in respected journals, including one in Nature (albeit not as a lead researcher). Yet he seems to make a rookie mistake in confusing heterozygosity with genetic diversity. Which, having read your response seems exactly the case. How does this happen? How does one progress through years of study and research only to fundamentally misunderstand a basic scientific tenet in their specialized field? I’m at a loss, so hoping you can connect some dots for me here. Thanks!
Hi Dennis, Thank you for beginning to reply to my concerns. I very much look forward to continuing this discussion now that you have so graciously replied to my email. As you know, I blogged about this issue, reviewing chapter three of your book on 28th October at the Nature Ecology and Evolution Community. This allowed me to tackle the issue in more depth than I did in my email to you in May. This blog has already dealt with some of the issues you mention above. I have also responded to some comments on my blog at the Skeptical Zone here which provides further information. When I have more time I will post a longer response to your blog here on Biologos, but meanwhile, would refer you and your readers to the two links above. Best wishes, Richard
Thanks, Richard - and welcome to BioLogos! I very much appreciate your patience in waiting for this reply.
I have also published this text here in order to provide a stable and easily findable record: Response to Dennis Venema’s Blog “Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)” – Richard Buggs
I am glad that we are now establishing a dialogue about the scientific credibility of a bottleneck of two at some point in the history of the human lineage. I am hoping that during the course of this discussion we will be able to examine in detail the claims that you make in chapter three of Adam and the Genome, and that you will respond to all the critiques and questions that I have raised in my email to you and my blog at Nature Ecology and Evolution Community.
This Part I of your response is helpful in that it clears up some areas of potential misunderstanding between us, and points me to two arguments that are not made explicitly in your book chapter. I trust I can look forward to the subsequent Parts for your responses to the majority of the issues I have raised.
I will work through your blog in this comment, seeking to be as constructive as possible in my reading of it.
Scientific Confidence vs. Scientific Certainty
I am happy to take your point that you do not believe that science has DISPROVEN that a bottleneck of two individuals could have happened in the human lineage. Your position is that you are as certain that it has not happened as you are certain that the earth rotates around the sun. I am sorry if I mischaracterised your position as being more certain than it actually is.
In your blog you say: “I do not claim this [heliocentric level of] certainty for the oft-cited ~10,000 figure, as Buggs seems to imply”.
I am happy to take this point, but I should explain why I got the impression from your book chapter that you hold pretty strongly to the 10,000 figure. In your book chapter you argue that multiple independent methods converge on a figure of 10,000, and even predict that one method that gives a lower figure is likely to be revised upwards. Here are the relevant quotations from your chapter:
“it is worth at least sketching out a few of the methods geneticists use that support the conclusion that we descend from a population that has never dipped below about 10,000 individuals.”
Then you mention evidence from allelic diversity, and state:
“these methods indicate an ancestral population size for humans right around that 10,000 figure.”
Then you present an argument from linkage disequilibrium, and state:
“The results indicate that we come from an ancestral population of about 10,000 individuals— the same result we obtained when using allele diversity alone.”
Then you say more about linkage disequilibrium and state:
“The researchers found that, during this period, humans living in sub-Saharan Africa maintained a minimum population of about 7,000 individuals, and that the ancestors of all other humans maintained a minimum population of about 3,000—once again, adding up to the same value other methods arrive at.”
Then you describe the PSMC method and state:
“Taken together, this is in good agreement with previous, less powerful methods, with a combined minimum size of around 6,900 individuals. These numbers may shift upward, however, as we sequence more and more individuals from both groups.”
This is why I got the impression that you attached a high degree of certainty to the 10,000 figure. This impression came across especially strongly in the last two statements above, when you were (I think incorrectly, as I argue in my blog) adding up numbers from two populations to come to a 10,000 figure, and then suggesting that the PSMC method’s result might be revised upwards (towards 10,000, presumably). I think that if you re-read your chapter yourself you will agree that the 10,000 figure comes across quite strongly, and sounds to the reader like a very precise measurement of past human population size.
However, I am willing to take your point that you do not attach such a high level of certainty to the 10,000 figure as you attach to there never having been a bottleneck of two. That seems a reasonable position to hold.
Heterozygosity and population bottlenecks
The majority of your blog is taken up with the topic of genetic diversity. I think that we are largely in agreement here. I am glad that you agree with the points I made about the amount of heterozygosity that can be carried through a short, sharp bottleneck. I do not dispute that allelic diversity can provide stronger evidence for a past bottleneck than heterozygosity can. In my blog I stated this clearly: “A sharp bottleneck will affect allelic richness more than heterozygosity”. I am grateful that you have helped out non-scientists who are seeking to follow our debate by giving a simple “Genetics 101” explanation of why this is so in your blog.
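As a rough illustration of why a short, sharp bottleneck can carry much of the heterozygosity through, here is a minimal Python sketch of the standard textbook expectation. It assumes an idealised population (random mating, neutrality, no new mutation), in which expected heterozygosity falls by a factor of 1 - 1/(2N) per generation at a size of N diploid individuals. The function name and parameters are illustrative and are not taken from either blog or from the book.

```python
def heterozygosity_retained(n_individuals: int, generations: int = 1) -> float:
    """Expected fraction of heterozygosity left after `generations` generations
    at a constant size of `n_individuals` diploid individuals.

    Assumption: idealised Wright-Fisher population, no mutation, no selection.
    """
    if n_individuals < 1 or generations < 0:
        raise ValueError("need n_individuals >= 1 and generations >= 0")
    return (1.0 - 1.0 / (2.0 * n_individuals)) ** generations


# One generation at a size of two individuals
print(heterozygosity_retained(2, 1))  # 0.75
```

Under these assumptions a single generation at a size of two keeps three quarters of the expected heterozygosity, while a prolonged bottleneck (a larger `generations` value) erodes it much further.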
Although we are in agreement about the relative merits of heterozygosity and allelic diversity in detecting bottlenecks, misunderstanding between us has arisen for two reasons: (1) ambiguous usage of the term “genetic variability” in your book chapter, and (2) the choice of Tasmanian Devils in your book chapter as an example of the consequences of a population bottleneck. I will explain both of these below.
(1) I commented on heterozygosity in my email and blog because in your book chapter you refer many times to “genetic variability”. As you know, in scientific population genetics literature the term “genetic variability” does not refer only to allelic diversity. Genetic variability of populations is measured in many ways: heterozygosity, allelic diversity, private allele frequency, gene diversity, fixation indices, inbreeding coefficients etc. I did not realise that when you use the term in your chapter you intend only to refer to allelic diversity. That is not the way the term is normally used in the field. I therefore assumed that you were also referring to heterozygosity. It is a pity that this ambiguity was present, but I understand that it is hard to write about science at a popular level without the occasional ambiguity slipping in that a specialist will stumble on.
(2) I also got the impression you are including heterozygosity within your definition of genetic variability because of your choice of Tasmanian Devils as an exemplar of a species that has undergone a bottleneck. This exemplar takes up quite a large proportion of the early part of your chapter. It is well known that Tasmanian devils have low heterozygosity as well as low allelic diversity - they have much lower levels of heterozygosity than humans (see this paper). The low heterozygosity within Tasmanian Devils appears to be partly responsible for the low fitness of their populations, likely due to several prolonged bottlenecks. In fact, you say of the Tasmanian Devils: “most of them have exactly the same alleles with only rare differences.” That sounded to me, as I read the chapter, to be a popular-science-level statement that they have low heterozygosity.
For these reasons, I thought you were including heterozygosity in your chapter as one aspect of genetic variability. However I am willing to take your point that you were not, now that you have clearly stated this. I am happy to put this down to a communication issue. I misread your chapter, and did not realise you mean “allelic diversity” whenever you say “genetic variability”. I did not realise that when you bring up the example of Tasmanian Devils you are leaving to one side the issue of their low heterozygosity.
Now that we have cleared up this point, I think we can leave the issue of heterozygosity behind us, as we seem to be in full agreement about it.
Allelic diversity and bottlenecks
Now to look in more detail at the points you raise about allelic diversity. This is where I think your argument is strongest, so I would like to examine it in some detail. To do this full justice, I want to start with what you say about this in your book chapter. One of your most explicit statements about this in your book chapter is as follows:
…scientists have many other methods at their disposal to measure just how large our population has been over time. One simple way is to select a few genes and measure how many alleles of that gene are present in present-day humans. Now that the Human Genome Project has been completed and we have sequenced the DNA of thousands of humans, this sort of study can be done simply using a computer. Taking into account the human mutation rate, and the mathematical probability of new mutations spreading in a population or being lost, these methods indicate an ancestral population size for humans right around that 10,000 figure. In fact, to generate the number of alleles we see in the present day from a starting point of just two individuals, one would have to postulate mutation rates far in excess of what we observe for any animal.
As I note in my blog, you give no citation to the scientific literature to back up this point, so it is hard for me to interact with you on it. I would invite you again to make such a citation so that we can discuss this point further.
In your recent blog you have now made a similar claim, and given more detail:
So, a bottleneck to two individuals would leave an enduring mark on our genomes – and one part of that mark would be a severe reduction in the number of alleles we have - down to a maximum of four alleles at any given gene. Humans, however, have a large number of alleles for many genes – famously, there are hundreds of alleles for some genes involved in immune system function. These alleles take time to generate, because the mutation rate in humans is very low. This high allele diversity is thus the first indication that we did not pass through a severe population bottleneck, but rather a relatively mild one (estimated, as we have discussed, at about 10,000 individuals by current methods).
Would I be correct in assuming that this statement in your blog is intended to illuminate the passage I quoted above from your book chapter? If so, this is helpful as you give a link in the blog to an online primer about Human leukocyte antigen (HLA) genes, suggesting that your argument relates to these genes. But the online primer has nothing in it about models of past human bottlenecks. I would invite you to make a more explicit argument on this point, as I think this is the strongest argument that is available to you against a bottleneck of two. As you mention HLA genes in your blog, it sounds to me as if your argument may rest on Ayala et al (1994) but this paper was published before the human genome project, so I assume you must have a more up to date source that you drew on for your book chapter. Please could you let me know what it is so that I can follow up your argument?
I realise that some of your non-biologist readers may think I am being rather pedantic in asking for a citation when you are making what appears to be a very straightforward case from allele numbers. But biologist readers will know that very few things in this area are straightforward, and without a citation I have to treat your claims as unsubstantiated. For example, if your argument is from HLA genes, I have already mentioned these briefly in my blog, and why their rapid rates of evolution may prevent them from making a strong argument against a bottleneck:
Hyper-variable loci like MHC genes [of which HLA genes are a type] or microsatellites have so many alleles that they seem to defy the idea of a single couple bottleneck until we consider that they have very rapid rates of evolution, and could have evolved very many alleles since a bottleneck.
Also, as I wrote in a comment on the Skeptical Zone:
MHC loci are pretty exotic. Several studies show that they evolve fast and may be under sexual selection, pathogen-mediated selection, and frequency-dependent selection; they may also have heterozygote advantage (see e.g. http://rspb.royalsocietypublishing.org/content/277/1684/979). The maintenance of MHC polymorphism is still “an evolutionary puzzle” (https://www.nature.com/articles/ncomms1632). There is some evidence for convergent evolution of HLA genes (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1918223/, http://onlinelibrary.wiley.com/doi/10.1111/j.1600-065X.1999.tb01381.x/full, https://link.springer.com/article/10.1007/BF00189233, https://link.springer.com/article/10.1007%2Fs002510050028). If the whole case for large human ancestral population sizes rests on MHC loci, I think this is inadequate to prove the point, given our current state of knowledge on MHC evolution.
I look forward to hearing more from you on this topic in future blog posts.
Finally in your blog you make an argument from frequencies of rare alleles. This is an argument that is not mentioned in your book chapter, as far as I am aware. You state in your blog:
Another effect that a bottleneck to two individuals would produce is that there would be no rare alleles after the bottleneck. All alleles would have a frequency of at least 25%. As the population expanded after such an event, those alleles would stay common, and only new mutations would produce less common alleles. What we observe in humans in the present day is that many alleles are rare - even exceedingly rare. The distribution of alleles in present-day humans looks like it comes from an old, large population - not one that passed through an extreme bottleneck within the last few hundred thousand years, which is when our species is found in the fossil record. Thus the observation that we have many alleles of certain genes and the distribution of allele frequencies both support the hypothesis that humans come from a population, rather than a pair.
I agree with everything you are saying, up until the full stop after “exceedingly rare”. That is my understanding of the patterns of human genetic diversity also. However, beyond this point I need you to give a citation to the scientific literature to support your claims that the distribution of alleles in humans is inconsistent with “an extreme bottleneck within the last few hundred thousand years”. This is an interesting claim and one I would like to follow up, but without a citation this is an unsubstantiated assertion. I think I may have partly anticipated this argument in my blog when I wrote: “We need to bear in mind that explosive population growth in humans has allowed many new mutations to rapidly accumulate in human populations (A. Keinan and A. G. Clark (2012) Science 336: 740-743).”
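The arithmetic behind the “maximum of four alleles” and “at least 25%” figures in the quoted passages above can be sketched in a few lines of Python. This is only an illustration: the founding genotypes and allele labels below are hypothetical, chosen to give the most diverse pair possible at a single gene.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Frequency of each allele among diploid genotypes (pairs of allele labels)."""
    copies = [allele for genotype in genotypes for allele in genotype]
    counts = Counter(copies)
    return {allele: n / len(copies) for allele, n in counts.items()}


# Hypothetical founding pair, each heterozygous, sharing no alleles
founding_pair = [("A1", "A2"), ("A3", "A4")]
freqs = allele_frequencies(founding_pair)
print(len(freqs), min(freqs.values()))  # 4 0.25
```

Two diploid individuals carry four gene copies in total, so no allele can start out below a frequency of one in four. The sketch covers only that starting point; the disagreement in this thread is about what happens to the distribution afterwards.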
I am grateful to you for beginning to respond to the objections I have raised to your book chapter in my email and Nature Eco Evo blog. I am glad we have cleared up the issue of heterozygosity and appear to be in agreement about it. I invite you to make clear citations to the scientific literature to back up several key points that you make in your book chapter and in this current blog. I note that you have not yet addressed the majority of my criticisms of your book chapter. I look forward to your responses to the objections I have raised to your use of: (1) the example of the Tasmanian Devils, (2) PSMC analysis, (3) the linkage disequilibrium study by Tenesa and colleagues, and (4) incomplete lineage sorting.
Thanks for the reply. There’s more there than I can quickly respond to, and I’ll be busy for the next few days. We might end up talking past each other for a bit, as I am still working on the second part of my reply.
Some of the citations you’re looking for are just working familiarity with published data sets. Perhaps @glipsnort could also weigh in - he has discussed the allele frequency distribution here previously (and with nice graphs). The human allele frequency distribution as a whole is one very good piece of evidence that we do not come from just two people in the last few hundred thousand years. You could try messing about with a starting pair and mutation frequencies and see for yourself the challenge of generating the distribution we observe.
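For readers who want to try the kind of experiment suggested above, here is a minimal sketch in Python of a neutral, single-locus Wright-Fisher simulation that starts from a pair. It is a toy under stated assumptions, not the analysis anyone in this thread has performed: the function name, the growth rate, the population cap and the per-locus mutation rate are all illustrative values, and real analyses use genome-wide frequency data and realistic demography.

```python
import random
from collections import Counter


def simulate_from_pair(generations, growth=1.1, max_copies=2000,
                       mutation_rate=1e-4, seed=0):
    """Toy neutral Wright-Fisher model at one locus, starting from two people.

    Starts with 4 gene copies (two diploid founders, all alleles distinct),
    grows the number of gene copies geometrically up to `max_copies`, and
    turns each copy into a new, unique allele with probability
    `mutation_rate` per generation (infinite-alleles assumption).
    All default values are illustrative, not estimates from the literature.
    """
    rng = random.Random(seed)
    pool = [0, 1, 2, 3]          # allele labels carried by the founders
    next_label = 4
    target = 4.0                 # real-valued target size, so slow growth is not lost to rounding
    for _ in range(generations):
        target = min(float(max_copies), target * growth)
        size = int(target)
        # Each gene copy in the next generation copies a random parent copy.
        pool = rng.choices(pool, k=size)
        for i in range(size):
            if rng.random() < mutation_rate:
                pool[i] = next_label
                next_label += 1
    counts = Counter(pool)
    return {allele: n / len(pool) for allele, n in counts.items()}


freqs = simulate_from_pair(generations=200)
rare = sum(1 for f in freqs.values() if f < 0.05)
print(len(freqs), "alleles;", rare, "below 5% frequency")
```

Varying `mutation_rate`, `growth` and `generations` gives a feel for how the simulated frequency distribution responds, though any comparison with real human data would need a far more careful model than this one.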
Also, keep in mind you’re asking about a bottleneck to two - not 2,000, not 200, not 20 … you get the picture. Moreover, you’re asking for a census size of two, not just Ne = 2. It’s a long way down from thousands to 2.
More anon - thanks as always for your patience.
Thanks for your brief reply. I look forward to Part II of your response.
It would be great if @glipsnort were to join in the discussion - I would like to see more detail of the argument from allele frequency distributions.
“Some of the citations you're looking for are just working familiarity with published data sets.”
I think I must check at once that I am not misunderstanding or reading too much into your statement here. Do I understand you to be saying that you will not be giving me citations to the peer-reviewed literature to back up certain of the claims in Adam and the Genome that I am querying? If so, I have to reassess somewhat my expectations for our discussion.
If you really are saying this, does it apply to this statement in chapter 3?
“…scientists have many other methods at their disposal to measure just how large our population has been over time. One simple way is to select a few genes and measure how many alleles of that gene are present in present-day humans. Now that the Human Genome Project has been completed and we have sequenced the DNA of thousands of humans, this sort of study can be done simply using a computer. Taking into account the human mutation rate, and the mathematical probability of new mutations spreading in a population or being lost, these methods indicate an ancestral population size for humans right around that 10,000 figure. In fact, to generate the number of alleles we see in the present day from a starting point of just two individuals, one would have to postulate mutation rates far in excess of what we observe for any animal.”
This is a key passage in the chapter, and, as I have said before, I am very keen to read the details of the work.
No, I’m not saying that. I am saying this is my understanding of the published literature and the relevant publicly available databases. Li and Durbin would be one paper relevant here; moreover the 1,000 genomes consortium papers, papers that estimate the present-day human mutation rate, and so on. For example:
Just an observation as a fly on the wall, but your style of communication here may be more offputting than you intend. I think your questions are fair and I share your desire in seeing your exchange with Dennis be as productive as we all hope. But you also seem to be issuing quite a lot of demands on exactly how this exchange needs to happen. As far as I know, it’s customary to make your argument, allow your conversational partner to make theirs in rebuttal, and continue as such as the issues are worked through. You seem to be wanting to direct exactly how Dennis needs to proceed as well as pressuring upon him a sense of urgency that at best seems unnecessary and at worst a little churlish or rude. Maybe this is just my own take and it’s not shared by Dennis or others reading this thread, but for what it’s worth…
That said, I’m looking forward to your continued exchange.
Asking for references and clarification is about as standard and civilised as it gets in the scientific literature - hardly churlish and it is odd to think it rude.
Simply asking for references or clarifications is not what I’m referring to.
I think it is easy to read unintended tone into other people’s words. As a moderator, I think both Dr. Venema and Dr. Buggs are modeling the graciousness we aspire to here. One person’s clarity and forthrightness is another person’s demands. Let’s not derail the conversation by expecting anyone to defend how their “tone” should have been read.
Thank you for such a quick response to my query, and thank you for the citation to Li and Durbin and to the 1000 genomes project.
“I am saying this is my understanding of the published literature and the relevant publicly available databases.”
I had assumed that was the case as this is what one expects from a scientist. This is of course, why, as a fellow scientist, I am asking you - I hope courteously and professionally - to point me to the exact papers in the published literature, and to actual analyses of the public databases that support the claims you are making in Adam and the Genome.
“Li and Durbin would be one paper relevant here”
Thank you. As you know, this is the paper that presents the PSMC method. In my email to you and my blog I have explained why I do not think that the PSMC method is able to detect a short sharp population bottleneck. I assume that you are going to respond to my comments on PSMC in Part II of your response, so I will not press you further on this issue now.
“moreover the 1,000 genomes consortium papers, papers that estimate the present-day human mutation rate, and so on. For example A global reference for human genetic variation”
I can see how the 1,000 genomes project can provide the raw data for an analysis such as the one I am asking you for clarification on - the one that you mention in the passage from your book that I quoted in my previous post (above).
However, as far as I can see, the 1,000 genomes paper does not do the calculations that you report in that passage. Unless I am missing something, the authors do not report a calculation of ancestral population sizes from the number of alleles found in present day populations. They do present several PSMC analyses (which are based on runs of heterozygosity within genomes) but they do not seem to present the calculation that you mention in the passage I quoted from Adam and the Genome. Is there another paper in which they conduct the calculations that you are telling your readers about? As I say, I am very keen to know what genes were used in these calculations and how they generated an ancestral population size of 10,000.
What I’m talking about there is a summary of the field as a whole - and the PSMC analyses in the 1,000 genomes paper are certainly one of the relevant experiments. So are LD studies. So are the Li and Durbin PSMC results. All are based on examining present-day alleles in present-day populations (of course - it couldn’t be otherwise unless we’re talking paleogenomics). Some of these analyses use a forward mutation rate (which is also the rate of fixation for neutral alleles, and most variation is neutral). I know you think PSMC studies could miss a bottleneck to two. I disagree, but you’re going to have to wait until I have time to write it up and explain why I think you’re mistaken. Part II might include PSMC, or it might not. Part of my goal here is to not just explain my reasoning to you as a biologist, but to make it accessible to a non-specialist audience. That takes more time.
(For now I just want to comment on this – I should have more to add when I’ve run some simulations (which may take a few days).)
The Keinan and Clark paper is not relevant to the question at hand. The new mutations they describe are indeed rare: 80% of them have frequency < 0.05%. There is no question that large numbers of very rare variants can accumulate in a large, young population. It is the alleles at moderate frequency – roughly 5% to 20% minor allele frequency – that are difficult to explain with a recent bottleneck. As the authors point out in that paper, 92% of neutral alleles at a frequency of 5% are expected to be older than 10,000 years; that is not the historical period they discuss.
Looking forward to what you have to contribute, Steve.
I am very happy to wait for your comments on the PSMC method and why you believe that it would detect a sudden sharp bottleneck of two. Please don’t feel under any pressure; I appreciate your attempts to make all this accessible to non-specialist audiences. That is not always an easy task.
Regarding the passage from chapter 3 of Adam and the Genome that I am asking you for citations to support. You have responded in your comment above:
“What I'm talking about there is a summary of the field as a whole - and the PSMC analyses in the 1,000 genomes paper are certainly one of the relevant experiments. So are LD studies. So are the Li and Durbin PSMC results.”
I am sorry, I am struggling to follow you here. I'm afraid I can't see how that passage is a summary of the field as a whole, and therefore I don't understand how citations of the PSMC and LD studies support it.
Here is the passage that we are discussing in its context in Adam and the Genome: I have placed it in italics, and also added some emphases in bold.
...given the importance of this question for many Christians— and the strong insistence of many apologists that the science is completely wrong— it is worth at least sketching out a few of the methods geneticists use that support the conclusion that we descend from a population that has never dipped below about 10,000 individuals. While the story of the beleaguered Tasmanian devil provides a nice way to “see” the sort of thing we would expect if in fact the human race began with just two individuals, scientists have many other methods at their disposal to measure just how large our population has been over time. One simple way is to select a few genes and measure how many alleles of that gene are present in present-day humans. Now that the Human Genome Project has been completed and we have sequenced the DNA of thousands of humans, this sort of study can be done simply using a computer. Taking into account the human mutation rate, and the mathematical probability of new mutations spreading in a population or being lost, these methods indicate an ancestral population size for humans right around that 10,000 figure. In fact, to generate the number of alleles we see in the present day from a starting point of just two individuals, one would have to postulate mutation rates far in excess of what we observe for any animal. Ah, you might say, these studies require an estimate of mutation frequencies from the distant past. What if the mutation frequency once was much higher than it is now? Couldn’t that explain the data we see now and still preserve an original founding couple? Aside from the problems this sort of mutation rate would present to any species, we have other ways of measuring ancestral population sizes that do not depend on mutation frequency. These methods thus provide an independent way to check our results using allele diversity alone. 
Let’s tackle one of these methods next: estimating ancestral population sizes using something known as “linkage disequilibrium.”
Then, after describing the LD study, you write:
“The results indicate that we come from an ancestral population of about 10,000 individuals— the same result we obtained when using allele diversity alone.”
A little later you write:
A more recent and sophisticated model that uses a similar approach but also incorporates mutation frequency has recently been published. This paper was significant because the model allows for determining ancestral population sizes over time using the genome of only one individual. [You then describe the PSMC method.]
I am therefore struggling to understand how the passage we are discussing - the one in italics above - could be a “summary of the field as a whole” including linkage disequilibrium and PSMC methods. It seems to just be about the allele frequency method. You clearly distinguish the allele frequency method from the other methods. You say that the linkage disequilibrium method is “an independent way to check our results using allele diversity alone.” You say it gives “the same result we obtained when using allele diversity alone”. You describe the PSMC methods as “A more recent and sophisticated model”.
I am sorry that I am spending so long on this point - this really is not where I had expected our discussion to go. I thought I was making a very straightforward request when I asked for a citation for the calculations in this passage. I am still hoping that you may be able to, now I have reminded you of the context of the passage. I appreciate that it may be a while since you re-read the chapter for yourself, and your recollection of what you wrote could be different from the text of the book. I know that I am sometimes surprised when I re-read something that I wrote myself after several months away from it.
Allele-based methods: 1000 genomes (including their PSMC), and understanding allele frequency distribution and mutation frequency/fixation
LD: independent of mutation frequency
"recent and sophisticated" = PSMC on single individuals (a specific case of allele methods that is somewhat distinct from the prior PSMC work)
So you’re right - it’s a summary of allele methods, including PSMC, interspersed with the discussion on LD, and then back to a special case of an allele method with the use of PSMC on single genomes. That summary doesn’t include LD. I haven’t read over that section in some time. Hopefully that clears it up.
Another thing to keep in mind is that the vast majority of scientists are not at all interested in (or likely aware of) what evangelical Christians want to “see” from their data. It wouldn’t even cross the mind of a group to publish a paper that specifically tackles the question of all humans descending uniquely from just two people. This wouldn’t even be on their radar because none of the evidence we have accumulated in the last 30+ years even remotely suggests it.
So, you’re not going to see that specifically addressed in the literature. What it takes is people who are tuned to those questions who can interpret the literature in light of those issues.