Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)

Because, as I point out to George above, if the authors can exclude a bottleneck for non-Africans, they can exclude one all the more for Africans, who have more diversity. [quote=“RichardBuggs, post:86, topic:37039”]

The authors’ estimates of human effective population size are between 8,100 and 18,800. These estimates are based on present-day numbers of segregating sites in the sample sequences, and estimates of mutation rate. This method assumes a fairly constant population size over time. Thus, although they are estimating present effective population size, they are happy to extrapolate this into the past, and call their estimates “long-term effective population size”. This phrase is essentially an expression of their assumption, which is necessary for their method, that population size has remained fairly constant. They say in their discussion: “The lowest value (8,100) suggests that the long-term effective population size of humans is unlikely to be lower than 5,000” but this 5,000 figure is not supported by a calculation: it seems to be a figure chosen for being a round number. “Long-term” is not defined in terms of number of years. No historical reconstruction of effective population size at different time-points in history is given. I struggle to see this paper as evidence that there was never a short sharp bottleneck in human history.
[/quote]

Have a look at Table 5, which shows their data for the distribution of TMRCA values. This is the data and analysis they are basing their conclusions on. Bottlenecks increase the probability of coalescence (this is also how PSMC methods work). We see a distribution of TMRCA values for the alleles in the study. This is basically what a PSMC analysis does sequentially across an entire genome to get a much larger sample size.
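For intuition on why bottlenecks increase the probability of coalescence, here is a minimal back-of-envelope sketch (my own illustration, not the paper’s analysis). In a diploid Wright-Fisher population, two lineages coalesce in any given generation with probability 1/(2N), so even a brief crash in N forces most lineages to coalesce during the bottleneck:

```python
# Probability that two lineages have coalesced within a stretch of generations,
# comparing a constant population with one that has a brief, tight bottleneck.
# Per generation, P(coalescence) = 1/(2N) in a diploid Wright-Fisher population.

def p_coalesced(sizes):
    """sizes: diploid population size in each generation."""
    p_not = 1.0
    for n in sizes:
        p_not *= 1.0 - 1.0 / (2 * n)
    return 1.0 - p_not

constant = [10_000] * 1000                                # 1,000 generations at N = 10,000
bottleneck = [10_000] * 495 + [2] * 10 + [10_000] * 495   # brief crash to N = 2

print(p_coalesced(constant))    # ~0.05: coalescence is slow in a large population
print(p_coalesced(bottleneck))  # ~0.95: ten generations at N = 2 force most coalescences
```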

1 Like

I disagree. The methods used are capable of detecting bottlenecks - that’s why they are used.

1 Like

A couple of comments, since I don’t want to go very far down this particular rabbit hole. First, there is no reason to be calculating a summary statistic like Tajima’s D here when we’re looking at the actual distribution of allele frequencies; the distribution contains more information. Second, Wakeley’s paper was looking at a very different scenario than you’re proposing. He considered a structured population that had reached a stationary state. You are proposing a population that is very far from stationary, in which allele frequencies are recovering from a tight bottleneck.

Very few alleles will be fixed even in a (reasonably sized) subpopulation in this timeframe. The dominant effect will be to stall the drifting alleles at a frequency limited by the deme size. This is easy to simulate, though, so I did. I simulated a constant-sized population (size = 10,000) for 8000 generations (200,000 years), and then simulated a subpopulation of size 1000 for the same period. To compare the two, I scaled down the frequency bins for the subpop by 10 and scaled up the counts by 100 (10x for ten subpopulations in the big population, another 10x since each frequency bin is now 1/10th the width of my usual bin). Here is the result:


It’s the same number of variants (slightly fewer, actually), but bunched at lower frequencies. Whatever subpopulation size you pick, the result will be to shift variants to lower frequencies, worsening the fit with real data.
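For anyone who wants to experiment, a stripped-down version of this kind of simulation might look like the following (my sketch with arbitrary mutation input, not Steve’s actual code). Variants that fix within the deme would stall at 10% metapopulation frequency; for simplicity they are dropped here along with lost variants:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_diploid, gens, muts_per_gen):
    """Neutral Wright-Fisher drift; returns frequencies of variants still segregating."""
    two_n = 2 * n_diploid
    counts = np.empty(0, dtype=np.int64)
    for _ in range(gens):
        counts = rng.binomial(two_n, counts / two_n)      # one generation of binomial drift
        counts = counts[(counts > 0) & (counts < two_n)]  # drop lost and fixed variants
        counts = np.concatenate([counts, np.ones(muts_per_gen, dtype=np.int64)])  # new mutations
    return counts / two_n

# Full population vs. a deme a tenth its size; mutational input scales with size.
full = simulate(10_000, 8_000, muts_per_gen=30)
deme = simulate(1_000, 8_000, muts_per_gen=3)

# The rescaling described above: an allele fixed in one of ten demes sits at 10%
# in the metapopulation, so deme frequencies are divided by 10 and the counts
# multiplied by 100 (10x for ten demes, 10x for the narrower bins).
full_hist, edges = np.histogram(full, bins=50, range=(0.0, 1.0))
deme_hist, _ = np.histogram(deme / 10, bins=50, range=(0.0, 1.0))
deme_hist *= 100
print(full_hist[:10])
print(deme_hist[:10])
```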

I very much doubt you could get both enough variants and enough drift in 250,000 years. As I said above when I showed the 500,000 year scenario, multiple bottlenecks on that timescale might do the trick.[quote=“RichardBuggs, post:79, topic:37039”]
I was wondering why you could not include as much variation from the pre-bottleneck population as was needed to cause the simulations to fit the data in the 20-50% range? Why only focus on making them match at the 60-70% frequency range? I didn’t quite follow the logic here, with your brief description.
[/quote]
I chose 60-70% because there is very little contribution from post-bottleneck mutation there, so the comparison with data is clean; everything in that range comes from the distribution I needed to scale.

2 Likes

It would decrease the mutation rate per generation. Most mutations occur in men, whose germline cells replicate many more times than women’s, and the mutation rate per generation increases with paternal age.[quote=“Marty, post:83, topic:37039”]
Secondly, if the mutation rate stayed “the same” and the generation time reduced, would that pull the X axis in proportionally? That is, would a 40% reduction in generation times produce a 40% reduction in the X axis?
[/quote]
Just shortening the generation time would scale the x axis, yes. The simulations are done in terms of generations, and the years are just calculated at the end by multiplying.

A short generation time is implausible, though. In modern hunter-gatherer societies, puberty occurs around 15 and first birth about 2 years later, with late weaning and a relatively long time between births. This is the data that makes people think the typical generation time for our recent ancestors was 28 or 29 years. In chimpanzees, the mean generation time is ~25 years. [quote=“Marty, post:83, topic:37039”]
George - I can’t find it now, but I recall Steve pointing out that the graphs assume fairly constant mutation rates, which is a fair place to try. But, for example, if our solar system goes through regions of space that are higher or lower in various types of radiation or other toxicity, they could vary some. I think the point of @RichardBuggs is not to just cast doubt, but to draw legitimate boundaries around what can be said with greater or lesser confidence. Scientists don’t want to be guilty of assumptions that could be proven false, but if those assumptions are explicit because there is no better option at this time, well, better to identify those areas. I appreciate Steve stating it up front, and I think it appropriate to recognize it as a fair working assumption which could perhaps be open to revision.
[/quote]
Such possibilities are often quite bounded by other data, though. Radiation contributes a very small fraction of current mutations, with cosmic radiation being a small fraction of that. Very large increases in radiation would increase the mutation rate, but radiation-induced mutations look very different from the genetic variation we actually see in humans.[quote=“Marty, post:83, topic:37039”]
Now for the scientists, is there any way to assess variability in historical mutation rates? And for Steve, if the mutation rate halved, for example, would that double the X axis of your graphs?
[/quote]
The answer to the second question is yes. The answer to the first is yes, but not easily. The mutation rate really isn’t likely to change on short timescales, though. Most mutations are the result of internal biochemical processes, processes that are similar in similar species. As I pointed out earlier, different processes produce different mutations, and a change to one process (like radiation-induced mutation) would be evident.
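To make the x-axis scaling concrete, here is the back-of-envelope relationship (my sketch with ballpark values, not numbers from the thread): under neutrality, pairwise diversity accumulates at roughly 2μ per site per generation, so the time inferred from a given amount of diversity is inversely proportional to the mutation rate:

```python
mu = 1.2e-8    # per-site, per-generation mutation rate (commonly cited human estimate)
pi = 1.0e-3    # observed pairwise diversity per site (ballpark human value)
gen_time = 29  # years per generation, as discussed above

t = pi / (2 * mu)                      # pairwise TMRCA in generations, since pi ~ 2*mu*t
print(t * gen_time)                    # ~1.2 million years
print(pi / (2 * (mu / 2)) * gen_time)  # halving mu doubles the inferred time
```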

Broad view:
In general, it seems reasonable to conclude that the creature that’s quacking, waddling across my lawn, and looking like a duck is in fact a duck. While it is in principle possible that, thanks to a set of circumstances I haven’t thought of yet, the creature is actually a turtle, at some point it becomes the responsibility of someone arguing, ‘Hey, maybe it’s a turtle’, to provide some kind of evidence to that effect.

The bottom line is that human genetic variation looks exactly like the result of a long-term largish population plus recent expansion. It does not look at all like the result of a tight bottleneck. Simple pop gen reasoning says that such a bottleneck would leave easily detectable traces for hundreds of thousands of years. Simulations bear this out. Until someone can demonstrate that there actually exists a plausible model that would hide the bottleneck in that time frame, I don’t see much point in pursuing the question further.

BTW, my immediate plan is to take at least a few days off from contentious issues on the internet, so I intend not to comment any more for a while. It’s going to be nothing but Netflix and puppy videos for me.

8 Likes

Enjoy your rest, Steve. I find real-life puppy therapy (we have a golden retriever) even better.

But are you sure it’s not a crocoduck?

Sorry, couldn’t help myself. :slight_smile:

4 Likes

They do not claim to do the latter in the paper, so you can hardly cite the paper to demonstrate it. For the non-Africans they can make a comparison with the Africans, but they have no comparator population for the Africans. Therefore I don’t know how they could make a similar exclusion using their data.[quote=“DennisVenema, post:87, topic:37039”]
Have a look at Table 5, which shows their data for the distribution of TMRCA values. This is the data and analysis they are basing their conclusions on. Bottlenecks increase the probability of coalescence (this is also how PSMC methods work). We see a distribution of TMRCA values for the alleles in the study. This is basically what a PSMC analysis does sequentially across an entire genome to get a much larger sample size.
[/quote]
Which conclusions are you referring to? The paper’s calculations of effective population size are in Table 4. Table 5 uses a range of possible ancestral population sizes to estimate times to most recent common ancestors (TMRCA). They use their results shown in Table 4 to pick what they consider to be the most appropriate effective population size for the most reliable estimate of TMRCA in Table 5. As I said earlier, their estimates of effective population size are based on present-day numbers of segregating sites in the sample sequences and estimates of mutation rate (see the sketch below). They don’t use their TMRCA calculations to try to estimate effective population size. They don’t make any inferences about bottlenecks or not from the TMRCA results.[quote=“DennisVenema, post:88, topic:37039”]
I disagree. The methods used are capable of detecting bottlenecks - that’s why they are used.
[/quote]
The method this paper is using assumes fairly constant population sizes. Perhaps you are mistaken about what method is actually being used?
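For readers following along, the calculation being described here is essentially Watterson’s estimator, which converts a count of segregating sites into an effective population size under the assumption of constant size. A minimal sketch with placeholder inputs (not the paper’s actual numbers):

```python
# Watterson's estimator: theta_W = S / a_n, where a_n = sum_{i=1}^{n-1} 1/i,
# and theta = 4 * Ne * mu at neutral equilibrium, so Ne = theta_W / (4 * mu * L).
S = 50          # segregating sites observed in the region (placeholder)
n = 60          # number of sampled sequences (placeholder)
mu = 1.2e-8     # per-site, per-generation mutation rate (assumed)
L = 10_000      # length of the sequenced region in bp (placeholder)

a_n = sum(1 / i for i in range(1, n))
theta_w = S / a_n             # estimate of theta for the whole region
ne = theta_w / (4 * mu * L)   # implied effective population size
print(round(ne))              # ~22,000 with these placeholder inputs
```

Note that the constant-size assumption enters through the equilibrium relationship theta = 4·Ne·mu, which is exactly the point at issue.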

Hi Steve, thanks for your reply.

I was simply using Tajima’s D to try to communicate clearly and concisely about what we would expect to see, and to facilitate interaction with the literature. I think that is reasonable – normal even.[quote=“glipsnort, post:89, topic:37039”]
Second, Wakeley’s paper was looking at a very different scenario than you’re proposing. He considered a structured population that had reached a stationary state. You are proposing a population that is very far from stationary, in which allele frequencies are recovering from a tight bottleneck.
[/quote]
But it does make the point that population structure would boost the number of intermediate frequency alleles. The bottleneck hypothesis could have a scenario of rapid population expansion followed by a period with a structured population in a fairly stationary state.

I do think, though, that sub-division arising a few generations after a bottleneck, accompanied by rapid expansion of sub-populations, would result in higher numbers of alleles at intermediate frequencies in the meta-population as a whole than would be expected in a single expanding panmictic population.

Thanks for doing this! Hmm, this is interesting. Why is there a sudden drop at 0.05 (corresponding to a frequency of 0.5 in the single deme, I guess) for the demes?

Your model certainly suggests that if we consider only new mutations, with no gene flow among demes, constant deme sizes, reasonably large demes, and no convergent mutations among populations, then the effect of population subdivision might not be strong enough to explain the high numbers of intermediate frequency alleles.

I think the number of intermediate frequency alleles would be higher if we included sites that had ancestral polymorphism when the population became subdivided. Then some of the sub-populations would share the same alleles at fixation, giving them higher frequencies in the meta-population as a whole.
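As a toy version of that point (my sketch, assuming equal-sized, fully isolated demes and drift only): a neutral allele fixes in a deme with probability equal to its starting frequency, so an allele at 50% when the population subdivides tends to end up fixed in about half the demes, leaving it near 50% in the meta-population as a whole:

```python
import random

random.seed(42)
p0, n_demes, trials = 0.5, 10, 10_000   # starting frequency, demes, Monte Carlo trials
total = 0.0
for _ in range(trials):
    fixed = sum(random.random() < p0 for _ in range(n_demes))  # demes where the allele fixes
    total += fixed / n_demes                                   # resulting meta-population frequency
print(total / trials)   # ~0.5: ancestral intermediate frequencies persist after subdivision
```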

But what if the variants were supplied from the ancestral population?

Successive “Adam and Eves” rather than just one couple? Interesting idea.

OK, I see. But in your 100kya_component.jpg figure there is very little contribution from post-bottleneck mutation above 20%. Why couldn’t you make the model fit right down to there with more pre-bottleneck variation? What would scenarios look like if you did have a very large and very variable population before the bottleneck?

I really appreciate the time you have put into this, and I do hope you will continue this discussion after your break. I realise I am acting as a defense attorney in a case that seems to you to be a bit hopeless, but I think it is good for us to work this all through in detail, as there are so many people out there for whom this is an important issue for their faith - who have very heartfelt beliefs about this - and I think we owe it to them to go through this thoroughly. Personally, as I said at the end of my blog, I am open to this debate going either way. My faith will be unaffected whichever way I end up concluding after this discussion. But for some good people that may not be the case, and I think everyone participating meaningfully in this debate is doing them a great service.

1 Like

@glipsnort:

Nice demonstration of the math used in simulations!

@glipsnort

That is one beautiful paragraph! And you have the sample math to demonstrate it.

The various contenders over this position always seem to be arguing that the difference in outcome between:

1 Mating Pair 6,000 years ago

and

Thousands of Mating Pairs 100,000 years ago

is a difficult thing to sort through, and that just by using slightly different rates and proportions, either scenario could easily produce the very same results.

If we “called” them on this implied premise from the “get go”, maybe we could reduce the “heat” of the discussion, and shine light where it needs to be shined.

Part of this trend comes from these parties “daring” the scientists to run different scenarios. I think it’s high time that I.D. proponents (if that is who they represent) start learning a little science and run their own scenarios. Then they can spend the endless hours trying to tweak the data to make it do what they think it should.

Once they have accomplished that (if ever) … we will all learn exactly what kind of demographic hand-stands are required to make 1 mating pair (who are really, really motivated!) produce the results of thousands of mating pairs who just don’t seem to be trying very much!

Next time they dare, let’s hand them their own kit, and tell them to do it themselves.

Hi Richard,

I guess I’m asking you to look at the TMRCA data there and think about your hypothesis (a bottleneck to two in the last few hundred thousand years).

If you don’t think that paper is relevant to your hypothesis, I’m willing to concede the point arguendo and direct you to the other papers in the set I supplied that do more explicitly test your particular hypothesis.

I think Steve’s point is also a good one: you have to realize at some point that you’re arguing against the conclusions of an entire field of research. If you want to challenge that consensus - and in science, challenge is always welcome, of course - you’ll have to provide something in the way of evidence to support your challenge. To date, your argument has basically been that your hypothesis has not been tested. It has (as I’ve pointed out in the papers above). Steve has also nicely shown you why the allele frequency spectrum just cannot be squared with your hypothesis. To move on from here, I’d like to see some modelling from you (or someone else) that supports what you are positing.

1 Like

You’re positing a bottleneck to two individuals. How is enough variation going to make it through the bottleneck?

I know that in the past Dr. Buggs has been reported as an ID advocate, but I don’t know if he would claim that identity for himself. Years ago, Richard published a piece claiming that human and chimpanzee genome similarity could be as low as 70%, and this was touted by the Discovery Institute for some time (it was a favourite of Casey Luskin for a while). I’m not sure where Richard stands on ID and/or human/chimp common ancestry, or if he might be open to discussing his views.

1 Like

I apologize for jumping in like this; I only have one question. Does your understanding of the data support the idea of an Adam and Eve who had no ancestors at all (neither human nor pre-human), as the universal progenitors of every human who has ever lived, with no humans descending from any parallel humans or pre-humans, approximately 6,000 years ago? Thanks.

2 Likes

Please could you explain to me why you think that data excludes a bottleneck of two at some point in the human lineage? I am still puzzled as to why you pointed me to that paper in the first place.

That study identified 75 variants in this region that have a minimum coalescence time of over 700,000 years. The mode is 1.2 million years, and 700,000 is the lower bound of the 95% confidence interval for the combined sample.

So, how did all of that variation survive a bottleneck to two? It can’t. So, how did all of that variation arise after a proposed bottleneck to two? Through new mutations. How long would that take? Even with a steady-state population of around 10,000, about 1.2 million years.
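The arithmetic checks out as a back-of-envelope calculation (my sketch using the standard neutral-coalescent expectation, not the paper’s own method): the expected TMRCA for a large sample approaches 4Ne generations.

```python
ne = 10_000      # steady-state effective population size
gen_time = 29    # years per generation (assumed, as earlier in the thread)
print(4 * ne * gen_time)   # 1,160,000 years: roughly the 1.2 million figure above
```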

Even the non-African population under study, with its smaller Ne, still has the lower bound at over 300,000 years ago.

Also notice that modelling lower Ne values for each population pushes the lower bound of the TMRCA values further back in time. In other words, in a smaller population, more time is needed to account for the variation we see. Smaller populations require more time for mutations to produce new variants because each generation has fewer “trials”. If you shift that value from 10,000 to a starting population of 2, what do you think might happen?
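To put rough numbers on the “trials” point (my sketch, with a placeholder region length): the expected number of new mutations entering a diploid population each generation is 2Nμ per site, so it scales directly with N.

```python
mu = 1.2e-8   # per-site, per-generation mutation rate (assumed)
L = 10_000    # region length in bp (placeholder)
for n in (10_000, 2):
    print(n, 2 * n * mu * L)   # expected new mutations in the region per generation
# N = 10,000 -> 2.4 per generation; N = 2 -> 0.00048: a 5,000-fold difference in "trials"
```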

1 Like

Hi Dennis, I think you may be misunderstanding what Steve is modelling. All the variants in his model are SNPs with two alleles. As you have described in your blog, two alleles can get through a bottleneck. What were you thinking that Steve was modelling?

I understood what Steve was modelling. I thought you were asking for more alleles to come through the bottleneck as a way to adjust allele frequencies. If you weren’t, my bad.

When you do get back, please accept my appreciation for the detailed answers! Very helpful.

2 Likes

I think that the vast majority of biologists, and especially geneticists, would consider the case to be not merely hopeless but also ridiculous. (That’s what I think.) There is only one reason to even consider a recent and dramatic bottleneck in human history, and that is a particular (and dubious) reading of some ancient writings on unrelated topics. Indeed, the rest of your response to Steve effectively concedes this. That is why this conversation is happening on the forum of a religious organization, and not from the platform of a molecular evolution conference or in the pages of a journal.

In my view, it was reasonable to ask Dennis to clarify or even correct his pattern of citation in support of his claim that a recent bottleneck down to two individuals is as credible as geocentrism. Again, just my opinion, but you have not made a credible case against Dennis’ claims about an extreme bottleneck—in fact, that proposal is ridiculous outside a window of hundreds of thousands of years. Perhaps you have raised a credible technical objection to the ways that Dennis justified his claim in a book written for laypeople.

You and others on this forum believe it important to talk through the science for the benefit of others who share your view of the value and credibility of those ancient writings. I agree with that, simply because these religious views have import (regrettable, IMO) in our civilization. But I strongly disagree with any claim that the notion of an extreme bottleneck in recent human history is a credible proposition, worthy of any serious consideration in light of what we know about human biology and especially about human genetics. And I strongly disagree with the assertion that the existence of such a bottleneck is an untested hypothesis.

Steve was kind enough to do some nice simulations, but there was no need to do that, at least not scientifically. In fact, I would argue that there is some slight danger in what he did. The simulations and the extensive technical discussion that followed may have given some readers the impression that extreme recent bottlenecks are an open question and that only through extensive technical discussion could some experts carefully rule this out. That picture is unacceptably (to me) misleading. There was no such open question. There is no debate, or dispute, or uncertainty about a recent bottleneck, which is less nonsensical than geocentrism, I’ll grant, but only quantitatively.

3 Likes