Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)

tallen_1 · January 15, 2018, 8:32pm

Phil,

I was using the term “missionary” metaphorically of course ;). That said, I think the YEC community does an amazing amount of harm. They spread a distrust of science and established scientific fact, and a pernicious tribalism that glorifies everything within their echo chamber while seeding suspicion and dismissal without. So, in the metaphorical sense, I’d value all the missionaries we can send their way.

Swamidass · January 15, 2018, 10:39pm

Yes, you do. There are enough challenges that science brings to theology that we do not need to force them where they do not exist. Certainly some Christians see a need for Adam. Many do. Let’s just be honest about the science with them and see how it shakes out from there. Perhaps they will end up just a different type of Christian than us in the end. That is okay, as long as they are in Christ.

I think mediator is a better term than missionary. But I am aiming to change their views so that we all might adopt a more Christ centered faith. With Jesus as the foundation, we adopt a more orthodox faith and these questions are not so concerning.

My biggest problem with scientific YEC creationism is not its scientific problems but its theological emphases. Our faith is not grounded in a specific understanding of creation and the human effort to study nature (creation science or evolutionary science). Rather, our faith is grounded in Jesus, the one who rose from the dead. By Him, we can find a confident faith, unthreatened by evolution.

As for Adam? There are reasons why the common solutions offered by TE / EC Christians are unsatisfying to many Christians. My point is just that we should be empathetic to the questions of the Church, and not force conflict where it need not be. Instead, let’s be honest about how the theological values we bring to the table could work in light of mainstream science.

Regarding Adam and Eve as homologous clones.

Yes, of course some people think that. However, that is not what @agauger and @RichardBuggs have argued, nor is it what John Sanford or most knowledgeable YECs I know have argued. That one model of Adam is falsified by the data does not mean the other is falsified too. If we are going to make strong claims about what the evidence rules out, it behooves us to take seriously all the alternate hypotheses we can. To the point, for at least 5 years now, most YECs in this area have been trying to work out a model of created diversity. We cannot claim to rule this model out unless we actually examine it in light of the evidence.

Of course, as should be clear from my prior posts, barring miracles this seems to be ruled out in the recent past. However, in the more distant past, maybe not. It seems false to say we have certainty there was no single couple bottleneck over the last several million years (as has been put forward). The evidence does not seem to substantiate that claim, except perhaps in trans-species variation (but we haven’t even discussed that yet).

jpm · January 16, 2018, 5:33pm

15 posts were split to a new topic: Reaching out through Adam

jpm · January 16, 2018, 5:38pm

Replies moved to new post to clean things up a bit, tough to find a good cut off so may have to modify a bit but bear with me: “Reaching out through Adam”

RichardBuggs · January 16, 2018, 9:43pm

Hi Joshua,

This is just going to be a rather brief holding response as I have only got part way through my first read of the ARGweaver paper, and a series of mini-crises in my lab are taking up much of my time now, so it may be a while before I can give you a fully considered response. I don’t want to leave you waiting for too long, so here is a quick reply, with all the shortcomings that this necessitates.

I agree with you that the ARGweaver results, given the assumptions and simplifications behind its analyses, do appear, on your further analyses of its graphs, to give a reasonable bound of a bottleneck of 420 kya +/- 100 kya. I don’t say this as someone who has worked through all of the analyses for a second time: I just say this as what you have described seems to me to be reasonable. To be perfectly honest, I am quite surprised at how low this figure is. If you had asked me to guess beforehand I would have probably suggested a higher figure.

Having said that, on my rather shallow reading of the work so far, I would be slightly cautious, in that just having 4 lineages left in a population does not mean that those 4 lineages are all found in just two individuals, as I think you have already pointed out. Demonstrating that only four alleles are left in a population is a necessary pre-requisite of a bottleneck of two, but is not in itself evidence that a bottleneck of two actually occurred. For the four alleles to coalesce to the point of being in the same human bodies may take quite some while and could add a bit to the timing (this depends on effective population size, of course, as has been frequently noted in the discussion above). I think I would therefore have more confidence in your lower bound than your upper bound.

My point, "I do think that the coalescent models used in a test of the bottleneck hypothesis would need to include the effective population size decreasing down to two as we go back in time," was a reiteration of a point that I have made several times before in the discussion above when discussing coalescent analyses in the Zhao et al. paper.

In the ARGweaver paper, in a footnote to Table 1 the authors write “Model allows for a separate Ni for each time interval l but all analyses in this paper assume a constant N across time intervals.” It sounds to me as if they use a constant Ne. I have to admit that I find the paper rather confusing on the point of effective population sizes, but you have spent longer than I have working out exactly what they did, so I look to you for enlightenment.

[By the way, I was reflecting on my point “I do think that the coalescent models used in a test of the bottleneck hypothesis would need to include the effective population size decreasing down to two as we go back in time” after I had posted it, and I have a caveat about this. No method of estimating Ne based on genetic diversity (that I am aware of) is capable of identifying a short sharp bottleneck of two as an Ne of two. That is because every method (that I know of) would need the population size to remain constant for at least a few generations before it could estimate an Ne of 2. Thus, I think we can safely say that when effective population size is defined by an equation based on genetic data, a single generation of census size two does not have an Ne of 2, but of a higher number (exactly what I don’t know - I guess it would depend on the size of the pre-bottleneck population and the rate of population expansion afterwards). I have - in effect - made this point before, but have never quite formulated it in my mind in these terms, so thought it might be worth sharing for discussion/correction.]

I think the most important take home message for me from your ARGweaver analyses is that (as far as I can see) you have shown nicely that genome-wide allele counts do not provide evidence that a bottleneck of two has not happened in the human lineage. That is a real step forward in our understanding of this area. Thank you!

Swamidass · January 16, 2018, 11:20pm

No problem. Believe it or not, I have my own fires I’m putting out in my lab this week. I also have 4 public events I’m doing next week, including one with Hugh Ross. So a lot is going on here too.

Take your time in responding, but there is a lot you’ve written here I want to echo.

Thanks for offering your public thoughts on that too, as hopefully that should put some criticism to rest.

I agree. This was surprising for me too. I would have guessed a different number. It is still important to caveat that this is just a subset of the data, and subject to revision to a more ancient date with more evidence. It is hard, however, to imagine it being revised to more recent than 300 kya.

In particular, I would point to two pieces of evidence that are not considered here:

Genetic evidence of interbreeding with Neanderthals/Denisovans seems to put a bound on a single couple bottleneck back to the Homo sapiens common ancestor with them, perhaps 400 kya to 700 kya. This analysis of TMR4A used the median, and would underestimate that date because interbreeding with them only affects the minority of the genome (it seems).
Interspecies variation will always put an asterisk on these results. Though on re-looking at the literature, it is very sparse. In contrast with population genetics estimates of population size, there are only a handful of papers that address this. I have not been able to find definitive evidence of >4 alleles at a single locus with interspecies counterparts. However, that may be just because no one has looked at enough of the Chimp data (which is very sparse). Nonetheless, I’m less convinced now than when I started that this will be (at least in current form) definitive evidence against a couple bottleneck. Still, we have not taken it into account here.

The phrasing here is difficult, but it sounds like you are saying you have more confidence in a bound on single-couple bottlenecks of 520 kya than 320 kya. I’d agree with you here too.

That is correct, and I have been working out the math on this, and planning some experiments. It looks like the key variables for Ne are (1) how many generations are at a single couple (just one in our case), (2) the number of offspring they have and degree of exponential growth in the few subsequent generations (which we can assume here is very high), and (3) how distant this is in the past (as the averaging window for Ne increases in the past). Keeping in mind that #2 is essentially a free parameter, #1 and #3 are such that the farther back we go in time, the less a single couple generation affects Ne. So, therefore, a single couple bottleneck can be entirely consistent with a very high Ne in the distant past (say at 500 kya).
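
The dependence on the averaging window can be illustrated with the textbook approximation that long-term Ne is the harmonic mean of the per-generation population sizes. The Python sketch below is not the math or the experiments mentioned above; the growth factor, the cap of 10,000 and the function names are all assumptions chosen purely for illustration.

```python
def harmonic_mean_ne(sizes):
    """Long-term Ne approximated as the harmonic mean of per-generation sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def population_sizes(window, bottleneck_gens=1, growth=4.0, cap=10_000.0):
    """Per-generation sizes over an averaging window that contains the bottleneck.

    Hypothetical model: `bottleneck_gens` generations at size 2, then
    multiplication by `growth` each generation until `cap` is reached.
    """
    sizes = []
    n = 2.0
    for generation in range(window):
        if generation >= bottleneck_gens:
            n = min(n * growth, cap)
        sizes.append(n)
    return sizes


if __name__ == "__main__":
    # A wider averaging window dilutes the single generation of size two.
    for window in (1, 10, 100, 1_000, 10_000):
        ne = harmonic_mean_ne(population_sizes(window))
        print(f"window = {window:>6} generations -> Ne ~ {ne:,.0f}")
```

Under these assumed parameters the single generation at size two dominates a short window but is progressively diluted as the window lengthens, which is the qualitative behaviour described in points #1 and #3.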

Once again, this is just an informal description, but there are some interesting details in the math. Sufficiently interesting, I’m nearly convinced it’s worth turning into a publication in its own right. Interesting stuff.

This is an important caveat to emphasize.

This is not evidence for a single couple bottleneck but evidence that population genetics will be unable to detect a bottleneck in the distant past. It is an argument that such a bottleneck would be hidden in the shadows, and not clearly seen in the data.

Moreover, median TMR4A, as you suggest, may be too generous a cutoff. My instinct is that we should probably use a CDF cutoff of about 70 to 80% instead of 50%, which would put TMR4A at about 525 kya. Though, I cannot be certain on instinct. The right way forward is to determine this cutoff from simulations, which I am nearly convinced are worth the effort to embark on. There seems to be good theoretical reason to think that detection power will correspond relatively tightly with some cutoff on the TMR4A CDF. Once I get around to doing those experiments, we’ll have a much better bound.
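
The effect of moving the CDF cutoff can be sketched in a few lines of Python. This is only an illustration: the TMR4A values and weights below are made-up placeholders, not ARGweaver output, and `weighted_quantile` is a hypothetical helper rather than a function from the ARGweaver code.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight fraction reaches q (0 < q <= 1)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Placeholder per-region TMR4A values (kya) with equal weights; NOT real data.
toy_tmr4a = [100, 200, 300, 400, 500]
toy_weights = [1, 1, 1, 1, 1]

if __name__ == "__main__":
    for cutoff in (0.50, 0.70, 0.80):
        print(cutoff, weighted_quantile(toy_tmr4a, toy_weights, cutoff))
```

Raising the cutoff from the median to a higher quantile can only move the reported TMR4A to an older (or equal) date; which cutoff tracks detection power is exactly what the simulations would need to settle.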

They are not clear in the paper. But the code itself is clear. The only way they seem to use Ne is to set the prior, and the prior (if anything) pulls TMR4A downwards. However, the influence of the prior on the joint likelihood (just 5%) is very low, so I’m not really concerned about this. There is no plausible reason I see to doubt the results of this study because of their use of the prior. As I have explained, the prior thinks median TMR4A is at about 200 kya, but we compute it at about 420 kya. So the data is pulling the estimate upwards, and there is sufficient data to totally overwhelm the prior.

To be clear, I agree that “genome-wide allele counts” alone are not very good evidence. Moreover, their overall diversity does not provide evidence against a single couple bottleneck after about 400 kya (subject to revision). They do, however, provide evidence against a more recent bottleneck, as you have already affirmed.

As many people have noted, for most people, this is a fairly disturbing challenge to theology. Perhaps some will find solace in an ancient Adam that was not Homo sapiens. At the moment, that seems to be an outlier position, though perhaps it will grow, especially as it seems we are beginning to come to a consensus here.

RichardBuggs · January 17, 2018, 9:56pm

Hi Joshua,

I am glad we have reached such a level of agreement.

Regarding ARGweaver:

I wonder if the code itself is pointing in a slightly different direction to the paper. Their footnote under Table 1 suggests that the code does allow for separate Ne estimates at each time interval, but for all the analyses in the paper itself they assumed that Ne did not vary among time intervals. Perhaps they did this to speed up the analysis as they had such a large dataset? I still have more reading to do of this paper.

I am not sure how much difference this would make anyway, as (as we both agree) any method they used to estimate Ne would likely not detect a bottleneck, if one had in fact occurred.

That is brilliant. Do keep me updated!

Interestingly, I believe that this has been the position of @agauger all along.

RichardBuggs · January 20, 2018, 10:14pm

Hi all,

I have been doing a bit more reading about the theoretical background of some of the methods we have been discussing here. I am not a mathematician, so much of this is outside of my area of expertise. However, I have come across three papers that seem to suggest that site frequency spectra (as presented earlier in this discussion) have severe limitations as a source of evidence about past population sizes. The second of these papers specifically examines scenarios of a bottleneck followed by exponential population growth.

Simon Myers, Charles Fefferman, Nick Patterson Can one learn history from the allelic spectrum? Theoretical Population Biology, Volume 73, Issue 3, 2008, pp. 342-348
https://www.sciencedirect.com/science/article/pii/S0040580908000038
Abstract: It is well known that the neutral allelic frequency spectrum of a population is affected by the history of population size. A number of authors have used this fact to infer history given observed allele frequency data. We ask whether perfect information concerning the spectrum allows precise recovery of the history, and with an explicit example show that the answer is in the negative. This implies some limitations on how informative allelic spectra can be.

Terhorst, Jonathan, and Yun S. Song. Fundamental limits on the accuracy of demographic inference based on the sample frequency spectrum. Proceedings of the National Academy of Sciences 112.25 (2015): 7677-7682.
http://www.pnas.org/content/112/25/7677.short
Abstract: The sample frequency spectrum (SFS) of DNA sequences from a collection of individuals is a summary statistic that is commonly used for parametric inference in population genetics. Despite the popularity of SFS-based inference methods, little is currently known about the information theoretic limit on the estimation accuracy as a function of sample size. Here, we show that using the SFS to estimate the size history of a population has a minimax error of at least O(1/log s), where s is the number of independent segregating sites used in the analysis. This rate is exponentially worse than known convergence rates for many classical estimation problems in statistics. Another surprising aspect of our theoretical bound is that it does not depend on the dimension of the SFS, which is related to the number of sampled individuals. This means that, for a fixed number s of segregating sites considered, using more individuals does not help to reduce the minimax error bound. Our result pertains to populations that have experienced a bottleneck, and we argue that it can be expected to apply to many populations in nature.

Baharian, Soheil, and Simon Gravel. “On the decidability of population size histories from finite allele frequency spectra.” Theoretical population biology (2018).
https://www.sciencedirect.com/science/article/pii/S004058091730148X
Abstract: Understanding the historical events that shaped current genomic diversity has applications in historical, biological, and medical research. However, the amount of historical information that can be inferred from genetic data is finite, which leads to an identifiability problem. For example, different historical processes can lead to identical distribution of allele frequencies. This identifiability issue casts a shadow of uncertainty over the results of any study which uses the frequency spectrum to infer past demography. It has been argued that imposing mild ‘reasonableness’ constraints on demographic histories can enable unique reconstruction, at least in an idealized setting where the length of the genome is nearly infinite. Here, we discuss this problem for finite sample size and genome length. Using the diffusion approximation, we obtain bounds on likelihood differences between similar demographic histories, and use them to construct pairs of very different reasonable histories that produce almost-identical frequency distributions. The finite-genome problem therefore remains poorly determined even among reasonable histories, where fits to few-parameter models produce narrow parameter confidence intervals, large uncertainties lurk hidden by model assumption.
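
To give a feel for the O(1/log s) rate quoted in the Terhorst and Song abstract, here is a minimal Python sketch. It ignores the unknown constant in the bound and simply compares how 1/log s shrinks next to a 1/sqrt(s) rate, which is used here only as an assumed stand-in for a classical convergence rate; the function names are illustrative and not taken from the paper.

```python
import math


def log_rate(s):
    """1 / ln(s): the shape of the minimax bound, constant factor ignored."""
    return 1.0 / math.log(s)


def sqrt_rate(s):
    """1 / sqrt(s): an assumed classical parametric rate, for comparison only."""
    return 1.0 / math.sqrt(s)


if __name__ == "__main__":
    for exponent in (3, 6, 9, 12):
        s = 10 ** exponent
        print(f"s = 1e{exponent}: 1/log s = {log_rate(s):.4f}, "
              f"1/sqrt s = {sqrt_rate(s):.2e}")
```

Because log(s^2) = 2 log s, squaring the number of independent segregating sites only halves 1/log s, whereas the 1/sqrt(s) rate falls by a factor of sqrt(s) over the same step.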

So I think I should add these to the criticism I made earlier of this approach to @glipsnort here:

RichardBuggs:

However, I have to admit that although I think that your arguments from allele frequency spectra could potentially make a good test of the Adam and Eve bottleneck hypothesis, I would need to see this worked through in considerably more detail before I was fully persuaded that it was an adequate test. I have been reading a bit more widely about site frequency spectra and the factors that can affect them in a few spare hours. In particular I found these recent papers helpful:

Harpak, A., Bhaskar, A., & Pritchard, J. K. (2016). Mutation Rate Variation is a Primary Determinant of the Distribution of Allele Frequencies in Humans. PLoS genetics, 12(12), e1006489.

Ferretti, L., Ledda, A., Wiehe, T., Achaz, G., & Ramos-Onsins, S. E. (2017). Decomposing the site frequency spectrum: the impact of tree topology on neutrality tests. Genetics, 207(1), 229-240.

Koch, E., & Novembre, J. (2017). A Temporal Perspective on the Interplay of Demography and Selection on Deleterious Variation in Humans. G3: Genes, Genomes, Genetics, 7(3), 1027-1037.

Gao, F., & Keinan, A. (2016). Inference of super-exponential human population growth via efficient computation of the site frequency spectrum for generalized models. Genetics, 202(1), 235-245.

These papers have strengthened my view that a wide range of complex demographic, phylogenetic, selective and mutational processes, together with sampling strategies, can influence site frequency spectra, and that I therefore cannot conclude from the models that you have run that a bottleneck of two in the history of the human lineage is not possible. To be convinced I would need to see more complex models run that try to incorporate these factors.

In addition, I came across this paper which @DennisVenema may find interesting as he writes his blog about the PSMC method

Kim, J., Mossel, E., Rácz, M. Z., & Ross, N. (2015). Can one hear the shape of a population history?. Theoretical population biology, 100, 26-38.
http://www.sciencedirect.com/science/article/pii/S0040580914000987?via%3Dihub

I have also been reading up more on ARGweaver and intend to post again on this soon @Swamidass .

Swamidass · January 21, 2018, 1:07am

TedDavis:

Dennis,

I appreciate the great clarity of your reply to Dr. Buggs–not that an absence of clarity has ever been something I would associate with your work.

I hope that Discovery also tweets your reply to Dr. Buggs. They owe it to fair discourse to do exactly that much, since they are responsible for bringing Buggs’ concerns out of the academic tent and into their own, much larger tent. Otherwise, they might be skirting with the same danger that Buggs is worried about: that “of alienating Christians from science on the basis of a wrong interpretation of the current literature.”

I resonate with that concern. That’s one of the main reasons why I decided to devote my professional life to helping Christians (and others too) understand the history better. Thank you for helping us understand the science better.

Hello @TedDavis, I hope you are well my friend. Things have come along substantially since you first posted on this thread, back about 2 months ago. I summarized the scientific highlights of this conversation here.

Surprisingly, at least to me, @RichardBuggs was on to something. Our certainty about a bottleneck in the distant past (e.g. before 500 kya) may not be as high as we imagined. As I write here…

And the implications for theology…

Now, @TedDavis, I agree with you that a recent genealogical Adam (A Genealogical Rapprochement on Adam?) is probably more significant in the long run than an ancient single-couple bottleneck. This, nonetheless, is a surprising finding. Assuming, of course, that it pans out. We are still early in the game, and might find a mistake. This reminds me, in many ways, of a similar point we were at almost exactly 12 months ago on the genealogical Adam work.

Nonetheless, this really could pan out, and some Christians might join @agauger in taking this view. At the very least, many of the claims on the science have been overstated if it takes this much effort to disprove an ancient bottleneck, and we have yet to do so.

I’m curious, therefore, about your thoughts on a few levels as a historian many of us trust in this conversation:

How do you think an ancient bottleneck couple will influence the conversation?
How do you think a recent genealogical Adam will influence the conversation?
If TE / EC’s have overstated or been overconfident on the evidence, how should this correction rework our voice?
Do you know any good historical analogies to these two corrections, if they end up being correct?
I am planning for the ASA Workshop in June in Boston on “Reworking the Science of Adam.” What do you think are the key things for the ASA community to know about these exchanges?

Thanks for your thoughtfulness here. I’m wondering how your perspective could guide us here. Many of us are doing what we can to serve the Church, and the science of Adam appears to be a place where the ball was fumbled.

Swamidass · January 21, 2018, 1:23am

Allele frequency spectrums (AFS) do not give a solid view of ancient bottlenecks, but they do of recent population structure. Ironically, very recent bottlenecks are not well ascertained by MSMC and PSMC and LD-Blocks, but they are clear in AFS. This is covered pretty well here:

So yes, in the ancient past you cannot really infer much from AFS, but that has never been @glipsnort’s claim. His claims are consistent with what I showed with argweaver.

@glipsnort has not made any claims of heliocentric certainty.
He would agree that past about 500 kya, we do not expect allele frequency spectrums to detect a bottleneck of a single couple. That is where he places a tentative cutoff. So his results are essentially the same as argweaver, though the evidence from argweaver is much stronger.
His original reason for delving into AFS was to respond to some young earth creationists that claimed the AFS was inconsistent with a large ancient population and required a single couple origin just 6,000 years ago: (Can someone explain like I'm 5 yo, what's wrong with this refutation of Biologos?).
His response to Ola Hossjer (colleague of @agauger) has been very well measured, and entirely correct. (Glipsnort responds to a critical article) Notice that he does not press a case against ancient bottlenecks, but only for common ancestry of great apes and humans, and against a recent bottleneck. Both those claims are very well supported by the evidence, and he produces analysis of his own all the time.

I know you are not attacking @glipsnort personally, or even leveling an unfair scientific critique. I do think, however, it is important to clarify that he has been a measured and careful voice. In my opinion, he has not drawn incorrect conclusions from the AFS work, nor has he overstated his certainty of those results.

Swamidass · January 21, 2018, 10:28am

A couple technical updates:

ArgWeaver Does Not Assume Large Population. The computed TMR4A is biased downwards, not upwards, by the prior.

The Correct Mutation Rate. ArgWeaver is using an experimentally confirmed mutation rate.

Swamidass · January 21, 2018, 11:06am

And, more importantly, this improvement of the estimate…

Correctly Weighting Coalescents. An improved estimate of TMR4A is about 500 kya.

I finally got around to correcting this part of the code, and recomputing the TMR4A. Here is what we arrive at, a TMR4A of 495 kya, nearly 500 kya. This is a better estimate.

https://discourse-cdn-sjc2.com/standard9/uploads/peacefulscience/original/1X/94c9420257f170b3e5f847aff3363ba3451568a2.png

Jay313 · January 21, 2018, 6:48pm

An actual H. erectus (or heidelbergensis) named “Adam” might have been capable of naming “Eve” and the animals, but not much more. Of that much, we are certain …

RichardBuggs · January 22, 2018, 9:08pm

Hi Joshua,

I’m just catching up with this dialogue on a train. I should be marking essays, but will just take a moment to quickly respond to a couple of points.

Thanks, I had not seen that exchange before between Ola Hossjer and @glipsnort. Very interesting. However, it does pre-date the current discussion, and I am keen to hear Steve’s own response to the papers I have referenced on the AFS method. I agree with you that he has been a measured and careful voice in this discussion and I have great respect for his expertise.

But would you agree that in their analyses reported in the paper they have assumed a constant effective population size? If not, how do you understand the footnote to the table that I referenced above?

My train has just arrived at King’s Cross Station - sorry to have to sign off. I greatly appreciate your work on this thread, and the honesty and open-mindedness that you have shown.

gbrooks9 · January 23, 2018, 4:51pm

@Swamidass:

Once you go back beyond 6,000 years, and especially 10,000 years, what’s the point of trying to prove a bottleneck “older than 10,000 years, and hidden in a shadow”?

If it creates a motivation for YEC’s to preserve their position in an Old Earth Scenario… good… let them work for that.

Our job has been to show that the “Young Earth” part of any Christian’s world view is untenable. The more YEC’s work to legitimize an Old Earth Scenario, the better it will be for everyone!

glipsnort · January 23, 2018, 9:16pm

I hope to get back to this thread within a few days.

RichardBuggs · January 25, 2018, 8:47pm

Hi Joshua @Swamidass
I am taking a look at the ARGweaver paper more thoroughly. It is very clear that the ratio of mutation rate to recombination rate is critical to the accuracy of the method, as the authors comment in the paper, and as several of their supplementary figures (S4-S8) show. When the mutation rate is high relative to the recombination rate, they have much more power than when it is low. However, I am struggling to see what recombination rate they used or estimated when analysing the 54 human genome sequences. Do you know what recombination rate was used? I notice that on page 8 they comment that ARGweaver has “a slight tendency to underestimate the number of recombinations, particularly at low values of mu/rho” and also that they say that other sources give a low value of mu/rho for human populations. This suggests that in their analysis of the 54 human genomes they may well have estimated a lower rate of recombination than the correct rate. However, I can’t find the figure. Is this something that you have looked at, please? If they have underestimated the recombination rate, how do you think that would affect the TMR4A?
best wishes
Richard

RichardBuggs · January 25, 2018, 8:48pm

Steve, that’s great news. I would also be really glad to hear your view on Joshua’s analyses of the ARGWeaver data, if you have time.

Swamidass · January 27, 2018, 1:21am

@RichardBuggs please excuse the delay in responding to you. I’d normally put a high priority on it, but my father unexpectedly passed away this last Saturday. I will return with haste, but have more pressing matters at the moment. Peace.

gbrooks9 · January 29, 2018, 2:35am

@Swamidass,

My deepest sadness to hear this news. Prayers for you and your family! George Brooks