Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)


(Christy Hemphill) #599

Is anyone taking bets about whether or not Evolution News and Views will promptly publish an article about this part of the thread? :face_with_raised_eyebrow:


(T J Runyon) #600

This whole thread has been quite a learning experience. While it was frustrating at times because I’m at the beginning of my genetics education, I still feel like I benefited from this. But areas I do feel more than competent to discuss are paleoanthropology and archaeology. These areas are where the vast majority of my studies have taken place and where I have had the most training. So I’m wondering: if I take the time to write up something about the viability of a non-sapiens Adam and Eve and start a new thread to discuss it, will people here be interested in doing so? I don’t want to waste my time getting all of my sources together and writing it up if no one is interested in having that discussion. Thanks!


(Christy Hemphill) #601

Do it. People will discuss it.


(Richard Buggs) #602

In the beginning, when we were first debating this at Skeptical Zone, I noted:

“a creationist (in the conventional sense of the word) would not be concerned about this entire topic as it assumes common ancestry and creationism can have genetic diversity front-loaded into Eve’s ova anyway, thereby avoiding the whole issue of genetic diversity. I suspect many Christians, Jews and Muslims would be interested in the idea of a half million year old ancestral bottleneck of two.”

See: http://theskepticalzone.com/wp/adam-and-eve-still-a-possibility/comment-page-3/#comment-198358


(George Brooks) #603

@Swamidass

I think this is the sword upon which you must, of necessity, fall.

I think the sentence you have here is false… and demonstrably so.

The sentence that I thought you were defending would be more like this:

“There is ZERO evidence that God did not create a unique mating pair (Adam & Eve)
to contribute to the larger human population, anywhere between 10,000 years to 6,000
years ago.”

In order to defend your original statement, you would have to specify something that
forces the discussion into a non-YEC context (being silent on the issue does not make
the sentence more valid):

**"There is ZERO evidence that [hominids] did not begin as a single couple, more**
than 300,000 years ago… Zero evidence. So how do we come to heliocentric certainty
about a claim substantiated by zero evidence?"

If you attempt to replace “hominids” with “humans” (i.e., Homo sapiens), then the lack
of any human fossils from 60,000yr to 300,000yr old strata is the evidence that it didn’t happen.

What are your thoughts on this brief analysis, @Swamidass ?


(Steve Schaffner) #604

Okay, I’m about a month late and the thread has moved on, but I said I would comment about these papers, so here I am.

This is an interesting paper from a theoretical perspective, but the practical implications are really only addressed in later papers. The authors conclude that the allele frequency cannot uniquely identify the actual demographic history, but as noted in the third cited paper, the example they give is not biologically plausible.

Note: I know this paper fairly well, since I shared an office with two of the authors (Simon and Nick) at different times, including while they were writing the paper. The third author is a math heavyweight they had to bring in to get past a sticky bit. Nick had some trouble getting the paper published, not because there was anything wrong with it, but because a reviewer sat on the paper for well over a year. I was at a mathematical genetics meeting in Durham (where I really did not belong), where Nick gave a talk. He ended by pointing out that this paper had been out for review forever, that the reviewer was probably in the audience, and could he please do his job? He got the reviews back a few weeks later.

This paper does indeed consider a population bottleneck followed by exponential increase in size, and concludes that there are fundamental limits on how accurately such a demography can be reconstructed solely from the site frequency spectrum. It is important to note, however, that the difficulty they demonstrate is in reconstructing demographic events prior to the bottleneck, not the existence of the bottleneck itself. This is clear from their discussion section: “Intuitively, as the severity of the bottleneck increases, the population is increasingly likely to find its most recent common ancestor (MRCA) during that time; farther back in time than the MRCA, no information is conveyed concerning the demographic events experienced by the population.” Similarly: “Additionally, an interesting aspect of our work is that our minimax lower bounds do not depend on the number n of sampled individuals; increasing n is not enough to overcome the information barrier imposed by the presence of a bottleneck.”

This is the most interesting of the papers. It shows that the results of the first paper apply approximately for much more realistic demographies, and that tight bottlenecks can be invisible when just looking at the frequency spectrum. Again, though, there’s an important point to note: the bottleneck they simulate is still, compared to what we’ve been talking about, quite old: 2.5 times the usual 2N generations. Certainly well over a million years ago for humans. I have no trouble believing that demographic signals from that era can be erased. What I have always found implausible is that a signal from less than 0.5 x 2N could be erased, since it leaves insufficient time to accumulate new mutations and get them to high frequency.
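For anyone who wants to put rough numbers on these time scales, here is a minimal sketch of the unit conversion. The effective size of 10,000 and the 25-year generation time are illustrative assumptions, not values taken from the papers, and the function name is made up for this example.

```python
def coalescent_units_to_years(units_of_2n, effective_size, generation_years):
    """Convert a time measured in units of 2N generations into years."""
    generations = units_of_2n * 2 * effective_size
    return generations * generation_years


# Assumed, illustrative parameters (not taken from the cited papers).
N_E = 10_000
GENERATION_YEARS = 25

for units in (0.25, 0.5, 2.5):
    years = coalescent_units_to_years(units, N_E, GENERATION_YEARS)
    print(f"{units} x 2N generations is about {years:,.0f} years")
```

Changing the assumed effective size or generation time rescales every result proportionally, so the absolute dates are only as good as those two inputs.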


(Peaceful Science) #605

@glipsnort, always great to see you.

Help me understand this…

As I understand it, 2N is about 2 million years ago (approx genome wide TMRCA, right?). Your SFS studies show no signal for a bottleneck at 0.5 million years ago, which is 0.25 x 2N. Though, as I think about it, your studies used smaller population sizes than we expect (e.g. constant 10,000 at all times), so perhaps your 2N is a lot lower than 2 million years?

So where is my math wrong here? If possible…

  1. square this with your prior simulations on SFS.
  2. extrapolate to the real human data for where that cutoff might be.
  3. tell us if you think there is anything here that conflicts with the TMR4A work.

Thanks.


(George Brooks) #606

@Swamidass,

I asked you a question at the end of this posting (located above) … It’s passingly important for me to
understand some of the discussions you are maintaining. I hope you can get to it sometime
in the near future! :smiley:


(Curtis Henderson) #607

Thanks, Christy, I got a pretty good laugh out of that one!


(Peaceful Science) #608

It is not. If “human” = Homo sapiens as explained by Venema, there is no evidence that, as you almost say…

“There is ZERO evidence that God did not create a unique mating pair (Adam & Eve) (perhaps the first Homo sapiens) to contribute to the larger hominid population, when Homo sapiens arise.”

In this case, there is no evidence against it, and Homo sapiens (“human” according to Dennis) begin with a single couple.

@gbrooks9, regarding the 6 kya - 10 kya timeline, that is not really what I have put forward. For a genealogical Adam (not necessarily the first Homo sapiens), they could arise anytime before 6 kya, not just in that narrow range.


(Jay Johnson) #609

This doesn’t make any sense. But before I get to that, here are the definitions of hominid and hominin according to the Australian Museum:

Hominid – the group consisting of all modern and extinct Great Apes (that is, modern humans, chimpanzees, gorillas and orang-utans plus all their immediate ancestors).

Hominin – the group consisting of modern humans, extinct human species and all our immediate ancestors (including members of the genera Homo, Australopithecus, Paranthropus and Ardipithecus).

I don’t think that God specially created Adam & Eve to contribute to a larger population that included chimps, gorillas, and orangutans. haha. Sorry. In any case, the only hominin that wasn’t extinct 10,000 years ago was us, formally known as Homo sapiens and colloquially referred to as human beings. A unique mating pair named Adam and Eve would not be the first H. sapiens by any definition, nor would they be the first humans, unless you want to strip even that fig-leaf of dignity away from these “pre-Adamic people” (or whatever moniker they may go by in your scheme).


(Peaceful Science) #610

It is common for @DennisVenema to use “hominid” in the same way I just did. Feel free to take that up with him. Perhaps I could be more clear…

Also, the time line needed to be dropped. This would obviously be well before 10 kya.


(Dennis Venema) #611

I do use hominid, but to refer to the common ancestral population that includes the lineage leading to chimpanzees (or further back). As far as I know, in Adam and the Genome, I use hominin - species more closely related to us than chimps. So, hominid isn’t a usual term I use. If you go far enough back in my writing, you’ll come to a time where I wasn’t consistent with my usage, but that is quite a ways back.


(Dennis Venema) #612

Just keep in mind that “human” in my mind is shorthand for anatomically modern human. Also keep in mind that species designations are a fallible human attempt to draw lines on a continuum.


(Lynn Munter) #613

There are (more or less) anatomically modern human fossils in this age range in Africa (and now Israel, too).


(Jay Johnson) #614

I gathered that. The “quote” you referenced was Swamidass’ words, not mine.


(Dennis Venema) #615

I don’t know why it quoted you - I knew that the words were Josh’s, not yours. Strange.


(Richard Buggs) #616

Hi Steve,
Thank you for coming back to this, and for these useful comments, and for the anecdote about the Myers et al paper.

Regarding Terhorst and Song (2015):

I agree, but they also seem to be saying that it is not just a bottleneck that is hard to see through, but also any order-of-magnitude expansion of effective population size. See p7680: “This implies that for populations that have experienced roughly an order-of-magnitude increase in effective population size during their history, accurate estimation of demographic events that occurred before this expansion is difficult using SFS-based methods.” I would imagine that in the recent past the population of Africa has gone through a rapid increase of effective population size of at least an order of magnitude through both population growth and increased mixing among sub-populations. Wouldn’t it be hard to see back beyond this using SFS-based methods? I have to admit I have not mastered the maths in this paper, so I am just having to go on their discussion section.


(Peaceful Science) #617

Okay, here are my current thoughts on trans-species variation. I invite a deep dive in the literature to see if anyone can find a key paper I overlooked. Please prove me wrong if you can…

Trans-species variation. The evidence against an ancient bottleneck in trans-species variation is not as strong as I had thought.


As we have seen, there is a limit to how far back the evidence from Human Variation gives us confidence against a single couple bottleneck. Before about 500 kya, it is possible that such a bottleneck, if brief, would be undetected by current population genetics models. The specific number may be adjusted upwards by further analysis, but it’s a good starting point for now.

However, this is not actually the strongest argument put forward against a single couple bottleneck since we diverged from chimpanzees. For that, we have to look more closely at Trans-Species Variation.

Trans-Species Variation

Human Variation and Trans-Species Variation are related but different. To measure human variation, we look at a large number of human sequences. To measure trans-species variation, we look at a large number of both human and non-human sequences, usually chimpanzee. From looking at this data, we might find evidence of alleles that appear both in chimpanzees (for example) and humans.

This figure illustrates what appears to be happening:


https://humgenomics.biomedcentral.com/articles/10.1186/s40246-015-0043-1

The key point is that along each of the colored lines, several lineages are being shared between different species at a single place in the genome. Normally, there would be just one lineage on these time scales, but balancing selection maintains multiple lineages of alleles. By counting the number of allele lineages shared between humans and others, we can put a hard-stop lower bound on the size of any bottleneck going back to before humans and chimps diverged. Whatever bottlenecks there were, they have to be big enough to include all the trans-species lineages.

Molecular Clock Not Valid

One tempting argument, which is not quite right, is to just estimate the TMRCA (or TMR4A) of these alleles, the same as we did across the genome, and use this as an estimate of a bottleneck time. This, however, is an error.

Something called “balancing selection” is critical for enabling variation to last long enough to be shared this long between humans and other species, and this usually happens in proteins important for our immune response. So we see trans-species variation in only a few regions of the genome.

However, balancing selection violates the conditions required to accurately date variation in DNA. We cannot use our formula D = R * T here, because, in this case, we do not have a valid way of estimating R over these time frames. While in neutral regions of the genome, the average mutation rate works in our favor, at times we expect balancing selection to be increasing the rate of change in unpredictable and untestable ways. This can happen very rapidly as balancing selection can even select for increased mutation rates within this region.
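To see why an unknown rate matters so much, here is a small sketch of the D = R * T relationship solved for T. The distance and rate values are purely illustrative assumptions, not measurements from any of the papers, and the function name is hypothetical.

```python
def estimated_time(distance, rate):
    """Solve D = R * T for T, given a distance D and an assumed rate R."""
    return distance / rate


# Illustrative, assumed values: D in differences per site, R per site per year.
D = 0.006
NEUTRAL_RATE = 1e-9

# The same observed distance dated under the assumed neutral rate and under
# hypothetical elevated rates such as balancing selection might produce.
for multiplier in (1, 2, 5):
    t = estimated_time(D, NEUTRAL_RATE * multiplier)
    print(f"rate x{multiplier}: T is about {t:,.0f} years")
```

Because T scales inversely with R, any unrecognized increase in the rate makes alleles look proportionally older than they are when the neutral rate is used.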

Ayala’s Argument Against a Bottleneck

The argument here has two parts: first, from effective population size estimates, and second, from trans-species variation. I’m not going to engage the argument about effective population size, because it appears to be incorrect. A very tight bottleneck can still have a high effective population size, and it seems Ayala missed this point. But this just takes us back to the TMR4A work.
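One way to illustrate the effective population size point is the textbook harmonic-mean approximation for long-term effective size under fluctuating census size. This is a sketch under that assumption, not a reconstruction of Ayala's calculation, and the population histories below are made-up examples.

```python
def harmonic_mean_effective_size(sizes_per_generation):
    """Long-term effective size as the harmonic mean of per-generation sizes."""
    generations = len(sizes_per_generation)
    return generations / sum(1.0 / n for n in sizes_per_generation)


# Hypothetical histories spanning 10,000 generations each.
constant = [10_000] * 10_000
one_generation_couple = [2] + [10_000] * 9_999
ten_generations_of_ten = [10] * 10 + [10_000] * 9_990

for label, history in [
    ("constant 10,000", constant),
    ("single-generation couple", one_generation_couple),
    ("10 individuals for 10 generations", ten_generations_of_ten),
]:
    print(f"{label}: Ne is about {harmonic_mean_effective_size(history):,.0f}")
```

Under these assumptions a one-generation bottleneck of two still leaves a long-term effective size in the thousands, so a large estimated effective size does not by itself rule out a brief bottleneck.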

This is where trans-species variation becomes important. It gives an independent way of dating alleles. If an allele in humans is closer to non-human alleles, it appears that it existed before those two species diverged, and was maintained by balancing selection to this day.

This study by Francisco Ayala was the first, to my knowledge, to make the case against a bottleneck by studying trans-species variation in HLA alleles. https://www.sciencedirect.com/science/article/pii/S1055790396900135

This figure from Ayala shows human alleles with other primate alleles joined by similarity, not phylogenetic analysis that respects nested clades. I’ve highlighted the human alleles in this figure, and drawn red circles around 7 clusters of alleles which appear to be shared between human and other species. Remember, we can only put 4 alleles at each position in the genome of a couple, so this seems (at least on face value) to demonstrate there must have been at least 4 individuals in the tightest bottleneck of our ancestors.
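The counting argument can be written out as a quick sketch. It assumes a diploid autosomal locus and that every lineage counted is a genuinely distinct allele that passed through the bottleneck; the function names are just for illustration.

```python
import math


def max_alleles_at_locus(individuals, ploidy=2):
    """Most distinct alleles a group can carry at one locus."""
    return individuals * ploidy


def min_individuals_for_lineages(lineages, ploidy=2):
    """Smallest group able to carry the given number of distinct lineages."""
    return math.ceil(lineages / ploidy)


print(max_alleles_at_locus(2))           # a single couple
print(min_individuals_for_lineages(7))   # the 7 clusters circled in the figure
```

The bound is only as reliable as the lineage count fed into it, which is why the way the clusters are determined matters so much.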

Ayala’s summary is:

Figure 4 is a genealogy of the HLA alleles obtained by the UPGMA method, which assumes constant rates of evolution and thus aligns all 19 alleles at the zero-distance point that corresponds to the present. The genealogy suggests that 8 allele lineages were already in existence 15 Myr ago, at the time of the divergence of the orangutan from the lineage of African apes and humans; and that 12 allele lineages were in existence 6 Myr ago, at the time of divergence of humans, chimps, and gorillas.

The difference between his numbers and mine is in how we determine lineages. There is some ambiguity in how we determine the cutoffs. Still, as long as we see more than 4 lineages with trans-species variation, it seems like evidence against a single couple bottleneck. From this, he argues,

There is, however, no evidence supporting the claim that extreme bottlenecks of just a few individuals, such as postulated by some speciation models (Mayr, 1963; Carson, 1968, 1986), have occurred in association with hominid speciation events, or with major morphological changes, at any time over that last several million years.

This is probably correct, in that there is no evidence for a bottleneck that I can see. But he takes this to mean that a bottleneck has not happened: i.e. that there is evidence against a bottleneck in the last several million years. That may be incorrect.

Some Technical Asterisks

Generally speaking, this work has been understood in the field to definitively discount any notion of a single couple bottleneck. On face value, that is certainly what it looks like. However, there are some big caveats.

  1. The molecular clock based dates computed in these studies do not appear to be well calibrated.
  2. We do not really know the confidence on any of these clusters, because Ayala did not estimate them using modern Bayesian methods.
  3. He also used a similarity based method to build the trees, rather than a true phylogenetic reconstruction. This is important, because it can produce different clusters.
  4. It does not appear convergent evolution was accounted for in this analysis. Convergent evolution, at this level, can create the appearance of shared history when there is none.
  5. His population simulation used a bottleneck lasting 10 generations (e.g. 10 individuals for 10 generations), which is much longer than the bottlenecks we are considering (e.g. 2, to 10, to 500, to 2500, to 12500).

While these are interesting results, at some point this analysis needs to be redone with better methods to really determine how many lineages have persisted over the last 6 million years. Moreover, an effort to correct for convergent evolution is important here too. As for the simulation, a brief bottleneck needs to be considered, rather than just one of 10 generations.

A Finding Not Replicated

Ayala focused his work on HLA-DQB1 (one of the MHC genes), but similar work has shown trans-species variation at other locations in the genome. However, I could not uncover a single other study that shows more than 4 lineages with trans-species variation.

I cannot do a full review here, but we can see the balancing at other genes, with fewer lineages in the end. For example…

https://www.ncbi.nlm.nih.gov/pubmed/10866107

This figure is fairly typical of findings…


https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4072476/

This figure shows a molecular clock based estimate (which does not appear well calibrated) of 7 lineages at 6 mya; however, fewer than four lineages (0 in this case) are shared with chimpanzee. Reviewing several papers, I cannot find replication of Ayala’s findings of more than 4 lineages being shared between humans and other species.

We can see this pattern in this figure too…

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4072476

Here, the bold leaves are human sequences. Notice the difference between this figure and Ayala’s. There are numbers on the edges (which indicate confidence) and we just do not see nearly as many lineages in common. The authors here conclude there is just one lineage in common.

Here is another typical results figure:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3612375/

Each tree is a different region of the genome. Notice, again, that there do not appear to be more than 4 clusters with both human + chimpanzee alleles.

While Ayala is an established scientist, his work was done in 1996, well before modern sequencing efforts and modern Bayesian analysis of phylogenetic trees. While no one has published on DQB1 since he did, it is very surprising that no one else has replicated his result in the last 22 years on another locus. Of course, if someone can find a study that does, please let me know!

The apparent failure to replicate this finding (with (1) much more data, and (2) improved methods) substantially discounts my trust in his findings. We just know much more about how to analyze these sequences, and we have so many more of them. It is not surprising that our understanding might advance.

One Line of Evidence? One Paper?

At the moment, the Ayala paper appears to be the only study which shows more than 4 allele lineages with trans-species variation. His analysis, however, did not estimate confidence nor did it use phylogenetics to determine lineages. In 22 years, I cannot find a paper that replicates his finding. Certainly, trans-species variation has been observed, but not more than 4 lineages, as far as I can tell.

This is not enough evidence by which to make a confident claim against a single generation bottleneck.

The Way Forward

The right way forward, then, is to study trans-species variation with the data we have now, but with better methods than Ayala used. This takes some difficult work, however. I’m not 100% sure if we will give it a try here, but we might. This, also, is the most likely place a future study might uncover evidence against a single couple bottleneck.

Until that happens, however, I am not sure this is strong evidence against a brief bottleneck. I stand to be corrected, however, if someone can produce a study that shows this. If you find one, please send it to me.


(Peaceful Science) #618

Please, I want to know if I am wrong here. If you can find such a study, please let me know. Correct me if you can!