Testing Common Ancestry: It’s All About the Mutations

system · December 4, 2017, 9:51pm

This is a companion discussion topic for the original entry at https://biologos.org/blogs/guest/testing-common-ancestry-its-all-about-the-mutations

DennisVenema · December 7, 2017, 2:55pm

Nice work, Steve - very well explained.

Joel_Sam · December 7, 2017, 4:36pm

I believe there might be a typo in paragraph 4:

“(3) A difference between G and T (G↔C)”

I think this is supposed to be “(3) A difference between G and C (G↔C)”

BradKramer · December 7, 2017, 8:05pm

Good catch. Fixed.

glipsnort · December 8, 2017, 1:01pm

Someone asked me by email what patterns in these data would be inconsistent with common ancestry, i.e. would cause me to reject that hypothesis. This was what I replied:

If I were to turn this into a formal test, I would make it a likelihood ratio test, which is a way of comparing two models to see which explains the data better. One model would be that all inter-species differences result from mutations, and that relative mutation rates of different kinds can drift over time; the rate of drift I would estimate from data on how mutation rates differ between human populations. The second model would be one in which genetic differences could form any pattern at all. It would be straightforward to calculate how likely the observed data would be under the two models.

I would probably repeat the comparison, using for the second model one in which all classes of genetic difference were equally likely, i.e. all the columns in my plots would be the same height. If all the heights in the human-chimpanzee comparison really were closer to being equal than to the heights in the human-human plot, then it would be evidence against common ancestry. (More precisely, it would be evidence against common ancestry with mutation rates that do not change rapidly.)
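As a rough illustration of the comparison described above, here is a minimal Python sketch of a likelihood ratio between a "mutation" model and two alternatives. Everything in it is an assumption made for illustration: the four substitution classes, the counts, and the within-human proportions are invented numbers, not data from the article, and the sketch ignores the drift in relative mutation rates that the full test would have to model.

```python
import math

def multinomial_log_likelihood(counts, probs):
    """Log-likelihood of class counts under fixed class probabilities.

    The multinomial coefficient is omitted because it is the same under
    every model and cancels in a likelihood ratio.
    """
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)

def log_likelihood_ratio(counts, probs_a, probs_b):
    """Positive values favor model A, negative values favor model B."""
    return (multinomial_log_likelihood(counts, probs_a)
            - multinomial_log_likelihood(counts, probs_b))

# HYPOTHETICAL inputs, for illustration only.
classes = ["A<->G", "A<->C", "G<->C", "A<->T"]
within_human_props = [0.66, 0.17, 0.09, 0.08]   # stand-in for the human-human plot
between_species_counts = [650, 180, 90, 80]     # stand-in for human-chimp differences

# Alternative 1: every class equally likely (all columns the same height).
uniform_props = [1.0 / len(classes)] * len(classes)

# Alternative 2: "any pattern at all", i.e. proportions fitted to the data.
total = sum(between_species_counts)
saturated_props = [n / total for n in between_species_counts]

llr_vs_uniform = log_likelihood_ratio(
    between_species_counts, within_human_props, uniform_props)
llr_vs_saturated = log_likelihood_ratio(
    between_species_counts, within_human_props, saturated_props)

# G statistic for the mutation model against the fitted alternative.
g_statistic = -2.0 * llr_vs_saturated

print(f"log LR, mutation model vs uniform:   {llr_vs_uniform:.1f}")
print(f"log LR, mutation model vs saturated: {llr_vs_saturated:.3f}")
print(f"G statistic: {g_statistic:.3f}")
```

The fitted ("saturated") model can never have a lower likelihood than the mutation model, so that ratio is always zero or negative; the question is whether it is more negative than chance alone would allow. Under the usual large-sample approximation, the G statistic would be compared with a chi-square distribution with three degrees of freedom (number of classes minus one).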

I would also confine the analysis to parts of the genome most likely to be nonfunctional.

T_aquaticus · December 8, 2017, 5:13pm

This is one aspect I have been thinking about.

One possible response ID/creationists could have is that God produced the sequence differences in genes (i.e. functional DNA) but let the rest of the genome diverge through random mutations. Since the overwhelming majority of the genome is non-functional, the designed mutations in the functional bits of the genome would be swamped by the random mutations in non-functional DNA.

What would it look like if we just compared functional DNA? Should we see the same bias for specific types of substitutions that we see in non-functional DNA? Does negative selection on non-synonymous mutations introduce a bias of its own?

Just some thoughts . . .

glipsnort · December 8, 2017, 7:48pm

This would require, though, that human and chimpanzee nonfunctional DNA were originally nearly identical, and also that human and gorilla nonfunctional DNA were originally nearly identical, and that human and baboon nonfunctional DNA were originally nearly identical, and so on, even though human/baboon divergence is something like five times human/chimpanzee divergence. I don’t see how it can be made to work.

T_aquaticus · December 8, 2017, 9:18pm

I completely agree that the data rules out separate creation events, with the caveat that God wouldn’t purposefully design organisms to look like they evolved.

Just for the sake of argument, let’s say that we adopt Behe’s view:

1. Common ancestry is true, and the natural history described by the consensus view is largely correct.
2. God introduces beneficial mutations in genomes throughout their development and history.
3. Random mutations also occur.

Let’s also assume that the mutations God introduces are numerous enough that we could detect them in some manner. We would also assume that the vast majority of beneficial mutations that God introduces would be found in the fraction of the genome we currently consider to be functional.

In this scenario, could we predict what type of substitution bias evolution would produce in functional DNA and then see if those predictions hold up? Could a statistically significant deviation from the predicted pattern at least point away from random mutations being the cause for the differences in functional DNA?

I also know that you are a busy dude, so don’t feel like you have to answer these questions. I really liked your article and the figures you put together. I am just trying to anticipate what the possible criticisms from the ID crowd are going to be. It’s also a healthy exercise to try and take apart arguments you agree with.

Edward_T_Babinski · December 9, 2017, 4:28pm

Schaffner! Many years ago I ran across this argument of yours:

'Where is the creationist or I.D.ist model that explains the following types of observed genetic data? Such a model should produce estimates of the following measurable genetic data for modern humans:

'(1) The minor allele frequency spectrum.

'(2) The relationship between minor allele frequency and probability that the minor allele is the same as the chimpanzee base at that site.

'(3) The ratio of transition (purine<->purine or pyrimidine<->pyrimidine) polymorphisms to transversion (purine<->pyrimidine) polymorphisms.

'(4) The ratio of polymorphisms at CpG sites to the overall polymorphism rate.

'(5) The distance over which significant linkage disequilibrium extends in a chromosome.

'(6) The genetic distance (difference in allele frequencies) between African and non-African populations.

'(7) The difference between African and non-African populations in the extent of linkage disequilibrium.

'(8) The distance over which significant autocorrelation in heterozygosity extends in a chromosome.

'(9) The ratio of fixed transition to transversion differences between humans and chimpanzees.

'(10) Same as (9), but for CpG sites.

'There are other possible questions, but these are a reasonable starting point, since the quantities in question are all ones that I routinely use evolution to predict or interpret. If the claim is true that creationists/I.D.ists look at the same data and just interpret it differently, there should be no difficulty in providing the creationist interpretation of these data. (Note that the answers should be derivable by anyone using the same model.)

‘Iʼm happy to answer questions about my list (which is deliberately terse — I didnʼt feel like writing a survey of population genetics). Young-earth creationists should have the most trouble meeting my challenge. As you allow more and more time, and more and more evolution, it becomes harder to distinguish special creation from evolution. In the extreme case where all God does is cause a small number of critical mutations in the development of humans, the results will look exactly like evolution (provided the mutations occur in a fairly large population). In that case, of course, you have to wonder why those mutations also couldnʼt have happened on their own, since every other mutation can.’

I reproduced your argument above in my Scrivenings blog post, 6 Reasons Why More Biologists Are Not Pro-I.D. (Intelligent Design).

Glad to see you are still reasoning with creationists and those I.D.ists who deny that the evidence for common ancestry is strong. Speaking of which, see my Scrivenings post, Leading I.D.ists & Creationists Admit Evidence that Humans & Apes Share Common Ancestry.

Swamidass · December 9, 2017, 5:43pm

@T_aquaticus I agree, but let me add another view to your perspective.

This is not Behe’s view, but it is a reasonable approximation of Hugh Ross’s view.

Detection seems very unlikely.

Remember that #2 (God-introduced mutations) would be much, much less frequent than #3 (random mutations), perhaps by a ratio of 1 to 1000 or more. We think only about 2000 mutations were selected for in, for example, our last 6 million years of history, during which 30M mutations were fixed. Basically, God could have inspired some of those mutations, but would have been acting in the noise of the distributions that @glipsnort computed.

For non-humans, this all seems fine because the model you present is equivalent to just saying God inspires some mutations. There is no clear evidence for or against this because the signal-to-noise ratio is so low.

For humans, however, it becomes implausible if we insist on an Adam and Eve with no interbreeding, because of the bottleneck problem. Of course, if we are okay with interbreeding, as I have shown, there is no problem here either. Evolutionary science shows strong evidence for common descent, but tells us nothing about whether or not God inspired mutations, much as it tells us nothing about why the asteroid that wiped out the dinosaurs came when it did.

In humans, the signal is just thousands of mutations, against a noise of 30M. It would be very difficult, perhaps impossible, to detect such a signal, and for the same reason we cannot rule it out. God could have intervened, but we cannot tell from the data. The data look like they were shaped predominantly by random mutations, and these mutations introduce so much noise that we cannot even reliably identify which mutations were important for function. Instead, DNA is much better at telling us our evolutionary history, and is not useful for discriminating between the hypotheses you’ve put forward.
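To put rough numbers on that signal-to-noise argument, here is a small Python sketch using only the two figures quoted in this post (about 2000 selected mutations out of about 30M fixed). The variable names are illustrative, and the "largest possible shift" is a deliberately crude upper bound, not a statistical test.

```python
# Figures quoted in the post above; both are approximate.
selected_mutations = 2_000
fixed_mutations = 30_000_000

# Fraction of fixed mutations that were selected for.
signal_fraction = selected_mutations / fixed_mutations

# Equivalent "1 in N" ratio.
one_in_n = fixed_mutations // selected_mutations

# Crude upper bound: even if every selected mutation fell into a single
# substitution class, that class's share of all fixed mutations could move
# by at most this many percentage points.
max_shift_percentage_points = 100.0 * signal_fraction

print(f"signal fraction: {signal_fraction:.2e}")
print(f"about 1 in {one_in_n}")
print(f"largest possible shift in one class: {max_shift_percentage_points:.4f} percentage points")
```

On these figures the signal is about 1 in 15,000, comfortably beyond the "1 to 1000 or more" ratio mentioned above, and it could move any one column of a substitution plot by less than a hundredth of a percentage point.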

I agree @glipsnort. Great job.

glipsnort · December 10, 2017, 2:26am

As @Swamidass points out, we would not be able to pick out a modest number of functional changes introduced by something other than mutation.

T_aquaticus · December 11, 2017, 4:31pm

In reality this might be true, but I am thinking of it more along the lines of a null hypothesis. What would we need to see in order to falsify the hypothesis that random mutations and selection were responsible for the divergence of functional DNA between species?

I also think it is incorrect to assume from the start that something isn’t detectable. I think it is worth it to at least look.

That’s why I was thinking of comparing just the mutations in functional DNA, which would exclude random mutations occurring in the vast majority of the genome. We could even exclude synonymous mutations in functional DNA to further limit the data set. If we go with a less conservative value of 10% functional DNA in the human genome, as determined by conservation between chimp and human, then we would be looking at fewer than about 3 million mutations between human and chimp, depending on the amount of negative selection. Functional non-coding DNA would probably be difficult to model, so we may have to focus on coding DNA so we have a better feel for mutations that will have an effect (i.e. synonymous vs. non-synonymous mutations).
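The back-of-the-envelope estimate above can be written out as a short Python sketch. The 10% functional fraction comes from this post and the 30M total is the figure quoted earlier in the thread; the function name and the `fraction_removed_by_selection` parameter (with its example value of 0.5) are hypothetical placeholders, since no measured value for negative selection is given here.

```python
def functional_differences(total_differences, functional_fraction,
                           fraction_removed_by_selection=0.0):
    """Rough count of differences expected to fall in functional DNA.

    Assumes differences are spread evenly across the genome, then
    discounts the share assumed to be removed by negative selection.
    """
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    if not 0.0 <= fraction_removed_by_selection <= 1.0:
        raise ValueError("fraction_removed_by_selection must be between 0 and 1")
    expected = total_differences * functional_fraction
    return round(expected * (1.0 - fraction_removed_by_selection))

total_differences = 30_000_000   # figure quoted earlier in the thread
functional_fraction = 0.10       # the "less conservative" value from this post

# Upper bound: no negative selection at all.
upper_bound = functional_differences(total_differences, functional_fraction)

# HYPOTHETICAL example: negative selection removes half of those differences.
with_selection = functional_differences(
    total_differences, functional_fraction, fraction_removed_by_selection=0.5)

print(f"upper bound: {upper_bound:,}")
print(f"with assumed selection: {with_selection:,}")
```

With no negative selection the bound is about 3 million; any amount of negative selection pushes the count below that, which is why the estimate is stated as an upper limit.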

Or I could be off my rocker and am failing to see something intrinsically wrong about my approach.