Determining similarity statistics between the human and chimp genome

Frank · February 15, 2017, 6:10pm

Hi Mervin,

I had intended to make a response to your comments but certain persons objected to my defence of Dr Jeffrey Tomkins in the issue of the similarity between chimp-human with the consequence that my account was ‘suspended’.

Therefore, you’ll understand that I was unable to post a response to you.

In the meantime I contacted Dr Jeffrey Tomkins and apprised him of the situation.

In his very recent email to me he very graciously explained where those who disagree with his findings of only an 85% similarity have erred in their findings.

He sent me the following lengthy article, his latest research, in which he drills down into the details and provides evidence that chimp-human similarity is significantly lower than the figure claimed by scientists who are theistic evolutionists with an a priori commitment to Darwinian evolutionary processes.

Summary

In regard to data sets that included run date information, two different sets of chimpanzee DNA sequences related to the Sanger-style data sets used to construct the chimpanzee genome exist. The sequences that were produced early on in the chimpanzee genome project that contributed to the initial five-fold coverage of the chimpanzee draft genome (Mikkelsen et al. 2005) are significantly more similar to human than those that were produced later in the project by a difference of about 5% overall data set sequence identity. Contributing to this difference is the additional fact that a 5.6% difference in the amount of sequences that hit onto the human genome also exists.

When not considering run date, but instead including all sequences, two bins of data were constructed: data sets with overall identities below 90% and those above 90%. In doing this, the difference in sequence identity between the two data sets widened to 7%. This is largely due to the fact that the sequences lacking run date information were the most highly similar to human out of all the data sets. Because these data sets all contained filename numbers between 13 and 48, it is safe to assume that they contributed to the initial rough draft of the chimpanzee genome in 2005, inflating its human-like characteristics accordingly.

It may be that greater precautions towards human DNA contamination were taken later in the project producing less contamination. If the data from these seemingly less contaminated sets are considered, the chimpanzee genome is no more than about 85% similar to human. If all the data sets taken together are considered, despite the apparent human DNA contamination, then the chimpanzee genome is no more than about 90% similar to human.

It is very probable that the current chimpanzee genome assembly suffers from two major problems that make it more human-like than it should be. First, chimpanzee DNA sequences from both Sanger-style sequencing and next generation sequencing technologies, have been assembled using the human genome as a reference framework (Mikkelsen et al. 2005; Prado-Martinez et al. 2013). In other words, the chimpanzee genome does not stand on its own merits using its own framework-based genomic resources (e.g. an accurate integrated physical-genetic map for chimpanzee) as I described in an earlier publication (Tomkins 2011). Second, given the fact that significant levels of human DNA exist in non-primate databases due to laboratory and worker contamination (Longo et al. 2011), the potential for human DNA in the pre-assembled chimpanzee sequencing reads is highly probable and could be tested for by simply comparing the chimpanzee-human BLASTN analyses of the different data sets one to another. The main questions would be, are there significant differences between data sets, and are there any obvious patterns for these differences? The answer to both questions is a resounding yes.

In determining this, 101 Sanger-style publicly available trace read data sets were downloaded, providing the longest possible trace read data source, were end-trimmed for low quality bases, and purged of contaminating plasmid cloning vector sequence. Then, 25,000 sequences were selected at random from each data set and queried against the human genome using BLASTN v2.2.31 with liberal gap extension. Results from the BLASTN analysis indicated that two different groups of chimpanzee DNA sequences existed. Those that were produced early in the chimpanzee genome project that contributed to the initial chimpanzee genome publication were considerably more similar to human than those that were produced later in the project by a difference of about 5%. It may be that greater measures towards alleviating human DNA contamination were performed as the project progressed. Data from the seemingly less contaminated sets indicate that the chimpanzee genome is no more than about 85% identical to human.

Furthermore, when chimpanzee sequences that did not hit onto the human genome were blasted against the chimpanzee assembly, the average alignment identity was only 85% when 99.9 to 100% identity should have been the result if the chimpanzee genome was accurately assembled.

Swamidass · February 15, 2017, 8:52pm

I would love to talk to Jeffrey Tomkins! How did you get his contact info? Can you put him in contact with me? (My email is easily found at http://swami.wustl.edu/ )

We can start from the chimp reads, and then blast them against the human assembly (so it isn’t biased in the way Tomkins asserts), and the similarity is much stronger than he reports. There are two controls that make obvious the errors in his analysis.

1. Take similarly collected human reads and put them through the same analysis, blasting them against the human genome. You will see that human reads are nearly as “far” away from the human assembly as the chimp reads. There are many reasons why this is the case, but they are really beyond the scope here. The fact that the human reads will be almost equally far from the human genome is a clear sign of systematic error in his analysis.
2. Do the same experiment with rat reads against the mouse genome assembly (or vice versa). You will see that mice and rats are much, much more different than humans and chimps. That gives us a reference from which to interpret the similarity.
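To make the controls concrete, here is a minimal sketch of how one might summarize a single BLAST run so that a test run (chimp reads against human) can be read side by side with its control (human reads against human). It assumes the standard 12-column tabular output (`-outfmt 6`); the function names and the toy rows are hypothetical and are not taken from Tomkins' pipeline or from my own scripts.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "query pident length bitscore")

def parse_outfmt6(text):
    """Parse BLAST tabular output (-outfmt 6, default 12 columns)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Columns: qseqid sseqid pident length ... evalue bitscore
        hits.append(Hit(cols[0], float(cols[2]), int(cols[3]), float(cols[11])))
    return hits

def summarize(hits, n_reads):
    """Return (fraction of reads with a hit, length-weighted identity of best hits)."""
    best = {}
    for h in hits:
        # Keep only the highest-scoring hit for each read.
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    aligned = sum(h.length for h in best.values())
    matched = sum(h.pident / 100.0 * h.length for h in best.values())
    identity = 100.0 * matched / aligned if aligned else 0.0
    return len(best) / n_reads, identity

# Toy rows, made up for illustration only (not real BLAST results).
toy = (
    "r1\tchr1\t98.0\t500\t10\t0\t1\t500\t1000\t1499\t0.0\t900\n"
    "r1\tchr7\t90.0\t400\t40\t0\t1\t400\t2000\t2399\t1e-150\t500\n"
    "r2\tchr2\t96.0\t500\t20\t0\t1\t500\t3000\t3499\t0.0\t850\n"
)
hit_fraction, identity = summarize(parse_outfmt6(toy), n_reads=4)
print(f"{hit_fraction:.2f} of reads hit, best-hit identity {identity:.1f}%")
```

Running the same function on the human-to-human output gives the baseline: whatever shortfall from 100% shows up there is method error, not species difference.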

Tomkins is welcome to do this analysis. It is quite easy, and science requires controls. These are (at minimum) the required controls. Until he does this, there is no way to trust his analysis.

Once again, this is really easy to actually do for yourself. I have. I know the answers here, so this is why I do not trust his result.

Moreover, if he was right and I could verify it, I would get a Nature paper out of it. That would be really big and exciting news. So, of course, I checked it for myself. I would want the Nature paper!

sfmatheson · February 15, 2017, 9:12pm

Ha! They’d name a building after you at Wash U!

glipsnort · February 15, 2017, 9:16pm

Sure, for a Nature paper and a few hundred million dollar donation, I’m sure they’d go for that.

ETA: we’re in the final throes of submitting a paper to Nature, and I’m getting punchy.

Frank · February 16, 2017, 9:10pm

Frank: So, you’re a scientist and you’ve researched this issue and conducted experiments that ‘prove’ Tomkins wrong – but you can’t obtain his email address?

How about doing two things?

Firstly, how about drilling down into all the details contained in Tomkins’ latest article and stating in precise detail just how, in your opinion, he has erred?

Secondly, how about posting an article of your own personal step-by-step research and experiments?

If you do so I’ll be pleased to ask Jeff Tomkins if he wants to respond or to contact you directly. How’s that?

Jay313 · February 16, 2017, 10:34pm

Pretty lame. Just click on Swamidass’ icon and send him a private message with the email address. Unless Tomkins has a reason to hide from scientific colleagues with genuine questions about his work …

Socratic.Fanatic · February 16, 2017, 11:53pm

I’m not sure why this post is directed to me. Perhaps you were thinking of someone else. (??) I don’t know what “this” is that was a “sensible inference”.

GJDS · February 17, 2017, 12:20am

This is a portion of your post to me:

“Indeed. Genesis says that Adam was the first Imago Dei creature. But that doesn’t require that there were no other hominids at the time. It is also possible that Adam could be AN ANCESTOR of all humans today but not be the only hominid of that era from which we are descended. Accordingly, all humans today could …”

My response was to that post by you…

Socratic.Fanatic · February 17, 2017, 2:49am

I guess I don’t see it. The text tells us that other hominids existed, such as when contrasting the sons of God and the daughters of men, the meaning of the word NEPHILIM, and the fact that Cain went off to another area where he took a wife and built a city, which needs people/hominids. (He also needed a “mark of Cain” so that those who wouldn’t recognize him—that is, those not of the immediate Adamic family—would know not to kill him.) Yes, to me, the text shouts other hominids.

“The Sons of God” is exactly the term we would expect in describing another “race” of hominids who were far larger and/or stronger. (We see this linguistic phenomenon throughout human history. Bigger and stronger ones of another tribe are often described using such superlatives: as in “children of the gods”. )

Besides, it wouldn’t make any sense to have Cain marry a sister or niece and have to go to another land to do so, especially when the human race is allegedly so tiny.

Of course, it is also another case where my interpretation of the text doesn’t clash with my understanding of science. God gave us a human genome which tells us much about human history. So I know that Adam and Eve were not the ONLY ancestors for all of today’s humans. They couldn’t be only. It takes a large population for genetic diversity.

Some may resort to an easy complaint of “that’s concordism.” No. It is not concordism when we notice that the world God gave us is not contradictory. So I’m not surprised when the Bible and modern science are not in conflict, just as I am not surprised when the Theory of Evolution is entirely in harmony with the evidence we see all around us. I don’t believe God gave us a creation that is full of misleading evidence. I trust the human genome to be an honest telling of human history. I refuse to believe God would plant within our genome evidence of a history which never happened.

So when the scientific evidence tells me that Adam and Eve were not the only hominids in existence at that time and the Bible speaks of other hominids (even without as much clarity of description which some would prefer), I have no reason to turn it down. Consilience of evidence is a good thing. I’ll take it.

Swamidass · February 17, 2017, 3:18am

No, I am a scientist (see here http://swami.wustl.edu/ ) that has done experiments so I can understand how the world really is, regardless of what Tomkins or anyone else says. This is my area of study too, so I want to serve the church by having an accurate understanding of the science here.

His email is not obvious to find. Even if it was, my experience has been that (I presume) after a google search, people in his situation do not answer. They usually only engage when someone like you asks them to.

Sure.

Let me start with what I agree with him about. He is 100% correct that the chimp assembled genome is “humanized” because it is assembled using the human genome as a guide. This, without doubt, will bias similarity measurements upwards. How much, we do not know without testing it. He is also correct that focusing just on coding regions biases similarity upwards too. How much, we do not know without testing it. And his overall approach, of checking the similarity of reads to the genome assembly, is a good study design to get an unbiased estimate. Those are all well-motivated critiques and a reasonable solution.

The strategy of aligning reads to the genome (as far as I can tell) was first proposed by Todd Wood several years ago, and the general experimental design is great. Downsampling to 25,000 reads is okay too (as long as you are sure they are really random reads). He could probably downsample to 1,000 random reads and get the same answer much more quickly. Whatever the case, this part of the experimental design is great. No arguments here, and this is the same way I have tested it on my own.
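For anyone wanting to try the downsampling step, here is a minimal sketch of drawing a uniform random subset of reads from a FASTA file in one pass. It uses reservoir sampling (Algorithm R), which is my choice for illustration and not necessarily what Tomkins used; `read_fasta`, `reservoir_sample` and the fixed seed are hypothetical names and settings.

```python
import random

def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif line:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)

def reservoir_sample(records, k, seed=0):
    """Uniformly sample k items from an iterable of unknown length (Algorithm R)."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample
```

Usage would look like `reservoir_sample(read_fasta(open("reads.fasta")), 25000)`, where the file name is a placeholder. Sampling every read with equal probability matters: taking the first 25,000 records of a file is not a random sample.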

Also, his long term goal to do a complete de novo assembly of the chimp genome is reasonable. He actually has all he needs to do this already. But in time, with new technology, this will be easier to do and harder to screw up. This is such a worthy goal, it is very likely someone else will do it before him. Nonetheless, that is exactly the goal he should be aiming for.

But then come the errors. These are very large errors that make his results uninterpretable and, therefore, unpublishable in scientific journals. And to be clear, if he was right, this would be publishable. I would go get it published, and it would be big news. What were his errors? (1) He did not do any control experiments; these are what make it most obvious that his conclusions are wrong. (2) He hypothesizes about the trend he sees (caused by contamination of the chimp samples with human DNA), but does nothing to test for the clear patterns this would show in the data. (3) He hypothesizes that the chimp genome is misassembled because the human-unalignable chimp sequences do not match the chimp genome very well, but does not test this theory against more likely alternatives. (4) These results directly contradict his earlier work in the field (which computed 88% similarity on the earlier chimp reads, rather than the 95% he computes now), so which of these results are we to trust, and what went wrong last time?

So let’s unpack some of these here.

1. No controls. Here are the key controls he has to do at minimum to convince us. To be clear, these are all very easy experiments to do (they just take computer time). I’m not asking for anything outside his technical expertise or outside the scope of the study.
   a. In the same time course analysis, run human reads against the human genome, and chimpanzee reads against the chimpanzee genome. This is a positive control where we expect the similarity to be very high if his methodology is reasonable. The amount of difference we calculate between human reads and the human genome (and in the chimp case too) is the “systematic” error of his approach, and it is expected to be substantial. For example, it is well known that many human reads do not map to the genome because they are from sequences in non-assembled parts of the genome. Tomkins counts all these sequences as unique to human even though human reads have them too. That is an error. There are other major sources of error here, including sequencing error, which (surprising to some) has gone UP on a per-read basis in the time frame he is looking at. How much do these effects bias similarity downwards? Quite a bit, and the way to measure it is by blasting human reads to the human genome (and chimp to chimp). THESE TWO STUDIES ARE THE MOST IMPORTANT OF ALL CONTROLS AND ARE CRITICAL.
   b. Another good control would be to run the chimp reads against the chimp genome. Once again, this would give an estimate of the systematic error. It should be about the same as the human read to human genome comparison in the same year, which would give him a higher confidence estimate of the downward bias in his estimate.
   c. In the same time course, blast rat reads against the mouse genome. Do we see the same downward trend at the same magnitude? If so, how do we explain that if the reason is human DNA contamination? Why would human DNA contamination make rat reads look like the mouse genome? Also, what is the similarity of mice and rats compared to the similarity of humans and chimps? How do we make sense of this in the YEC framework?
   d. As a sanity check, run mouse reads against the rat genome. Do we get the same results as with rat reads against the mouse genome? All the same questions apply here.
2. He hypothesizes that the reason for the downward trend is that human contamination is being better handled over time, and points to some anecdotal evidence to this effect, but that is not really enough. He has to test it.
   a. He can see if (within a chromosome) there is a bimodal distribution of read-to-genome similarity, with one peak right near 100% minus systematic error, and the other at about human-chimp similarity minus systematic error. If he sees this in the data, that would support his case, and also allow him to estimate how many of the sequences are human contaminants (which would unfairly bias the similarity upward) and how many are non human/chimp contamination (and therefore are biasing his “overall similarity” computation downwards).
   b. Assuming he sees this bimodal hump, does its size account for the discrepancy he sees between the early years (with 95% similarity) and the later years? Does he see the human hump of contamination data reducing steadily?
   c. Can he show the same trends in human contamination in mice and rat reads too? Do those trends match the pattern?
   If he can show these things match the clear predictions from his hypothesis, then perhaps he may be right on this point. That is important information to have too, and really important scientific work. However, just throwing out this hypothesis is not convincing when there are other explanations (gradual changes in technology over time that alter systematic error in these studies). He needs to actually test the hypothesis, and as I have shown it is eminently testable with his setup. If he gets this right, other people will follow up with more rigorous confirmatory studies that are likely beyond his computational resources (they are hard studies to do too). If they prove him right though, he will get credit.
3. His theory is that the human-unalignable chimpanzee reads do not align to the chimpanzee genome because the chimpanzee genome is poorly assembled, but he does not consider alternate theories that are much more likely.
   a. It turns out that some types of DNA are much more difficult to reliably sequence than others, and have a higher error rate. They are also much harder to assemble (because they are repetitive). This shows that there is error in sequencing and assembly. All of this error (in Jeff’s computation) is chalked up as differences between humans and chimps rather than seeing if these are the root causes.
   b. What is the sequence composition of these sequences?
   c. The main assembly error he has identified in the paper is the use of the human genome as a guide. But this portion of the genome is not biased by the human alignment because the human genome is not assembled here. Why does he think this happened if not because of error in the sequencing technology being higher on these sequences? Once again, the controls I discussed in point 1 will also clarify that the same thing is happening in the human genome. Much of the drop in accuracy there can be explained as misassembly of the human genome too, but he just counts it as human-chimp differences. That is an error.
4. Most importantly, this work directly contradicts his widely quoted earlier work on the same data with nucmer, lastz and blast comparing chimp reads, which uses the earlier chimp data to compute a similarity of 88%, instead of 95% as he does in this paper.
   a. How does he explain this discrepancy? If he admits that earlier analysis was wrong, will he admit error?
   b. It is true that he now uses gapped BLAST parameters when in the past he didn’t, and that explains the BLAST results he got before, but people have told him this for years. Why did he change his mind? When did he realize this? It currently looks like he knew this early on, but waited till he got an alternate way of coming to the “right” conclusion before he published the change. Unless he explains and harmonizes the earlier result (or retracts it), it is hard for some of us to trust his work. For example, I am concerned that he would do the analysis I am suggesting here and find that I am right, but not actually publish a retraction.
   c. The gapped parameter change does not explain the difference with the nucmer results. I’ve already explained the reason for this, but how does he harmonize these results? Publishing results that entirely contradict earlier work needs to be directly explained.
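The bimodality check suggested above for the contamination hypothesis can be sketched in a few lines. This is only an illustration under stated assumptions: per-read best-hit identities are binned into 1% bins, and a "peak" is any bin that is a local maximum and holds at least 5% of the reads. The bin range, the threshold and the function names are all hypothetical choices, and a real analysis would want a proper mixture model.

```python
def identity_histogram(identities, lo=80.0, hi=100.0, bin_width=1.0):
    """Count per-read identities (in percent) falling into fixed-width bins."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for x in identities:
        if lo <= x <= hi:
            # A value of exactly `hi` goes into the last bin.
            idx = min(int((x - lo) / bin_width), n_bins - 1)
            counts[idx] += 1
    return counts

def find_peaks(counts, min_fraction=0.05):
    """Indices of bins that are local maxima and hold at least min_fraction of the data."""
    total = sum(counts)
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if total and c / total >= min_fraction and c > left and c >= right:
            peaks.append(i)
    return peaks
```

If human contamination were driving the early data sets, one would expect two peaks in those sets (one near the human-to-human baseline, one near the chimp-to-human value) and the upper peak shrinking in later sets. A single peak that simply drifts would point to changing systematic error instead.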

So those are my key critiques. And to be clear, these are just stating the standards I would expect of all my scientific colleagues. And I am asking for analysis that is entirely within his means to do. In fact, it is entirely possible he has already done the analysis. I encourage him to do the analysis and publish the results.

That analysis, if it supports his conclusions, would encourage others to look at the data for themselves, reproduce his analysis, and validate it in additional ways. If it was reproduced and validated, this would actually change what mainstream scientists believe about this question.

I was planning on doing more than that eventually. I’ll probably make the code available for you to run it yourself. Remember though, you said that you had no interest in this. It takes time to write things like this up, and I’m not sure if it is worth it if people like you end up being evidence immune. Your refusal to take me up on the offer to test the data yourself made me rethink things.

In fact, even with all this analysis and explanation, I fear you will claim I am attacking him. I am not. I am just answering your questions. You wanted to know my critique. Here it is. And I am even showing you how he can fix the errors in his study. It is now up to him to do the work to make his case. Best of luck to him.

Swamidass · February 17, 2017, 3:52am

Also @frank, I remember you were quoting this 88% figure in this very thread. Yet, here this study actually shows totally different results. I hope you can see the challenge for us here in communicating this to you.

GJDS · February 17, 2017, 4:53am

It has been some time since I have read Genesis in great detail, and from my recollection, I do not see anything in your comment that jumps at me - the major point is that knowledge of God commences with Adam and Eve, and this continues (with all of the difficulties related to how human beings behaved since then), until the covenant with Abraham. This central theme provides the basis for the centrality of faith in God, and eventually the mature understanding of Grace from God. All of this culminates in an understanding that God has been, and continues to, create sons and daughters in Christ. I think this is the context, and the entire creation is there for that purpose. Thus we may understand Paul who talks of Adam as the first man, and Christ the final perfection as the son of man.

From these scant comments, I imply that ToE in whatever form that it is presented, is unimportant, and most certainly does not show how God is carrying out His divine purpose. That is not to deny ToE is accepted by evolutionary biologists - but I can say, in a similar way, the theory of chemical bonding is unimportant in the context I describe, as it certainly does not show how God is carrying out His divine plan of creation. My remarks do not suggest conflict between any science and faith, but instead I see such things “talked up” to sustain a culture of conflict by both militant atheists and aggressive ECs.

Socratic.Fanatic · February 17, 2017, 6:36am

Thank you, GJDS. That clears up a lot. I see what you mean.

Yes, it is important to focus on the theological purpose and to minimize unnecessary cultures of conflict. I like that term! I’m often amazed at the depth of venom that can be exchanged in some forums when the sides do battle.

Just yesterday I was informed that, because I “believe in billions of years”, I could not be a True Christian™. And as often happens, it didn’t stop at that boundary. I was also told that I was “a servant of Satan” and that the person was going to take great delight in the thought of me burning in hell for all eternity. I admit that every time I hear something like that, it scares me a bit because I realize that it was that very mindset that made it possible for “devout Christians” to cheer with glee at burnings of “heretics” and to torture people during “interrogations” and many other practices I won’t list. And last night I was reading an article entitled “When Americans banned Christmas.” This sentence leaped out at me:

In 1686, the royal governor of the colony, Sir Edmund Andros, sponsored a Christmas Day service at the Boston Town House. Fearing a violent backlash from Puritan settlers, Andros was flanked by redcoats as he prayed and sang Christmas hymns.

I try to picture that: The governor of the colony has to have soldiers protect him from the Christian citizenry, the Puritans, as he prays and sings Christmas hymns. Their hatred for a Christian who chooses to pray and praise God for the birth of the Savior is so potentially violent that the governor needs armed bodyguards. It would be bad enough if they objected to singing and dancing in celebration of the Nativity. But to think Christians could be angry about prayers and hymns in a public celebration boggles the imagination.

I can’t claim to grasp it. Furthermore, I try to think back to the 1960’s when the “creation science” movement caught on and I was basically in the thick of it. Yet, I don’t recall the kind of angry animosity I see today. The vitriol scares me, even though I realize there is blame on more than one “side” and that most Christians are not like that. It is very sobering. A “culture of conflict” is something we don’t need.

GJDS · February 17, 2017, 6:50am

Your concerns are not unique and I have often wondered at how anyone can claim to follow Christ and not abhor violence. However, I also see the other side of such conflict, commencing with forced conversions (I think there are instances when a king simply declared all of his subjects become Christians, and if they did not, were punished). We have gone from that extreme, to one nowadays when people take legal action to prevent public prayer, and consider religion as evil. None of these conform to the teachings of the Gospel, so I would question the claim these acts are perpetrated by Christians - nor are the secular fighters able to save people from faith.

I guess it all boils down to this: if I, you, or anyone else are motivated by fear and hatred, we fall short of the Gospel and we need to repent, if we wish to continue as Christians.

DennisVenema · February 17, 2017, 6:52am

@Swamidass @glipsnort @sfmatheson

I’ve only just skimmed this recent offering by Tomkins, but is he averaging hits again? I.e. is he taking the average identity, with all of the attendant problems this has for repetitive DNA?

Seems like he is at first blush, but I haven’t had time to read it carefully yet.
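To show why averaging over all hits is a problem for repetitive DNA, here is a toy comparison of the two ways of summarizing the same hit table. The rows are made up for illustration and the function names are hypothetical; each row is (read, percent identity, bit score).

```python
def mean_identity_all_hits(hits):
    """Average percent identity over every reported hit."""
    return sum(pident for _, pident, _ in hits) / len(hits)

def mean_identity_best_hits(hits):
    """Average percent identity over the single best-scoring hit per read."""
    best = {}
    for read, pident, bitscore in hits:
        if read not in best or bitscore > best[read][1]:
            best[read] = (pident, bitscore)
    return sum(pident for pident, _ in best.values()) / len(best)

# Toy hit table: read2 comes from a repeat and also matches diverged copies.
hits = [
    ("read1", 99.0, 900.0),
    ("read2", 98.0, 880.0),  # the true locus
    ("read2", 85.0, 500.0),  # other copies of the repeat
    ("read2", 84.0, 480.0),
    ("read2", 83.0, 470.0),
]
print(mean_identity_all_hits(hits), mean_identity_best_hits(hits))
```

The secondary hits of a repetitive read are matches to other copies of the repeat, not to the orthologous locus, so including them pulls the average down without saying anything about how different the two genomes are.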

Socratic.Fanatic · February 17, 2017, 7:02am

I’m slowly working through this thread and it is entirely possible that I’m asking a question that has already been addressed—so I’ll try to be brief. The focus here seems to be on the exact percentage of similarity in human and chimp genomes:

Perhaps I’m naive as well as limited in my knowledge (in fact, I’m sure of that), but I thought that the ultimate importance of genome comparisons is in the nested hierarchies, the confirmations at the molecular level of the tree-like structures one sees in the traditional phylogenetic trees constructed for many years before. Surely the exact percentage of similarity in the genomes is less important than the nested hierarchical relationships in the data. Right?

To put it another way, I always thought that the raw “quantitative” matching between two species’ genomes is far less important than the “qualitative relationships” in the matches. Am I wrong? Am I missing something?

I guess I’m prone to agree that, yes, the 95% figure in and of itself does NOT “prove” common ancestry. Yet, the fact that that 95% is just a summary of much more important “qualitative” data which establishes HOW (and not just “how much”) the two species are related—and therefore share a common ancestor—is far more important. Is that the case?

If I’m correct in that understanding, it would seem to me that the question of whether the exact percentage of quantitative similarity is 95% or 92% or 98% or whatever would seem of far lesser importance.

Considering how much effort is often put into debates over the exact percentage of similarity between two species, I would appreciate some help from the experts.

DennisVenema · February 17, 2017, 7:10am

Yes, the exact percentage identity is a bit of a red herring. If the chimp lineage had gone extinct, we would be talking about gorillas and the (slightly lower) identity that we share with them. If an australopithecine lineage had survived to the present day, we would be talking about an identity greater than that with chimps. It’s all relative.

Frank · February 17, 2017, 12:55pm

Frank: You’ve made a really excellent point about the relevance of the actual percentage similarities in what is after all the issue of whether or not chimps and humans are the result of a common ancestor or a common designer.

That is the issue, is it not?

But do permit me to make a valid point I’ve made on, I think, 3 occasions and which has been ignored.

IF the similarity in chimp-humans is as high as 95 percent that equates to approximately 150 Million differences. By anyone’s reckoning that’s a huge number of differences, is it not?
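The arithmetic behind that figure is simple enough to write out. This sketch assumes a round genome size of 3 billion base pairs, which is an approximation introduced here for illustration, and treats every non-identical position as one "difference"; the function name is hypothetical.

```python
GENOME_BP = 3_000_000_000  # assumed round figure for genome size, in base pairs

def differing_bases(percent_identical, genome_bp=GENOME_BP):
    """Number of non-identical positions implied by a whole-genome percent identity."""
    return genome_bp * (100 - percent_identical) // 100

for pct in (85, 95, 98):
    print(pct, differing_bases(pct))
```

With these assumptions, 95 percent identity corresponds to 150 million differing positions, and each percentage point of identity is worth 30 million positions.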

Anyway, to address the central point in your post: “the ultimate importance of genome comparisons is in the nested hierarchies, the confirmations at the molecular level of the tree-like structures one sees in the traditional phylogenetic trees constructed for many years before.”

You then go on to say: “To put it another way, I always thought that the raw “quantitative” matching between two species’ genomes is far less important than the “qualitative relationships” in the matches. Am I wrong? Am I missing something?”

Firstly, you need to explain what you mean by “qualitative relationships in the matches.”

When you say: “the confirmations at the molecular level” this suggests to me that you are of the belief that chimps and humans are “nested hierarchically” on the basis of their genetic similarity.

But this isn’t actually the case. As far back as 1735 Carl Linnaeus in Systema Naturae named humans Homo sapiens, and placed us in the genus Homo. He also placed orangutans and chimpanzees, the two apes known at the time, in the genus Homo.

And he placed Homo in a family, which he dubbed Primates. Primates also included two other genera, simians and lemurs. Although Linnaeus believed that humans were special beings in God’s creation, he slotted our species into his system as if it were any other.

Secondly, it very much seems to me that you place a great reliance on “nested hierarchies”, “phylogenetic trees” and other classifications as evidence for a common ancestor.

In so doing you are making the assumption that nested hierarchies are evidence of common ancestry.

I’m speaking as a former atheist and would care to suggest that you may want to give very careful consideration to challenging your assumptions because not all may actually be that which it appears to be.

Below is a relevant extract from: Do All Life Forms Fall into a Nested Hierarchy? | Evolution News

“Together, these authorities make a crucial point: cladistics and other phylogenetics methods do not demonstrate common ancestry; they assume it. In other words, these methods don’t test whether all organisms fit into a nested hierarchy (i.e., phylogenetic tree). Rather, evolutionary systematics assumes that common ancestry is true and therefore all organisms belong within a nested hierarchy, and then it uses methods to force-fit any organism into the tree, even if that organism has traits that don’t fit neatly within the tree. Thus, Michael Syvanen – a rare evolutionary biologist who is open to the possibility that universal common ancestry is false – laments the pro-tree biases of treebuilding algorithms:

Because tree analysis tools are used so widely, they tend to introduce a bias into the interpretation of results. Hence, one needs to be continually reminded that submitting multiple sequences (DNA, protein, or other character states) to phylogenetic analysis produces trees because that is the nature of the algorithms used.”

(Michael Syvanen, “Evolutionary Implications of Horizontal Gene Transfer,” Annual Review of Genetics, 46:339-356 (2012) (emphases added).)

Common ancestry, therefore, is a starting assumption about the data – not a conclusion from it. Another key lesson is this: just because you see evolutionary biologists creating an impressive-looking phylogenetic tree doesn’t mean that all of the organisms or their traits shown within that tree fit neatly into a nested hierarchy (i.e., a tree structure).

Patrick · February 17, 2017, 1:09pm

Yes, Dennis, it is a red herring. How similar is each of us to a Neanderthal, a Denisovan, a floresiensis? As your book nicely explains, diversity of our genome is explained by descent from a common ancestral population. My genome is quite similar to those of my parents and my grandparents, but less so to those of my many great great great grandparents.

benkirk · February 17, 2017, 1:36pm

No, it is not. Perhaps you should look at how many differences there are between humans?