Signal vs. Noise, Part 2: Hunter Opens the Klassen Study Again

T_aquaticus · July 27, 2018, 2:50pm

That study uses orthologous genes, so I’m not sure what you are getting at. How are you supposed to compare genes if a gene is not found in one of the species?

Since you can’t come up with a reason, other than common ancestry and vertical inheritance, why phylogenies based on DNA sequence would recapitulate the phylogenies based on morphology, I don’t see what objection you can have to using orthologous genes.

There is a statistically significant phylogenetic signal, so the predictions have been supported.

Chris_Falter · July 28, 2018, 9:29pm

Dr. Hunter,

Hope your southern California weekend is going well.

You chose to analogize the theory of evolution to geocentrism. I am going to choose a different analogy that I believe to be more accurate and informative: X-ray crystallography and DNA structure.

Just as evolution predicts a statistically significant nested hierarchy structure in a taxonomy, biochemistry predicts that DNA can take on structural forms known as A-DNA and B-DNA. The test of the hypothesis is the similarity of the predicted X-ray crystallography images to the actual. And here I introduce some predicted and actual images from an article on quora.com, “How does one physically interpret the different diffraction patterns between A-DNA and B-DNA?”:

Now it would be possible to build a consistency index for the predicted vs. actual similarity. The CI could answer the question: for each pixel that is dark in the actual image, is the corresponding predicted pixel dark? Sum up the number of pixels for which the correspondence is true and divide by the number of dark pixels in the actual image. This approach would be very similar to the CI approach adopted by Klassen et al. in 1991, except that they were analyzing characters instead of pixels.
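A minimal sketch of the pixel-based CI described above, assuming each image is a grid of 0/1 values in which 1 means "dark." The function name `pixel_consistency_index` and the two toy grids are hypothetical illustrations, not data from the actual diffraction images.

```python
def pixel_consistency_index(actual, predicted):
    """Fraction of dark pixels in `actual` that are also dark in `predicted`.

    Both arguments are equally sized grids (lists of rows) of 0/1 values,
    where 1 marks a dark pixel. This is the toy CI from the post, not the
    character-based index used in phylogenetics.
    """
    dark_actual = 0
    matched = 0
    for row_a, row_p in zip(actual, predicted):
        for a, p in zip(row_a, row_p):
            if a:
                dark_actual += 1
                if p:
                    matched += 1
    if dark_actual == 0:
        raise ValueError("actual image has no dark pixels")
    return matched / dark_actual


# Made-up 3x4 images: the actual image is "blobbier" than the prediction.
actual = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
predicted = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
]

ci = pixel_consistency_index(actual, predicted)
print(f"CI = {ci:.3f}")
```

With these toy grids, 3 of the 7 dark pixels in the actual image are also dark in the prediction, so the index lands well below 1.0 even though the predicted diagonal is clearly present in the actual image.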

Without access to the original data, I cannot provide an exact CI for the A-DNA and B-DNA images. However, there are clearly a lot more dark pixels in the actual image than in the predicted. I would guess the CI is roughly 0.5 for A-DNA and roughly 0.25 for B-DNA, which has enormous black blobs at the top and bottom where a thin segment of dots is predicted.

The question is: should the actual data be interpreted as evidence for the predicted structures of A-DNA and B-DNA?

Answer #1 is:

No, the actual pixels are poor evidence for the theory. Certainly there is some similarity between predicted and actual. But just as you have to introduce epicycles into geocentrism to account for planetary orbits, you have to introduce extraneous factors to account for the CI values, which are far below 1.0. To the extent that you consider the theory of DNA structure to be good, it’s only because you are prefiltering the badly predicted pixels. Consequently we should consider the theory of DNA structure to be not a very good theory.

Answer #2 is:

Yes, the actual pixels are powerful evidence for the theory. The probability of the null hypothesis for the actual images (null hypothesis = random placement of pixels due to no structure) is infinitesimal–something like 0.0000000005. Therefore the alternative hypothesis, A-DNA and B-DNA, should be accepted.
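To make the null-hypothesis reasoning concrete, here is a rough sketch that assumes "random placement" means the dark pixels of the actual image are scattered uniformly at random over the grid. Under that assumption the overlap with the predicted dark pixels follows a hypergeometric distribution, so the tail probability can be computed exactly. The function name and the numbers are hypothetical; they are not derived from the real images and do not reproduce the 0.0000000005 figure.

```python
from math import comb


def overlap_p_value(total_pixels, predicted_dark, actual_dark, overlap):
    """P(overlap >= observed) if the actual dark pixels were placed at random.

    total_pixels   -- number of pixels in the image
    predicted_dark -- number of pixels the theory predicts to be dark
    actual_dark    -- number of dark pixels in the observed image
    overlap        -- observed count of pixels that are dark in both
    """
    denominator = comb(total_pixels, actual_dark)
    numerator = 0
    for i in range(overlap, min(predicted_dark, actual_dark) + 1):
        # Choose i of the predicted-dark pixels and fill the remaining
        # actual-dark pixels from the predicted-light ones.
        numerator += comb(predicted_dark, i) * comb(
            total_pixels - predicted_dark, actual_dark - i
        )
    return numerator / denominator


# Made-up example: 100 pixels, 10 predicted dark, 20 actually dark,
# and every predicted pixel is among the actual dark ones.
p = overlap_p_value(total_pixels=100, predicted_dark=10, actual_dark=20, overlap=10)
print(f"p-value under random placement: {p:.2e}")
```

Note that in this toy case the CI from the earlier definition would be only 10/20 = 0.5, yet the p-value is tiny. A modest CI and an overwhelming rejection of the random-placement null are fully compatible.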

The actual images do contain significant noise, but we have known mechanisms to account for the noise.

It is also possible that some other, as-yet unidentified hypothesis might be even more consistent with the actual images than the A-DNA and B-DNA hypotheses. If that as-yet unidentified hypothesis survives peer review, then we can adopt it. But until that as-yet unidentified hypothesis shows up, we accept the A-DNA and B-DNA theory with a high degree of confidence.

The biochemistry community has adopted Answer #2, not Answer #1. We should do likewise for the theory of evolution and the Klassen data, as well as for the more recent, genomic-based phylogenetic studies.

Best regards,
Chris Falter

Cornelius_Hunter · July 30, 2018, 5:18am

Let me try again. This traces back to the question about the Ewert paper not using sequence data, but rather presence/absence data. I explained that the problem with sequence data is that in order to align and compare sequences, the gene must be present in both species. So by definition, you are filtering out cases where one species has the gene but the other species lacks it. This is a case where you have a big difference between two species, but it is not being counted; it is filtered out.
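The filtering step being described can be sketched with plain set operations. This is only an illustration of which comparisons a sequence alignment can and cannot see; the gene labels are hypothetical placeholders, not real gene inventories, and the function name is invented for this example.

```python
def prefilter_for_alignment(genes_a, genes_b):
    """Split two gene inventories into what an alignment keeps and drops.

    Only genes present in both species can be aligned and compared by
    sequence; presence/absence differences fall outside that comparison.
    """
    shared = genes_a & genes_b      # candidates for sequence comparison
    only_a = genes_a - genes_b      # present in A, absent in B
    only_b = genes_b - genes_a      # present in B, absent in A
    all_genes = genes_a | genes_b
    dropped_fraction = (len(only_a) + len(only_b)) / len(all_genes)
    return {
        "shared": shared,
        "only_a": only_a,
        "only_b": only_b,
        "dropped_fraction": dropped_fraction,
    }


# Hypothetical inventories for two species.
species_a = {"gene_1", "gene_2", "gene_3", "gene_4"}
species_b = {"gene_2", "gene_3", "gene_5"}

result = prefilter_for_alignment(species_a, species_b)
print(sorted(result["shared"]))
print(result["dropped_fraction"])
```

Whether the dropped genes count for or against a given theory is exactly what the thread is disputing; the sketch only shows that a presence/absence analysis and a sequence analysis look at different subsets of the data.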

Your response was to say that, well, we need to have the gene present in both species in order to perform a sequence comparison. Yes, agreed, that is true. I am not disagreeing with your point, I am pointing out that you simply are reinforcing the problem which I pointed out. The data are “theory-laden.” This prefiltering removes data comparisons which are highly improbable on the theory. IOW, they do harm to the theory. They lower the probability your theory is true. (speaking in Bayesian terms here, of course).

Cornelius_Hunter · July 30, 2018, 5:24am

Well it reflects the science. I’ll boil down just one aspect of this for you:

If you take computer software as an example (as in the Ewert paper), we know two things about it:

It was designed, naturally falls into a dependency graph, and does not form a nested hierarchy.
It overwhelmingly passes the common descent type of test you evolutionists are talking about.

Hence the caution from the Klassen paper. A dataset can pass the CD test, but be designed, and not be a nested hierarchy. Now, my question for you: What does that tell you about the CD test?

T_aquaticus · July 30, 2018, 3:26pm

But how does this bias the results towards a nested hierarchy?

How so?

I could also cite a counterexample in the form of PtERV insertions in chimps and gorillas. Hundreds of insertions from this strain of retrovirus are found in the chimp and gorilla genomes, but not in the human genome. Given the accepted phylogeny of humans, chimps, and gorillas this indicates that these insertions had to occur after the split between the chimp and human lineages given the lack of PtERV insertions in the human genome. This also leads to the prediction that PtERV insertions should be found at non-orthologous positions in the chimp and gorilla genomes since they are independent insertions.

The prediction based on common ancestry is supported. All of the PtERV insertions in the chimp and gorilla genomes whose positions could be determined at single-base resolution are found at different positions. This is a case of common ancestry making predictions about non-orthologous genes and genetic features that aren’t shared, so I’m not sure why you think they are such a problem.

Cornelius_Hunter · July 30, 2018, 3:45pm

Can you be more specific? How so … what?

T_aquaticus · July 30, 2018, 4:03pm

Sure. I am referring to this part of your reply:

“I am pointing out that you simply are reinforcing the problem which I pointed out. The data are “theory-laden.” This prefiltering removes data comparisons which are highly improbable on the theory.”

How are gene deletions “highly improbable”? What exactly is “highly improbable”?

Also, I still don’t understand why using orthologous genes would bias the data towards a nested hierarchy. Can you explain this?

Chris_Falter · July 30, 2018, 4:28pm

I do not identify as an evolutionist, and I do not appreciate that label being applied to me.

I identify as a follower of Christ who, among other things, is serious about Bacon’s “two books” theology. I am also serious about serving the poor, serving my church, studying God’s Word, proclaiming the Kingdom of God, and loving my brothers in Christ even when we disagree about subjects of lesser importance.

I acknowledge that I do not always attain my goals. “Not that I have already obtained all this, or have already arrived at my goal, but I press on to take hold of that for which Christ Jesus took hold of me.” - Philippians 3:12

I would be very grateful if you would stop calling me an “evolutionist.” I do not know your motives, but it sounds like you’re trying to make my agreement with a well-accepted scientific theory a critical part of my identity.

Thanks,
Chris Falter

Chris_Falter · July 30, 2018, 5:17pm

Actually, no.

But a design model could be scientific if it

could make testable predictions about homologous sequence data, and
had a model for “noise”/confounding observations, and
produced one or more testable hypotheses for “noise” observations.

The theory of evolution has all three of these things. Until a design model has these three things, there is no point in calling this exercise a scientific comparison of models.

This is why I don’t understand why you are putting so much time and energy into this discussion. You are pointing out that there’s room for a better theory. Well, yes, science always has room for a better theory. Given that the “design model” makes no predictions about sequential data, and that the lack of such predictions disqualifies the “design model” from a Bayesian analysis of sequential data, I would think that the first order of business for you would be to develop such a model.

I’m a fan of baseball and like to attend games. Every now and then I hear some fellow make a remark about how the batter that just struck out is really no good; he, the fellow in the stands, could do better. Well, yes, maybe the fellow in the stands could. Start working on your baseball skills, you fellow in the stands, and try out for the team. As long as you’re in the stands just talking smack about the players on the field, though, you’ll have to excuse my doubts that you really have something better to offer than the guys who are already on the field.

This is why I appreciate Dr. Ewert’s paper. Yes, it had some serious flaws (discussed elsewhere) that made its Bayesian analysis go off-target, but at least it developed a third model and conducted a Bayesian analysis. The flaws are correctable in theory, so who knows what will happen on the next go-round?

As someone who worked for over two decades in software development, I disagree somewhat with your last contention. Software projects typically have “ancestors,” which are other projects that provide a template for getting started. Thus the dependency analysis would naturally show some evidence of nested hierarchy.

But yes, the fit between JavaScript modules and dependency graph structure is better than the fit with nested hierarchy, no doubt because a lot of bespoke design goes into the typical software project. And the design model can even account for the noise of software defects. The design model is quite robust in the domain of software.

First of all, I have never heard or seen the term “CD test” used. For the purpose of this discussion, I am going to assume that it is a reference to “a statistically significant signal of nested hierarchy.”

Second, I dispute the complete absence of any nested hierarchy in the JavaScript modules examined by Ewert.

Third, to the extent that a nested hierarchy signal can appear because a dependency graph incorporates a nested hierarchy as a subset, it says that there is room for a better theory to emerge that can make more accurate and robust predictions about character traits/taxa and sequential genomic data. Just as there is room for a better theory to emerge with regard to X-ray crystallography images and the structure of A-DNA and B-DNA.

Until those as-yet unidentified theories emerge, however, I suspect that the scientific community is going to continue to have confidence in the incumbent theories. I see no reason to question the scientific community’s conclusions about DNA structure or about evolution, given the absence of alternative models that predict X-ray crystallographic images and sequential genomic data, respectively.

I’ve said this how many times now? It goes back to the title of the thread, “Signal vs. Noise.” The existence of noise does not mean a theory is wrong. It just means one of two things is true:

There is room for another theory to make more accurate and robust predictions; or
The system being described is intrinsically noisy (stochastic), such that no better, parsimonious, scientific theory can be found.

I will be very interested in such an alternative theory with respect to Klassen’s data and with respect to sequential data in orthologous genes–if and when someone gets around to presenting it.

Thanks,
Chris Falter

Chris_Falter · July 31, 2018, 5:15am

Let’s take 2 eukaryotes: a conifer and a primate. Would the theory of evolution predict that they have only orthologous genes? Definitely not. Thus the presence of heterologous genes does not per se harm the theory of evolution.

At the same time, the theory of evolution makes no predictions about comparative patterns of DNA sequences within heterologous genes, as you have noted. This does leave room for some other as-yet unidentified theory that would demonstrate superiority by making parsimonious, robust predictions about DNA sequences in both orthologous genes (where evolution makes predictions) and heterologous genes (where evolution does not). In this sense and in this sense only, the presence of heterologous genes does lower the probability that the theory of evolution will never be displaced by some other theory.

Evolutionary biologists have advanced models for noise and even quantified expected noise; nevertheless, it is theoretically possible that what looks like stochastic noise (from the scientific perspective) is in fact information that can be predicted by some other as-yet unidentified theory.

Now all we need is some other theory that is capable of making parsimonious, robust predictions about DNA sequences in both orthologous and heterologous genes. Would you like to provide any links to published papers that elaborate a theory that does just that, Dr. Hunter?

Thanks,
Chris Falter

P.S. I am writing as a data scientist, not a biologist. If I have misused any terminology or misunderstood anything about the predictions made by evolution, I would welcome corrections from biologist friends.

Cornelius_Hunter · August 4, 2018, 4:27pm

Epicycles are not for free. Evolution/CD predicts a certain pattern. The science does not reveal that pattern. You can always add explanatory mechanisms to account for the evidence, but that is not a virtue. Given sufficient additional, penalty-free, explanatory mechanisms, your theory can explain anything. You can take a solar system and claim a geocentric model is a fact. They did that for two thousand years. Theory proponents tend to characterize evidence supporting their theory as normative, and contrary evidence as anomalous. Hence this thread: “Signal vs. Noise.”

If you want to expand your theory to include the additional explanatory mechanisms, then the theory is meaningless. This is what evolutionists have done. They claim the entire biological world spontaneously arose (or at least could have); that is, they claim it is a fact that strictly naturalistic processes (chance and necessity) explain the entire biological world. This goes against the science, in many different ways. One of them is classification–the species do not form the pattern predicted by evolution. But instead, evolutionists have argued that the species do not form a random pattern, and so therefore evolution is a fact. That is a fallacy. And when you point that out, they say, “but we have additional explanatory mechanisms to explain the “noise,” and besides, evolution is a fact.”

Haywood · August 4, 2018, 5:04pm

Cornelius:
Evolution/CD predicts a certain pattern. The science does not reveal that pattern.

You keep claiming that, but despite repeated requests, I do not see your explanation of that, just reiterations of it. I think the discussion would be furthered if you simply and directly explained why you think the evidence does not support common descent (not evolution, just focus on CD).

Theory proponents tend to characterize evidence supporting their theory as normative, and contrary evidence as anomalous. Hence this thread: “Signal vs. Noise.”

I think that you are mistaken, in that additional mechanisms (such as horizontal gene transfer) are added and thoroughly studied, not merely claimed to be anomalies.

If you want to expand your theory to include the additional explanatory mechanisms, then the theory is meaningless.

Can you please provide some examples of this axiom outside of evolutionary biology?

This is what evolutionists have done. They claim the entire biological world spontaneously arose (or at least could have); that is, they claim it is a fact that strictly naturalistic processes (chance and necessity) explain the entire biological world. This goes against the science, in many different ways. One of them is classification–the species do not form the pattern predicted by evolution. But instead, evolutionists have argued that the species do not form a random pattern, and so therefore evolution is a fact. That is a fallacy. And when you point that out, they say, “but we have additional explanatory mechanisms to explain the “noise,” and besides, evolution is a fact.”

I also think it would be better if you refrained from lumping (“evolution/CD”) and refrained from concentrating on what people, particularly groups of people, allegedly say, instead of simply and directly explaining your very own interpretation of the evidence.

Your Brother in Christ,
Haywood

Chris_Falter · August 4, 2018, 8:15pm

With respect to DNA sequences, there is a huge difference between epicycles and the additional explanatory mechanisms of evolution such as horizontal gene transfer and incomplete lineage sorting. HGT and ILS have been empirically observed; epicycles, on the other hand, disappeared when the astronomical evidence was more closely examined.

We are all still waiting for your citation to a paper that uses a design model to predict sequential DNA observations. Do you have anything, such that a Bayesian analysis can incorporate more than evolution and the null hypothesis?

Thanks,
Chris Falter

T_aquaticus · August 6, 2018, 7:14pm

Evolution/CD also predicts noise, which you seem to ignore.

As already shown, you are making this up. The statistics have already been laid out. There is a statistically significant phylogenetic signal.

Why shouldn’t the theory include mechanisms that we observe in nature, such as incomplete lineage sorting, horizontal genetic transfer, homoplasies, and indel mutations? You seem to have lost touch with what science is.

You need to take a statistics class. Comparison to a random pattern is the basics of any statistical analysis.

Chris_Falter · August 6, 2018, 7:43pm

This is certainly the case when there is only one hypothesis/theory that makes predictions about the data being examined. In the event that a second prediction-making hypothesis could be advanced, then a comparative Bayesian analysis could be performed.

The problem is that our friend @Cornelius_Hunter is appealing to Bayesian analysis even though the only predictive models available are randomness and evolution. In this situation, the extremely significant signal of nested hierarchy strongly validates the theory of evolution.
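A small sketch of why the set of candidate models matters in a Bayesian comparison. The posterior is normalized over whichever models are actually on the table, so a model can only gain or lose ground relative to competitors that assign a likelihood to the data. All priors and likelihoods below are invented for illustration, and the "third model" is purely hypothetical; nothing here is taken from Klassen, Ewert, or any published analysis.

```python
def posterior(priors, likelihoods):
    """Posterior probability of each model via Bayes' rule.

    priors      -- dict mapping model name to prior probability
    likelihoods -- dict mapping model name to P(data | model)
    """
    unnormalized = {m: priors[m] * likelihoods[m] for m in priors}
    total = sum(unnormalized.values())
    return {m: value / total for m, value in unnormalized.items()}


# Two candidate models only: random arrangement vs. nested hierarchy.
two_models = posterior(
    priors={"random": 0.5, "nested_hierarchy": 0.5},
    likelihoods={"random": 1e-9, "nested_hierarchy": 1e-3},
)

# Same data, plus a hypothetical third model that predicts it even better.
three_models = posterior(
    priors={"random": 1 / 3, "nested_hierarchy": 1 / 3, "hypothetical_third": 1 / 3},
    likelihoods={"random": 1e-9, "nested_hierarchy": 1e-3, "hypothetical_third": 1e-2},
)

print(two_models)
print(three_models)
```

In the two-model case nearly all the posterior mass goes to the nested hierarchy. It shifts only once a third model that actually assigns a likelihood to the data is supplied, which is the sense in which a model that makes no predictions cannot enter the comparison at all.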

Grace and peace,
Chris Falter

Chris_Falter · August 6, 2018, 7:50pm

Good question. Just as heliocentrism includes relativistic effects in the modeling of Mercury’s orbit, a good explanatory theory of the diversity and connectedness of life must incorporate these well-observed mechanisms.

T_aquaticus · August 6, 2018, 8:22pm

That’s only if you don’t include the rather obvious model of design where gene sequences are shared between very divergent species, as is the case with the organisms that humans genetically modify. As I mentioned in previous posts, incongruencies between closely related branches are expected. A mouse with a jellyfish green fluorescent protein not found in any other rodent is a very different thing.

Cornelius_Hunter · August 7, 2018, 4:59pm

Well this is what the data show. Calling it “noise” is merely a form of theory-preference. For examples of how the evidence compares with evolution and common descent, take a look at these posts:

Cornelius_Hunter · August 7, 2018, 5:20pm

Actually I did that. In fact, I have taught such classes. So let me explain to you that this evolutionary argument is a fallacy (false dichotomy). Consider a coin flip. Given certain reasonable assumptions, you can claim that the outcome must be either heads or tails. And so, if you can prove that the outcome is not heads, then you can reasonably conclude that it must be tails.

In science, there usually are more than merely two possible theories. So you cannot reasonably claim that falsifying one possible theory implies the other is true. It just isn’t that simple. Now in this case, evolutionists have taken this to an extreme, by making the other choice random design. So either the species are randomly designed, or else CD/evolution is true. This is a fallacy.

T_aquaticus · August 7, 2018, 5:34pm

What we can claim is that there is no other theory that fits the data better.

What other theory is there that makes predictions about the distribution of DNA sequences?