Signal vs. Noise, Part 2: Hunter Opens the Klassen Study Again

Well, think of it this way. If you look at the sky, you’ll see that everything goes around the Earth. It is an undeniable pattern that must be explained. You could say the cosmos exhibits an approximate geocentric pattern. Copernicus published his heliocentrism, which had many similarities to geocentrism: you move the Sun to (roughly) the center, but otherwise everything still travels in circles. The so-called “Copernican Revolution,” as it has been constructed, is largely a myth. The supposed cultural shift caused by moving Earth from the center is highly exaggerated. And Copernicus’ model wasn’t very good either. But it was an important moment in model improvement.

So regarding Ewert’s paper and that quote, it certainly is true that at a glance the species can give the impression of a nested hierarchy. That was the idea from Aristotle to Linnaeus, so there is something there. The nested hierarchy model is simpler and easier to conceptualize than the DG model, and the latter can be fitted into the former. That is, if you simulated a DG process, constructed a biological world of genes and species, and then fed the data into a phylogenetic tree-building algorithm, it would return a reasonable tree. Many additional mechanisms would be needed (homoplasies, divergences, etc.), but it would work. Simply put, a DG can be shoehorned into a CD model. This is what the paper was getting at. We can say that the species exhibit an approximate nested hierarchy, with the understanding that “approximate” here can mean a pretty lousy model, just as geocentrism is pretty lousy.

That is strikingly bold rhetoric, Dr. Hunter, considering that your “heliocentrism” can make almost no predictions about the orbits of planets, moons, asteroids, etc.

Here’s another metaphor that is apt for our discussion:

In the land of the blind, the one-eyed man is king.

The theory of evolution may have somewhat blurry vision, but it’s the only theory in the hypothesis space that is making any predictions with respect to the vast body of evidence.

Grace and peace,
Chris

The DG model has not been fitted to sequence data, as even Ewert explained. There is currently no DG explanation for the phylogenetic signal in sequence data. As noted by @glipsnort, the DG model can’t even predict the pattern of sequence differences between species with respect to transitions, transversions, and CpG mutations. The DG model also can’t explain orthologous ERVs, genetic equidistance, or the divergence of introns and exons. The common descent model can explain all of these pieces of sequence data; the DG model cannot.
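As an aside, the transition/transversion pattern is easy to make concrete. Below is a toy sketch (the sequences are made up; real analyses use genome-scale alignments) of how differences between two aligned sequences get classified:

```python
# Classify substitutions between two aligned sequences as transitions
# (purine <-> purine or pyrimidine <-> pyrimidine) or transversions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def ts_tv_counts(seq1: str, seq2: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue  # identical site, no substitution
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv

print(ts_tv_counts("ACGTACGT", "GCGTATGA"))  # (2, 1) for these toy strings
```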

Klassen’s statement is not elegantly phrased. Notwithstanding that, I submit with all due respect that you have misread Klassen here.

He does not say that you cannot use the 49 data sets to reject the null hypothesis.

Instead, he is saying that rejecting the null hypothesis does not necessarily imply being able to identify with confidence the single cladogram, within the permutation space of all possible cladograms, that best fits a particular data set. Single best-fit cladograms, one per data set, are the “phylogenetic conclusions” to which Klassen is referring.

This is a common problem in conducting a search over an NP-hard problem domain. You can use some sort of hill-climbing to reach a local maximum, but once you get there, can you be 100% confident it is the global maximum? This is the problem Klassen is dealing with. He is not disputing the ability to identify the existence of a mountain range (a phylogeny) as opposed to the prairie land of randomness. Nor is he disputing the ability to find the highest peak within the portion of the range that has been searched. He is simply pointing out that the methods that existed in 1991, and perhaps even the methods that exist today, are not able to claim with 100% confidence that the best-fitting phylogeny that emerges from a necessarily constrained analysis is the best-fitting phylogeny for a domain under study.
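To make the local-vs.-global distinction concrete, here is a toy sketch in Python. The one-dimensional landscape stands in for the space of cladograms; it illustrates the search problem, not Klassen’s actual procedure:

```python
import random

def hill_climb(score, neighbors, start):
    """Greedy ascent: move to the best strictly-better neighbor until none exists."""
    current = start
    while True:
        better = [n for n in neighbors(current) if score(n) > score(current)]
        if not better:
            return current  # a local maximum, not necessarily the global one
        current = max(better, key=score)

# Toy landscape on 0..99 with a local peak at 20 and the global peak at 80.
def score(x):
    return max(10 - abs(x - 20), 20 - 0.5 * abs(x - 80))

def neighbors(x):
    return [n for n in (x - 1, x + 1) if 0 <= n <= 99]

random.seed(0)
peaks = {hill_climb(score, neighbors, random.randrange(100)) for _ in range(20)}
print(peaks)  # typically {20, 80}: different starting points reach different peaks
```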

Klassen refers to two other publications to support his conclusion:

  • FAITH, D. P., AND P. S. CRANSTON. 1991. Could a cladogram this short have arisen by chance? On permutation tests for cladistic structure. Cladistics 7:1-28.
  • FARRIS, J. S. 1972. Estimating phylogenetic trees from distance matrices. Am. Nat. 106:645-668.

Due to paywall constraints, I have only been able to read the abstracts of these publications. The abstracts very clearly deal with the frustrations of finding a best-fit cladogram, rather than with the question of whether the null hypothesis of randomness is overcome. The ability to reject the null hypothesis is not questioned in the least by the publications that Klassen refers to, at least as far as I can tell from the abstracts.

Thanks, and have a great southern California day.

Chris

You can’t use DNA (or protein) sequences. That’s the problem. By using sequence data, you are prefiltering the data set down to only those sequences that are present in all the species being compared. Hence it isn’t accurate; it is a self-fulfilling prophecy. If you prefilter to only have data that support your theory, then of course you’ll get very nicely behaved data. But if you are interested in realism, then you will look at the preponderance of the evidence. (I’m repeating myself from another thread, but you brought up this subject.)

Hello Dr. Hunter,

I don’t understand this assertion. My understanding is that research studies use homologous sequences so that they can measure Levenshtein distances. Shorter distances are correlated with closer relationships, longer distances with more distant relationships. If you don’t use homologous sequences, however, then you have no way to measure distance.
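For concreteness, here is a minimal sketch of the Levenshtein metric (the sequences are hypothetical, and real pipelines typically use alignment-based substitution models rather than raw edit distance):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Two made-up homologous sequences differ by only two edits:
print(levenshtein("GATTACA", "GACTATA"))  # 2
```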

As long as a sufficiently long sequence (or multiple sequences) is used, however, the null hypothesis is that the trees emerging from analyses of different segments should be no more similar than chance would produce. If there is no historical basis of common descent, a tree m produced by happenstance from one segment is highly unlikely to be similar to a tree n from another segment.

If all of the segments yield the same or very similar trees, however, then the concordance can become statistically significant evidence of a history of common descent.

For that matter, even if a single study yields a false positive due to a tiny sample size, other studies of different sequences for the same taxa would be quite unlikely to yield a similar tree if there were no historical basis in common descent.
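To put a rough number on “quite unlikely”: the space of possible trees explodes with the number of taxa, so agreement between independently derived trees is extraordinarily improbable under the null hypothesis. A back-of-the-envelope sketch (taxon counts chosen for illustration):

```python
# The number of distinct unrooted binary trees on n taxa is
# (2n-5)!! = 3 * 5 * 7 * ... * (2n-5), so the chance that a second
# randomly drawn tree matches the first is 1 / (2n-5)!!.
def num_unrooted_trees(n: int) -> int:
    """Count unrooted, binary, leaf-labeled trees on n taxa (n >= 3)."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # the odd factors 3, 5, ..., 2n-5
        count *= k
    return count

for n in (10, 20, 30):
    trees = num_unrooted_trees(n)
    print(f"{n} taxa: {trees:.3e} possible trees; "
          f"P(match by chance) ~ {1 / trees:.1e}")
```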

Now if you could provide evidence that all of the phylogenetic studies are based on single, very short sequences, then I could understand your concern. Such an approach could indeed produce false phylogenetic positives. However, I don’t think that you will be able to provide such evidence; all of the phylogenetic papers I have read from the past few years (admittedly, not that many) use multiple, long sequences in their analysis.

In addition to any citations you could provide, Dr. Hunter, I would also welcome input from others who are well-read in the literature, such as @DennisVenema, @sfmatheson, @T_aquaticus, and @glipsnort.

Thanks,
Chris Falter

While you are cogitating on this issue, Dr. Hunter, perhaps you could help us by describing the predictions a design model would make with regard to patterns (if any) in the Levenshtein distance of homologous DNA sequences in multiple taxa.

Thanks,
Chris Falter

Evolutionists use homologous characters so they can measure distance.

Yes, precisely. That’s what I mean by “prefiltering.” An enormous wealth of data are filtered out. The methods themselves are theory-laden.

Let’s try again, Dr. Hunter.

If the design model cannot make any predictions regarding patterns (or lack thereof) in these studies of homologous sequences, then how are we supposed to do model selection in a scientific way?

Yes, every scientific research project is informed by theory. So I am not exactly sure what you are aiming at here.

I am going to go out on a limb and guess that it is your belief that heterologous sequences prove that evolution is inferior to some other unspecified theory which makes more accurate predictions about the simultaneous presence in a taxonomic analysis of both:

  • nested hierarchies in homologous sequences and
  • no hierarchy in heterologous sequences.

The theory of evolution accomplishes this through stochastic modeling that incorporates factors like incomplete lineage sorting, convergent evolution, copy-and-modification, etc. The theory of evolution actually has a model for the noise. But you think a model is already available which makes more robust, quantitatively accurate, and parsimonious predictions than evolution.

However, according to Winston Ewert, this unspecified theory cannot possibly be dependency graph analysis, because DG makes no predictions, as of today and likely for some time into the future, with regard to sequence data.

If I have guessed incorrectly, kindly provide any corrections you think are suitable.

If I have guessed correctly, kindly name the competing theory and tell us what predictions it makes with regard to any patterns in sequence data, and how those predictions are derived.

Thanks,
Chris Falter

Since this thread started with the Klassen 1991 metastudy, it’s worth pointing out that Klassen did not “prefilter” any of the traits or taxa from his Consistency Index analysis. Nevertheless, Klassen demonstrated an extremely strong signal for nested hierarchy. (“Strong” here refers to the statistical significance of the signal.)
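For readers unfamiliar with how such a significance claim is established, here is a toy sketch of the permutation-test logic (a stand-in illustration with made-up characters, not Klassen’s actual CI calculation):

```python
import random

def congruence(cols):
    """Mean fraction of taxa on which each pair of characters agrees."""
    pairs = [(a, b) for i, a in enumerate(cols) for b in cols[i + 1:]]
    return sum(sum(x == y for x, y in zip(a, b)) / len(a)
               for a, b in pairs) / len(pairs)

def permutation_p_value(cols, n_perm=10_000, seed=1):
    """Shuffle each character column to destroy cross-character structure,
    then ask how often chance matches the observed congruence."""
    rng = random.Random(seed)
    observed = congruence(cols)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in cols]
        if congruence(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Three hypothetical binary characters scored across six taxa, largely congruent:
chars = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
print(permutation_p_value(chars))  # small p-value: congruence unlikely by chance
```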

So no, support for nested hierarchy does not disappear in the absence of “prefiltering.”

Best,
Chris Falter

Why is that a problem? It’s kind of hard to sequence DNA that has been deleted in a lineage.

The theory of evolution predicts that phylogenies of sequenced orthologous DNA will recapitulate the phylogenies based on morphology. That is the prediction that is being tested. If the DG model does not make a prediction with respect to the differences and similarities in the sequence of orthologous DNA, then it is an inferior model to the theory of evolution.
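“Recapitulate” can be made quantitative. Here is a minimal sketch (hypothetical taxa and trees) that scores how many clades two rooted trees share:

```python
def clade_overlap(tree_a: set, tree_b: set) -> float:
    """Fraction of clades shared between two rooted trees, where each tree
    is given as a set of frozensets of taxon names (one per clade)."""
    return len(tree_a & tree_b) / len(tree_a | tree_b)

# ((human, chimp), gorilla) from morphology, and the same topology from DNA:
morph = {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "gorilla"})}
dna   = {frozenset({"human", "chimp"}), frozenset({"human", "chimp", "gorilla"})}
print(clade_overlap(morph, dna))  # 1.0: full agreement

# A conflicting tree, ((human, gorilla), chimp), shares only the root clade:
dna_alt = {frozenset({"human", "gorilla"}), frozenset({"human", "chimp", "gorilla"})}
print(clade_overlap(morph, dna_alt))  # ~0.33
```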

HUH???

Why would the simple fact of sharing a DNA sequence guarantee that those shared sequences would recapitulate the phylogenies based on morphology? You need to explain this.

We already know from our own design programs that this isn’t true. We have inserted an exact copy of a jellyfish gene into mice, which produces a DNA phylogeny that is completely different from the morphological phylogenies. Simply sharing DNA does not force a fit between phylogenies based on DNA and morphology.

That’s because the theory of evolution makes predictions of what you will see in orthologous sequence. Therefore, you can use orthologous sequences to test the theory of evolution.

You just answered your own question.

Please see:

If so, then evolution is false by modus tollens.

When a theory generates false predictions, it is not a very good theory.

That study uses orthologous genes, so I’m not sure what you are getting at. How are you supposed to compare genes if a gene is not found in one of the species?

Since you can’t come up with a reason, other than common ancestry and vertical inheritance, why phylogenies based on DNA sequence would recapitulate the phylogenies based on morphology, I don’t see what objection you can have to using orthologous genes.

There is a statistically significant phylogenetic signal, so the predictions have been supported.

Dr. Hunter,

Hope your southern California weekend is going well.

You chose to analogize the theory of evolution to geocentrism. I am going to choose a different analogy that I believe to be more accurate and informative: X-ray crystallography and DNA structure.

Just as evolution predicts a statistically significant nested hierarchy structure in a taxonomy, biochemistry predicts that DNA can take on structural forms known as A-DNA and B-DNA. The test of the hypothesis is the similarity of the predicted X-ray crystallography images to the actual ones. Here I refer to the predicted and actual images from an article on quora.com, “How does one physically interpret the different diffraction patterns between A-DNA and B-DNA?”

Now it would be possible to build a consistency index for the predicted vs. actual similarity. The CI could answer the question: for each pixel that is dark in the actual image, is the corresponding predicted pixel dark? Sum up the number of pixels for which the correspondence is true and divide by the number of dark pixels in the actual image. This approach would be very similar to the CI approach adopted by Klassen et al. in 1991, except that they were analyzing characters instead of pixels.
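In code, that calculation would look something like the following (with made-up arrays standing in for the real images):

```python
import numpy as np

def pixel_ci(predicted: np.ndarray, actual: np.ndarray) -> float:
    """For each dark pixel in the actual image, check whether the
    corresponding predicted pixel is dark; return matches / actual darks."""
    dark_actual = actual.sum()
    matches = np.logical_and(predicted, actual).sum()
    return matches / dark_actual

# Hypothetical boolean images (True = dark pixel), not the real diffraction data:
rng = np.random.default_rng(42)
predicted = rng.random((100, 100)) < 0.2
actual = np.logical_or(predicted, rng.random((100, 100)) < 0.15)  # plus noise
print(f"CI = {pixel_ci(predicted, actual):.2f}")  # well below 1.0, yet correlated
```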

Without access to the original data, I cannot provide an exact CI for the A-DNA and B-DNA images. However, there are clearly a lot more dark pixels in the actual image than in the predicted. I would guess the CI is roughly 0.5 for A-DNA and roughly 0.25 for B-DNA, which has enormous black blobs at the top and bottom where a thin segment of dots is predicted.

The question is: should the actual data be interpreted as evidence for the predicted structures of A-DNA and B-DNA?

Answer #1 is:

No, the actual pixels are poor evidence for the theory. Certainly there is some similarity between predicted and actual. But just as you have to introduce epicycles into geocentrism to account for planetary orbits, you have to introduce extraneous factors to account for the CI values, which are far below 1.0. To the extent that you consider the theory of DNA structure to be good, it’s only because you are prefiltering the badly predicted pixels. Consequently, we should consider the theory of DNA structure to be not a very good theory.

Answer #2 is:

Yes, the actual pixels are powerful evidence for the theory. The probability of the null hypothesis for the actual images (null hypothesis: random placement of pixels due to no structure) is infinitesimal, something like 0.0000000005. Therefore the alternative hypothesis, A-DNA and B-DNA, should be accepted.

The actual images do contain significant noise, but we have known mechanisms to account for that noise.

It is also possible that some other, as-yet unidentified hypothesis might be even more consistent with the actual images than the A-DNA and B-DNA hypotheses. If that as-yet unidentified hypothesis survives peer review, then we can adopt it. But until that as-yet unidentified hypothesis shows up, we accept the A-DNA and B-DNA theory with a high degree of confidence.

The biochemistry community has adopted Answer #2, not Answer #1. We should do likewise for the theory of evolution and the Klassen data, as well as for the more recent, genomic-based phylogenetic studies.

Best regards,
Chris Falter

Let me try again. This traces back to the question about the Ewert paper not using sequence data, but rather presence/absence data. I explained that the problem with sequence data is that, in order to align and compare sequences, the gene must be present in both species. So by definition, you are filtering out cases where one species has the gene but the other species lacks it. In such cases you have a big difference between two species, but it is not being counted; it is filtered out.

Your response was to say that, well, we need to have the gene present in both species in order to perform a sequence comparison. Yes, agreed, that is true. I am not disagreeing with your point; I am pointing out that you are simply reinforcing the problem I identified. The data are “theory-laden.” This prefiltering removes data comparisons which are highly improbable on the theory. IOW, they do harm to the theory; they lower the probability that your theory is true (speaking in Bayesian terms here, of course).

Well, it reflects the science. I’ll boil down just one aspect of this for you:

If you take computer software as an example (as in the Ewert paper), we know two things about it:

  1. It was designed, naturally falls into a dependency graph, and does not form a nested hierarchy.

  2. It overwhelmingly passes the common descent type of test you evolutionists are talking about.

Hence the caution from the Klassen paper. A dataset can pass the CD test, but be designed, and not be a nested hierarchy. Now, my question for you: What does that tell you about the CD test?

But how does this bias the results towards a nested hierarchy?

How so?

I could also cite a counterexample in the form of PtERV insertions in chimps and gorillas. Hundreds of insertions from this strain of retrovirus are found in the chimp and gorilla genomes, but not in the human genome. Given the accepted phylogeny of humans, chimps, and gorillas, and the absence of PtERV insertions in the human genome, these insertions had to occur after the split between the chimp and human lineages. This also leads to the prediction that PtERV insertions should be found at non-orthologous positions in the chimp and gorilla genomes, since they are independent insertions.

The prediction based on common ancestry is supported. For all of the PtERV insertions in the chimp and gorilla genomes whose positions could be determined at single-base resolution, the insertions are found at different positions. This is a case of common ancestry making predictions about non-orthologous genes and genetic features that aren’t shared, so I’m not sure why you think they are such a problem.
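A back-of-the-envelope calculation shows why coincident positions would be so surprising under independence (the insertion count below is illustrative, and real analyses use orthologous coordinates rather than raw positions):

```python
# Expected number of coincident positions if ~140 insertions occurred
# independently in each of two ~3 Gb genomes.
genome_sites = 3_000_000_000
insertions_each = 140  # illustrative count, not the published figure
expected_collisions = insertions_each * insertions_each / genome_sites
print(f"{expected_collisions:.2e}")  # ~6.5e-06: essentially zero by chance
```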

Can you be more specific? How so … what?

Sure. I am referring to this part of your reply:

“I am pointing out that you are simply reinforcing the problem I identified. The data are ‘theory-laden.’ This prefiltering removes data comparisons which are highly improbable on the theory.”

How are gene deletions “highly improbable”? What exactly is “highly improbable”?

Also, I still don’t understand why using orthologous genes would bias the data towards a nested hierarchy. Can you explain this?
