I made them up, and they have no bearing on the actual generation model for the dataset. Each of the examples I gave above shows how the dataset is actually generated (a DAG and random sampling), followed by the subjective phylogenetic tree that is derived from that dataset. To create the phylogenetic tree I run the UPGMA algorithm on the dataset.
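For readers unfamiliar with UPGMA, here is a minimal sketch of the clustering step: repeatedly merge the two closest clusters and average their distances to everything else, weighted by cluster size. The function name `upgma`, the pairwise-distance dictionary, and the nested-tuple output are assumptions for illustration, not the script used for the examples above.

```python
from itertools import combinations

def upgma(labels, dist):
    """Cluster labels into a nested-tuple tree.

    dist[(a, b)] is the distance between labels a and b, where a comes
    before b in `labels` (an assumed input layout for this sketch).
    """
    trees = {i: lab for i, lab in enumerate(labels)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                     # cluster id -> leaf count
    d = {(i, j): dist[(labels[i], labels[j])]
         for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(trees) > 1:
        i, j = min(d, key=d.get)  # closest pair of clusters
        new = next_id
        next_id += 1
        for k in trees:
            if k in (i, j):
                continue
            dik = d.pop((min(i, k), max(i, k)))
            djk = d.pop((min(j, k), max(j, k)))
            # Size-weighted average distance to the merged cluster.
            d[(k, new)] = (sizes[i] * dik + sizes[j] * djk) / (sizes[i] + sizes[j])
        del d[(i, j)]
        trees[new] = (trees.pop(i), trees.pop(j))
        sizes[new] = sizes.pop(i) + sizes.pop(j)
    return next(iter(trees.values()))
```

Note that UPGMA returns a tree for any distance matrix, whatever process generated the data.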
I am happy to explain anything I state in further detail.
Pretty much. It is well known that the scientific method does not constitute a rigorous logical proof.
It is based on what is reasonable. When a procedure gives the same results a thousand times then it is reasonable to believe that under the same conditions it will give the same results the next time also.
And no matter how much you point out that this is not logically rigorous it will not change the fact that expecting a different result is unreasonable.
Yes, that is the basis of scientific honesty, which is a vast improvement over the rhetoric of lawyers and salesmen who seek to prove their proposition by any argument they can make sound good.
And the result of many such tests is a bit more than just affirming the consequent based on one conditional A->B. It is more that every test from every conditional you can think of gives the same answer.
All these consequents are tested and verified to be the case, and there is not even one case where a consequent was found to be false. Then in science we think it is reasonable to conclude that A is the case, though we may say something like… to the best of our knowledge A seems to be the case. Even so, we keep on thinking up new tests and trying those too.
Here’s a simple counterexample to alignment. Let’s say we have the four taxa:
Different letter in each position, so number of characters is 8.
You can see the taxa are created by cycling through the GATC letters, not evolution. But, we can run an alignment algorithm to create a tree of 6 steps:
In this case, the CI is 8/6 = 1 1/3, which is even greater than 1. We can retroactively say there are only 6 characters, due to the alignment derived, and bring the CI down to 1. In either case, we have a non-evolutionary process that generated taxa with a perfect, or even better than perfect, CI score using an alignment-based approach.
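The arithmetic above can be checked with a short sketch. It follows the post in using the character count as the numerator; the usual definition of CI divides the minimum number of steps the characters could require by the steps on the tree, so treat the first argument as whichever of those you intend. The function name is hypothetical.

```python
from fractions import Fraction

def consistency_index(min_steps, tree_steps):
    """Return CI as an exact fraction: min_steps / tree_steps."""
    return Fraction(min_steps, tree_steps)

print(consistency_index(8, 6))  # 8 characters, 6 steps -> 4/3
print(consistency_index(6, 6))  # 6 characters, 6 steps -> 1
```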
Well, I am still working on getting PAUP software to work. On the face of things, it seems trivial to produce high CI trees from purely random sequences, since I can align the random sequences with edit distance, gap anything that doesn’t align, and then build trees from what remains. But, still gotta unravel the NEX file format to be able to test this out.
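The edit distance mentioned above is, in its usual form, the Levenshtein distance: the fewest single-letter insertions, deletions, and substitutions needed to turn one sequence into another. A minimal sketch of the standard dynamic-programming version follows; it illustrates the measure only and is not the alignment procedure ClustalW uses.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]
```

For example, edit_distance("GATC", "GTC") counts the single deleted letter.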
And success! I created 20 random 80-character sequences, sampling uniformly over GATC. Here is the Python script.
    from random import randint

    strlen = 80   # letters per sequence
    num_seq = 20  # number of sequences

    # Print each sequence on its own line, sampling uniformly over GATC.
    for i in range(num_seq):
        print("".join(["GATC"[randint(0, 3)] for _ in range(strlen)]))
I then performed multiple sequence alignment on them with ClustalW, which produced the following alignments. When I use the PAUP software to create trees, the trees consistently get a CI score of 0.32-0.33, which is well above the threshold in the above chart, and ventures into published CI values.
So, since I can generate a CI indicative of phylogenetic signal from random datasets, the phylogenetic signal cannot tell us whether a dataset exhibits common descent.
The phylogenetic signal argument for evolutionary common descent should be retracted until it is revisited with much more rigorous controls.
I used the PAUP software downloadable here: http://phylosolutions.com/paup-test/
I used the online ClustalW here: https://www.genome.jp/tools-bin/clustalw
You can just paste my below sequences into ClustalW and it’ll perform the alignment. Using PAUP is more complicated, since you have to mess with the NEX format. Once you do that, push the ‘Trees->Generate Trees’ menu option and push ‘OK’, and then push the ‘Trees->Describe Trees’ menu option and pick any tree and push ‘Describe’. It’ll print out something like:
For anyone wishing to reproduce my results, here is my original dataset in FASTA format.
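For reference, a FASTA record is a header line starting with ">" followed by the sequence. The sketch below writes fresh random sequences in that layout. The `seq_N` labels, the function name, and the seeded generator are assumptions for illustration, so its output is a comparable dataset rather than the original one.

```python
import random

def random_fasta(num_seq=20, strlen=80, seed=None):
    """Return num_seq uniform random GATC sequences as FASTA text."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    records = []
    for i in range(num_seq):
        seq = "".join(rng.choice("GATC") for _ in range(strlen))
        records.append(f">seq_{i + 1}\n{seq}")
    return "\n".join(records) + "\n"
```

The returned text can be pasted directly into the ClustalW input box.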
And here is the NEX file. You can use this verbatim with PAUP to get my results.
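If you want to build a NEX file from your own ClustalW alignment, the sketch below wraps aligned sequences in a basic NEXUS data block. The function name and the dictionary input are assumptions for illustration, it assumes taxon names contain no spaces, and it emits only the data block, not any PAUP commands.

```python
def to_nexus(aligned):
    """Format {taxon name: aligned sequence} as a NEXUS data block."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    nchar = lengths.pop()
    width = max(len(name) for name in aligned)
    rows = "\n".join(f"    {name.ljust(width)}  {seq}"
                     for name, seq in aligned.items())
    return (
        "#NEXUS\n"
        "begin data;\n"
        f"  dimensions ntax={len(aligned)} nchar={nchar};\n"
        "  format datatype=dna missing=? gap=-;\n"
        "  matrix\n"
        f"{rows}\n"
        "  ;\n"
        "end;\n"
    )
```

The gap symbol "-" matches what ClustalW writes for unaligned positions.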
@T_aquaticus @Chris_Falter, here’s an example of sequence-level analysis of the DAG using the standard alignment and tree generation tools ClustalW and PAUP. I modified my previous DAG experiment script to replace the gene ID numbers with random sequences of 20-30 letters drawn from GATC. I then process the resulting sequences with ClustalW and then PAUP to generate trees and calculate the corresponding CI.
For 10 taxa I get a CI of 0.68 and for 26 taxa I get a CI of 0.48. These are well within the values achieved in studies of actual data (see the following chart), which again shows that high CI values do not tell us whether the dataset is the result of common descent.
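To make the setup concrete, here is a rough sketch of one way such a DAG dataset could be generated. Only the random 20-30 letter GATC genes come from the description above. The inheritance rule (each new taxon draws one or two earlier taxa as parents, takes all of their genes, and adds one new gene), the `taxon_N` labels, and the function names are assumptions for illustration, not the script behind the reported CI values.

```python
import random

def random_gene(rng, lo=20, hi=30):
    """A random GATC string of lo to hi letters."""
    return "".join(rng.choice("GATC") for _ in range(rng.randint(lo, hi)))

def dag_taxa(num_taxa=10, max_parents=2, seed=None):
    """Return {label: sequence} for taxa generated along a DAG."""
    rng = random.Random(seed)
    taxa = [[random_gene(rng)]]  # the root taxon holds a single gene
    for _ in range(1, num_taxa):
        # Assumed rule: pick 1..max_parents earlier taxa as parents.
        k = rng.randint(1, min(max_parents, len(taxa)))
        genes = []
        for parent in rng.sample(taxa, k):
            for gene in parent:
                if gene not in genes:  # keep one copy of a shared gene
                    genes.append(gene)
        genes.append(random_gene(rng))  # each taxon also gets a new gene
        taxa.append(genes)
    return {f"taxon_{i + 1}": "".join(g) for i, g in enumerate(taxa)}
```

Because a taxon may have two parents, the result is a DAG rather than a tree.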
Thanks. I added an outgroup random sequence and that appears to have increased the CI score.
A number of weeks ago I stated that a DAG will generate a phylogenetic signal just as well as an evolutionary tree. You requested I conduct this analysis at the nucleotide level to demonstrate the claim, and I have done so. At this point, I believe I have made my case.
I respectfully request that this forum decline to use the phylogenetic signal as an argument for evolution, at least until it is rigorously demonstrated how to eliminate DAGs as an alternative model for real-world datasets.
I believe this careful approach fits well with the Biologos mission, since we want to show religion is consistent with good science, not with anything whatsoever cast under the heading of ‘science’. As such, if we happen upon a purported piece of evidence for evolution that does not hold up under scrutiny, it is best to lay it to the side until we reach a more rigorous demonstration of the evidence’s veracity. This is important, because if we wantonly associate ourselves with anything and everything called ‘science’ we may associate the Biologos brand with something that becomes publicly discredited, which will in turn cast a shadow on the Biologos brand.