Now I’m confused.
There are (at least) 2 bad assumptions in this first part:
A functional protein has to be 275 amino acids long.
The proteins we see in modern organisms are the required proteins for life to exist, and no others. You ignore the possibility that proteins with very different functions could support life.
Dennis Venema over at BioLogos wrote a great essay on the ability of random DNA sequences to produce function. You might want to check it out:
The peer reviewed paper can be found here. Here is the abstract:
With a little more time on my hands, I would enjoy interacting on a couple more interesting topics you raised.
In computer science, an evolutionary algorithm has three parts:
- a stochastic “creative” function that generates new variants,
- a deterministic selection function that identifies the fittest variants, and
- a propagation function that discards the least fit variants and passes the most fit to the next generation. (This is usually deterministic, but it can be stochastic.)
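A minimal sketch of those three parts in Python, assuming a toy problem (maximizing the number of 1-bits in a bit string). The names `mutate`, `fitness`, `propagate`, and `evolve`, and every parameter value, are illustrative assumptions and not taken from any particular library or from Dennett.

```python
import random

def mutate(parent, rate, rng):
    """Stochastic 'creative' function: flip each bit with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in parent]

def fitness(variant):
    """Deterministic selection function: score is the number of 1-bits."""
    return sum(variant)

def propagate(population, keep):
    """Propagation function: discard the least fit, pass on the `keep` fittest."""
    return sorted(population, key=fitness, reverse=True)[:keep]

def evolve(length=40, pop_size=20, keep=5, rate=0.02, generations=200, seed=0):
    rng = random.Random(seed)
    # Start from an all-zero population (lowest possible fitness).
    population = [[0] * length for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        survivors = propagate(population, keep)
        best_per_generation.append(fitness(survivors[0]))
        # Survivors are carried over unchanged; the rest are mutated copies.
        offspring = [mutate(rng.choice(survivors), rate, rng)
                     for _ in range(pop_size - keep)]
        population = survivors + offspring
    return best_per_generation
```

Calling `evolve()` returns the best fitness seen in each generation. Because the survivors are carried over unchanged, that value never decreases in this sketch; only the variation step is random.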
Either Dennett misunderstands evolutionary algorithms, or you misunderstand Dennett. I will leave the assessment to you, because I haven’t read Dennett.
Your probabilistic analysis of protein formation relies every bit as much on a multinomial distribution as the analogies you criticize.
So do I! Note: I classify my conclusion as faith rather than the ineluctable conclusion of scientific investigation.
The origin of first life problem remains unsolved by biology, but it does not intersect with the theory of evolution. The theory of evolution simply assumes that the first life came into existence by whatever means. The theory focuses on what has happened subsequent to that first event. Thus it is possible to trust the theory of evolution while at the same time subscribing 100% to the miraculous creation of the first organism.
If you look at the text I quoted, it originally didn’t have the clarifying text for who you were quoting. So I thought you were being unintentionally ironic.
I withdraw my complaint.
Golly, T. Just because proteins can be less than 275 AA’s doesn’t change the basal probability for those that are 275 or longer. That’s like saying that the odds of being dealt 4 aces are changed by the better odds of getting three.
And for someone regularly spouting aggressively that one’s argument must be science, your second criticism is pure speculation.
As regards Venema’s article, as far as I’m concerned, he showed extreme and sloppy ideological bias there. Let me give you an example: the Bible says, “twisting the nose produces blood” so if I twist your nose and you bleed, that proves the Bible. Seriously, his article is like that. We have no idea why random sequences poked into cell cultures 25% of the time produced improved viability, therefore evolution is true. It’s evolution-of-the-gaps! This is why many scientists need some training in philosophy.
All I’m saying is that I don’t share your faith that randomness can come up with all this. (BTW - please give me a chance to read the paper Chris linked.)
If you can win by getting 3 aces then you don’t need to get 4 aces. The same for proteins. If you can get a functional protein with fewer amino acids then the probabilities for 275 amino acids can be ignored.
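For anyone who wants the card numbers behind this exchange, here is a small calculation in Python. It assumes a 5-card hand dealt from a standard 52-card deck, which neither post specifies; the function name `p_exact_aces` is just for illustration.

```python
from fractions import Fraction
from math import comb

def p_exact_aces(k, hand_size=5, deck=52, aces=4):
    """Hypergeometric probability of exactly k aces in the hand."""
    return Fraction(comb(aces, k) * comb(deck - aces, hand_size - k),
                    comb(deck, hand_size))

p_four = p_exact_aces(4)
p_at_least_three = p_exact_aces(3) + p_exact_aces(4)

print(float(p_four))              # chance of all four aces
print(float(p_at_least_three))    # chance of three or more aces
print(p_at_least_three / p_four)  # how much likelier the looser condition is
```

Under those assumptions the odds of four aces are the same whatever else counts as a win, but "three or more" comes out 95 times as likely as "exactly four". Which of the two numbers matters depends on what the winning condition actually is.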
I am saying that your claims are speculation. You are assuming that the proteins we see in modern life are the only proteins that can lead to life. This is speculation.
Doesn’t change the fact that 25% of random proteins increased fitness which means that those proteins are functional by definition.
I don’t need faith since I have evidence.
Hi Chris. Enjoyed the paper, but let me share some thoughts.
First, their ideological bias is evident from the first sentence. They crack me up, lumping together those four areas of science. But then the focus of the paper demonstrates with utmost clarity that they don’t have a clue what people actually object to in evolution. And that, I think, is what ideological bias does to people: it prevents them from being able to listen. People are indoctrinated these days instead of educated, and these folks have apparently never talked to someone who has thoughtful objections.
Anyway, what they have done is rigorously demonstrate microevolution. I don’t dispute microevolution, and I don’t know anyone who does. It’s the “arrival of fitness” that bugs me, not how it changes over time after it arrives. Where did the first working protein of the types studied in the paper come from?
The paper is actually a very good mathematical argument for microevolution and common descent. But the aggressive triumphalism is unfounded, since they don’t know the right question. Reminds me of “42.”
I don’t find it that complex. I think Behe’s argument does partly rest on a statement of his that the CR mutations tend to revert when Chloroquine is removed. In other words, when the selectionist pressure is removed, they are less likely to survive, natural selection starts to weed them out, and the CR problem needs to be solved again from scratch. Levin does not say if the wild types with one of the mutations were from an area where CR had arisen, and these were in the process of reverting.
But let me point out that if one or the other of the mutations is common as Levin claims, that may raise different questions as to why CR did not arise sooner. I think most fundamentally, Behe makes the case that CR has arisen at approximately the rate predicted by the math. I think that’s a very profound point! Unless he’s wrong about that, Levin’s diatribe is unfounded.
I don’t think I misread him. I’m not fond of these “just so” stories. Show me the money. And that could be very difficult, I realize, because you would need to show for quite a few examples that the five AA binding site did arise from a two. Now, could that happen? Of course! But then we’re not in the realm of science. We’re in the realm of materialistic metaphysics, where it had to arise that way cuz, well, the math is too difficult for five to show up at the same time.
Paybacks are hell? Actually glad you picked up on that, cuz it was intentional.
Well, what we’re after here is what actually happened, right? Not what can happen. The math says the first protein doing X did not arise at its current length by randomness, so we can speculate on plausible mechanisms. Then we need to demonstrate one. We would need a genetic (or protein) analysis like in the paper you posted that shows much longer proteins, perhaps in Eukaryotes, doing the same basic task as shorter proteins somewhere, perhaps in Archaeans, and showing some plausible pathway from the Archaean protein to the Eukaryote. Demonstrate this for a number of proteins, and that would be something I would need to consider.
Regarding “each step”, while probabilities of individual mutations do not multiply (that’s microevolution again), probabilities of finding cooperating proteins with working binding sites in a protein complex do. Now again, there are values to add to the numerator even at this level. But it seems to me the numbers of zeroes in the denominator grow way too quickly.
This one, Chris, makes me giggle. Fact is, I am only different from genetic material of my parents by about 1e-10 (if I remember my mutation rates). The argument that I am of infinitesimal probability is (if I’ve got my logical fallacies right) a post hoc fallacy. It is also a tautology, because the odds of this keyboard I am typing on being made from these particular keys is also infinitesimal. By that argument, probability is kinda meaningless, yet we know probability works.
But to me, that’s not the right analogy. Instead, take a billion scrabble tiles and dump them on the floor. Find any meaningful sentence of, maybe, 150 letters. I think this is a great analogy to coming up with the first protein that does X. Even as you add orders of magnitude to the number of tiles you’re working with, well, good luck finding even one.
Ahhhh, thanks Chris! What a guy!
Let me add, I appreciate how you are trying to exchange views thoughtfully. Glad to try to continue as we have time.
To help focus on perhaps the most important point, if Behe’s argument is wrong, why does his math align pretty well with how frequently CR has arisen? If Levin is right, why has CR not arisen several orders of magnitude more frequently (and so, consequently, chloroquine would never have gained the traction it did as a treatment)?
Hope this is all helpful!
Then tell us how you think these proteins came about, and produce the same type of evidence that you require from other explanations for your own explanation. Remember, no “just so” stories or speculation. Let’s see the evidence.
With all due respect, you did not fully grasp the paper’s methods and conclusions. Take another look at Table 2, p. 6. You will see that the analysis by White, et al., establishes common ancestry among the following groups:
- vertebrates, sea squirts (urochordata), echinoderms (e.g., sea urchins), and hemichordates (e.g., acorn worms): common ancestor ~600MYA
- chlorophytes (members of protist kingdom) and streptophytes (land plants): common ancestor ~700MYA
- ferns and seed plants: common ancestor ~390MYA
The way I have seen advocates of microevolution define the term is that it admits some amount of evolution (e.g., antibiotic resistance) while denying common ancestry beyond narrowly constrained boundaries, such as species or perhaps genera. But this paper demonstrates common ancestry that spans beyond family…beyond order…beyond class… and beyond phyla (e.g., vertebrate humans alongside acorn worms). And even beyond the very highest taxonomic category, kingdoms (protist chlorophytes alongside plant streptophytes).
The only way I see to accept White’s results and “microevolution” simultaneously is to redefine microevolution so it’s basically the same thing as macroevolution.
This is a really good question, Marty. It also has nothing to do with the theory of evolution.
Let me repeat that, Marty, because you seem to be confusing categories within the study of biology. The theory of evolution helps us understand how life, once it existed on earth, came to have the many forms and processes that we observe today.
It is akin to the Big Bang theory in astronomy in that it assumes a starting condition and explains what happens thereafter. The starting condition in the Big Bang theory is the initial singularity. The Big Bang theory does not try to explain that starting condition; it assumes the starting condition and works from there. Likewise with the theory of evolution: it assumes a starting condition (a primitive form of life billions of years ago) and works from there.
Your concern has everything to do with the branch of biology known as origin-of-life studies.
Your concern has nothing to do with the theory of evolution.
You seem to be justifying the violation of the standards you hold other people to. Is that really where you want to go?
This is not the right analogy for working biologists, because they see strong evidence for an inheritance process that, to adopt the language metaphor, allows simple words to form, then longer words, then word combinations, then sentences, then paragraphs.
I don’t understand how you got on this track, Marty. Neither of them disagrees as to the frequency of CR appearance. Their disagreement centers on whether a probabilistic analysis of mutation frequency and other genetic mechanisms supports an evolutionary explanatory model (Levin’s position) or whether it places CR appearance beyond the edge of evolution (Behe’s position).
The key differentiator is that Behe’s position assumes the mutation events are I.I.D., whereas Levin’s incorporates survival analysis that dramatically increases the probability of co-occurrence of CR-aiding mutations. To use the Scrabble tile analogy, Behe assumes the sentence has to come together all at once. Levin, on the other hand, proposes that individual words and n-grams can form along the pathway to the sentence.
Levin provides strong evidence for his position by showing that we see the words and n-grams in the wild. The sentence did not appear out of nowhere.
EDIT: I think I misunderstood something that you wrote, Marty, so I want to try as much as possible to clarify. While you are genuinely skeptical about the claim that anything short of an intervention by an intelligent designer could have created the first cell, you are also skeptical of the claim that the mechanisms described by the theory of evolution could account for the appearance of complex proteins later in history. And that was what you were trying to communicate. Do I have that right?
Let’s see what phylostratigraphy can turn up. It might take a little while, though.
Dipping my feet into the waters quite tentatively here, because I can’t pretend to have followed everything so far. But I am fairly sure, as I understand Marty’s position, that he doesn’t actually deny common ancestry, necessarily. He’s just not sure that random mutations and natural selection will get us to macroevolution. I myself have held this position in the past, so I’m sympathetic to it. The idea is, yes of course, small changes add up to big ones, and yes of course, the tree of life by common ancestry is fairly hard to deny once you get into the details, but the only remaining question really is, did God himself introduce those little changes because random mechanisms couldn’t really have done the heavy lifting, or did they really come about through apparently scientifically random processes?
Marty, correct me if I’m wrong.
My best to all,
It’s always a good morning (or afternoon or evening) when you post.
That’s not how I would summarize the theory of evolution. It has a strong non-random component (natural selection) for one thing. And stochastic is a far better term than random to describe changes in DNA and environment.
I mention these things because the word random is theological red meat to the dedicated Calvinist. Why get drawn into a theological maelstrom when you can avoid it with careful terminology?
We have observed the natural process of mutagenesis producing all the types of differences that we see between the genomes of different species. This includes indels, substitutions, insertion of mobile elements, and small to large recombination events. The real question is why someone would say that any difference between two genomes could not be produced by the known mechanisms of mutagenesis, or why these changes would not accumulate with each generation.
Protein length/size and evolution is fairly well covered in the literature. A “plausible pathway” across a billion years isn’t necessary to see interesting trends across phylogeny in protein structure, including size. A couple of papers below, to illustrate the kinds of analyses that have been done.
Thanks so much for these excellent articles, Steve! They provide a great overview of the kinds of selection pressures on proteins in different kingdoms, and how archaic proteins differed from modern proteins. Great stuff.
I think that Marty is looking for a study of a protein that appears in many higher taxa, and whose evolutionary history can be reasonably inferred. I understand that there is not enough information to identify the exact sequence in which changes occurred. However, if some kind of phylogeny across a broad set of taxa could be created based on protein structure, that would be extremely helpful. I don’t know if any such studies have been done, or could even be done given our current data and tools.
Look for papers on phylostratigraphy. That will launch you (or Marty) pretty well, I think.
Thanks so much, Stephen!
“Phylostratigraphy” led me to “de novo”, where phylostratigraphy is one of the 2 main approaches for identifying novel genes. The other approach is to identify a transcribed gene under selection pressure that has synteny with near-equivalent non-coding regions of multiple other species. Instances identified by the second approach provide very strong evidence of relatively recent mutation(s) that activate the non-coding region (it is now transcribed, when before it was not) and yield functionality (demonstrated by the transcription and selection pressure).
I hope I explained that correctly. I would be happy to have a well-trained biologist clarify anything that needs clarification!
Anyway, here are some de novo genes that code for proteins:
- CLLU1 (121 codons), C22orf45 (159 codons), and DNAH10OS (163 codons) - Knowles and McLysaght, 2009
- YHR180W-A (61 codons) and YGL165C (173 codons) - Wu and Knudson, 2018. Note that these resulted from DNA shuffling of two widely separated non-coding regions.
- Lu, Leu, and Lin (2017) identified 10 de novo protein-coding genes under selection pressure in yeast (see Table 1).
Food for thought.
Hi @AMWolfe. Yes that’s a good summary.
We’re getting there! And I really appreciate the continued engagement.
Yes, but mutation is random, within the laws of physics and appropriate distributions. Without mutations, nothing new arises on which Selection can work. So it seems to me that this foundation of development is fundamentally random.
Given the known selectionist mechanisms which are not properly “random”, functional sequences must still arise to be selected. Can Mutation, which is fundamentally random, provide enough information to produce the complexity we see in the biosphere?
I’ll be checking those papers out. Thanks!
But I would still like to get a good answer to these questions:
Hope all is going well for you today!
I guess I didn’t explain my analysis clearly enough. I’ll try again.
Here’s a summary of Behe’s argument, per my consumption of his presentations in books, videos, and articles:
1. The empirically observed frequency of mutations contributing to chloroquine resistance in Falciparum is 10^-20 per cell.
2. An ensemble of 5 such mutations have been observed in the genome of chloroquine resistant Falciparum, a single-cell organism.
3. The occurrence of those mutations is I.I.D.
4. Therefore the probability of the co-occurrence of those 5 mutations is the multiplicative product of a single occurrence, i.e., 10^-100.
5. 10^-100 is beyond the edge of evolution with respect to Falciparum in the time window of our data.
6. Phenomena that cannot be explained by scientific theory (i.e., are beyond the edge of scientific explanation) are the result of intelligent design.
7. Therefore we must consider the co-occurrence of the mutations to be the result of intelligent design not influenced by any factors known to the theory of evolution.
You seem to be arguing that because Behe was correct about point #1, he must also be right when he gets to point #7. But #7 does not follow from #1 without first passing through #2 - #6. Does that make sense?
Also, point #6 seems ripe for argument. However, I have not addressed it in this thread, nor do I need to. The argument fails before it gets that far, as I will explain again.
While some critics may have argued that Behe’s empirical observation (point #1) is incorrectly derived, Levin does not make that argument. He argues instead that Behe’s probability estimation of the co-occurrence of 5 mutations (point #4) is fundamentally flawed because Behe’s assumption of I.I.D. (point #3) is both illogical and contrary to the evidence.
Let’s examine those twin aspects in more detail.
1. Behe’s assumption of I.I.D. is illogical. To the extent any of these mutations provide some benefit to a Falciparum population, the mutation would tend to persist in the population’s genome. Thus the 5 mutations would not be I.I.D.–the forces of natural selection would strongly influence the survival of any of the mutations in the genome once it occurs.
2. Behe’s assumption of I.I.D. is contrary to the evidence. Levin cites the observation of individual mutations in Falciparum sub-populations as strong evidence that any individual CR-related mutation tends to survive in the Falciparum genome. At the risk of redundancy, I again point out that persistence of a mutation removes the assumption of I.I.D. with respect to the ensemble of mutations.
Once the assumption of I.I.D. is correctly removed, the probability of the co-occurrence of an ensemble of 5 CR mutations is well inside “the edge of evolution” with respect to Falciparum in the time window of our data.
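To make the effect of dropping the I.I.D. assumption concrete, here is a rough sketch in Python. It compares two idealized extremes: all mutations must arise together in a single trial, versus mutations acquired one at a time and kept once present. The function names, the notion of a "trial", and the perfect-retention assumption are illustrative only; neither Behe nor Levin states the model this way, and the inputs are simply the figures from the summary above, not independent estimates.

```python
from fractions import Fraction

def trials_if_simultaneous(p, k):
    """Expected trials until k independent mutations, each with per-trial
    probability p, all arise together in the same trial."""
    return 1 / (p ** k)

def trials_if_retained(p, k):
    """Expected trials when mutations are acquired one at a time and each,
    once present, is never lost (assumption: perfect retention, fixed order)."""
    return k / p

p = Fraction(1, 10**20)  # per-cell frequency from point 1 of the summary
k = 5                    # number of mutations from point 2 of the summary

print(trials_if_simultaneous(p, k))
print(trials_if_retained(p, k))
```

With these inputs the first figure is 10^100 trials and the second is 5 x 10^20. A real population would presumably fall somewhere between the two extremes, depending on how strongly each intermediate mutation is favored or disfavored, which is exactly the point under dispute.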
I hope that this is sufficiently clear to help our conversation keep moving forward. If it is not, please do not hesitate to work with me to improve it.
I need to kick this out the door cuz I’ve spent too many hours on it already. Hoping it’s clear and it helps!
Your big beef with Behe seems to be around I.I.D., that Behe assumes it, and that it is contrary to the evidence. But it seems to me Behe goes to great lengths to demonstrate that, within an order of approximation, the mutations leading to Chloroquine Resistance (CR) are I.I.D., though he doesn’t use that term. If that was not the case, CR should have arisen more frequently. Don’t miss that essential point: if Levin is right, CR should have arisen more frequently. Thus the individual mutations must not be neutral or favored, but at some level, however slight, disfavored.
I think you mentioned that you have not read this book, and we should not focus on what Behe does not say. So as part of this post, I’m going to briefly summarize Behe in Edge as I see it, and comment as it goes. All quotes are from Edge.
“The results of modern DNA sequencing experiments, undreamed of by nineteenth-century scientists like Charles Darwin, show that some distantly related organisms share apparently arbitrary features of their genes that seem to have no explanation other than that they were inherited from a distant common ancestor. Second, there’s also great evidence that random mutation paired with natural selection can modify life in important ways. **Third, however, there is strong evidence that random mutation is extremely limited.**” Kindle Locations 83-86
I bolded that last sentence because that’s what the book is mostly about.
I would characterize his first points in the book as these:
The mutation rate at each needed location for CR is independently e-10. If those mutations are independently less viable than the original, the calculated frequency of the pair of mutations contributing to chloroquine resistance is e-20 per Falciparum. Predictions for other drugs requiring a single mutation are still e-10.
The actual appearance of CR is, to a best approximation, within an order of magnitude of e-20. For drugs which targeted a single mutation, resistance arose very quickly.
Therefore the approach of using a mathematical model to calculate the likelihood of a set of changes of this type has some validity whenever you’re really doing a random search (which is the case when the individual mutations are not preferred). Falciparum provides an excellent case study.
So returning to I.I.D., point three here is arguing that since the two mutation scenario came out close to the math, the I.I.D. of the mutations is therefore reasonable, plus-minus. As always, the real world does not match I.I.D. exactly. But it’s within an order of magnitude, and given numbers like e-10, then e-1 is almost even. To avoid CR mutations being I.I.D., each required mutation needs to be neutral or better (it cannot reduce the odds of survival). Behe acknowledges that and points out that in the absence of Chloroquine these mutations appear to diminish, implying again that those mutations are, most likely, at least slightly unfavorable independently and in the absence of Chloroquine.
Then he applies this mathematical argument to protein binding sites because:
“As former president of the National Academy of Sciences Bruce Alberts remarked: … the chemistry that makes life possible is much more elaborate and sophisticated than anything we students had ever considered …. [I]nstead of a cell dominated by randomly colliding individual protein molecules, we now know that nearly every major process in a cell is carried out by assemblies of 10 or more protein molecules. And, as it carries out its biological functions, each of these protein assemblies interacts with several other large complexes of proteins. Indeed, the entire cell can be viewed as a factory that contains an elaborate network of interlocking assembly lines, each of which is composed of a set of large protein machines.” Kindle Locations 1991-1998
He doesn’t bother taking on “the arrival of fitness” in proteins themselves (the math for which I sketched above). He focuses on the binding sites for the complexes, which are much smaller, typically 4 or 5 significant Amino Acids (AAs). All these protein binding sites need to arise “by accident” to produce these protein complexes.
Someone might now argue “but binding could have started with only a couple of AAs and evolved better fitness over time.” Well, let’s look at that deeper. If you start with two AAs, that binding space is only 400 “binding shapes”, and there are way too many proteins and complexes for two AAs to be the norm. And as soon as you go to three, the probability of finding it by chance gets really low, and until it binds and produces a meaningful function evolution will not refine it. Also you need the “opposite” or “matching” binding site shape on the protein it binds to, so immediately you’ve got six AAs that need to be right. And thousands of times this needs to happen with a minimum of binding site AAs per protein at the right place in the protein, and the binding site needs to connect with the right other proteins that allow the complex to form in the right shape for the overall function of that unit, and that complex needs to fit into the assembly line of functions with other complexes. Remember, until you have a working binding site at all, it is probably just another random polypeptide.
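The counts in the paragraph above can be checked with a few lines of Python. This assumes the 20 standard amino acids and that every position can vary independently; the name `sequence_space` is just for illustration.

```python
AMINO_ACIDS = 20  # assumption: the 20 standard proteinogenic amino acids

def sequence_space(n_positions, alphabet=AMINO_ACIDS):
    """Number of distinct sequences when each position is chosen independently."""
    return alphabet ** n_positions

for n in (2, 3, 6):
    print(n, sequence_space(n))
```

That gives 400 combinations for two positions, 8,000 for three, and 64,000,000 for six. Note that this only counts sequences; it says nothing about how many of them would actually bind or function, which is the part of the argument that is in dispute.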
The above paragraph certainly shows the difficulty in origins of life. But it also shows the problem of trying by random mutation to come up with binding sites for any de novo protein/complex/pathway. So I’m skeptical that that happened at the level it would need to happen. The point for me is that the more I dig into this, the more the probability dwindles away until it becomes, to me, absurd to “believe in it.”
Since improbabilities multiply, all the cells that have ever lived don’t search enough protein space by random search to produce a single complete set of binding sites for one protein complex. Running the math, Behe concludes:
“In short, complexes of just three or more different proteins are beyond the edge of evolution. They are lost in shape space. And the great majority of proteins in the cell work in complexes of six or more. Far beyond that edge.” Kindle Locations 2159-2161
Also FWIW: “Workers at the University of Georgia estimate that e+30 single-celled organisms are produced every year; over the billion-year-plus history of the earth, the total number of cells that have existed may be close to e+40.” Kindle Locations 2426-2427
Another perspective (not in Edge) would be to calculate using only all the multi-cellular critters that have ever lived. A good discussion of the number is here https://www.quora.com/How-many-animals-have-ever-lived and the highest estimates come out around e+29. Behe mentions this somewhere in the book, but I can’t find it. I agree with him that there just weren’t enough “generations times population size” of animals to accidentally (that is, by mutation randomly searching for useful things) produce all the binding sites for all the protein complexes of all the multi-cellular species that have existed. This is to say nothing of coming up with the proteins themselves that work together in the complexes!
Moving on to this point, Chris:
I don’t see that as Behe’s argument in Edge. It’s just that the processes we know about are not sufficient to produce the complexity we see. As a first order argument, Behe cannot see how the processes we know about are adequate. There must be something more. We can then talk about potential solutions, and everyone has faith regarding what that “something more” might be.
Then, for those of us who have also recognized the more likely divine intervention in the creation and fine tuning of the universe for life, the creation of first life, and the exceptionalism of humans, recognizing divine intervention in evolution is not difficult philosophically (it is more invasive, in that it requires regular intervention, but that is how he works in our lives today). I think the math provides a compelling argument that evolution so desperately needed help. Every answer as to how all that "help" got provided is, at this time, a faith position.
“Could” scientists yet discover a way the odds were massively improved? A lot of people have faith in that! I don’t believe in it. That’s just where I am on this. You’d need, not just some example of one thing that happened to work, but at least a couple of systematic principles that brought the odds back astronomically in multiple places. Right now the odds of mutation doing all this on its own are IMO truly unimaginable. So I think evolution needed help.
Marty, while the math is beyond me, I appreciate your well thought out response.
While I appreciate the application of this example of chloroquine resistance to the greater evolutionary process, and accept the complexity and difficulty in evolutionary processes overall, if you say that chloroquine resistance is an example of intelligent design, does that then mean that the intelligent designer (God) is actively and currently designing pathogens better able to kill and cause disease and suffering? How do we handle that philosophically and theologically? Personally, I am more comfortable in accepting our ignorance.