A common objection to a god-of-the-gaps argument is, of course, that if we invoke a god to explain what we currently can’t explain, we run the risk of a shrinking or even disappearing god as scientific knowledge advances (e.g. the BioLogos article ‘Are gaps in scientific knowledge evidence for God?’). However, gaps in our scientific knowledge are not the same as gaps in scientific explanation, and it seems to me that in biology some explanatory gaps get larger as we discover more.
For example, by the 1950s-60s we knew the amino acid sequences of several proteins and realised that, given the number of possible amino acid sequences for a protein of typical length, the probability of finding the right sequence by chance (if a specific one is required) is prohibitively low (e.g. Wistar Symposium, 1966).
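To give a feel for the numbers involved, here is a minimal sketch of the size of sequence space. It assumes the standard 20-letter amino acid alphabet, and the lengths are chosen by me purely for illustration; the function names are my own.

```python
import math

AMINO_ACIDS = 20  # standard amino acid alphabet (assumption of this sketch)


def sequence_space(length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    return AMINO_ACIDS ** length


def log10_sequence_space(length: int) -> float:
    """Same quantity as a power of ten, which is easier to read."""
    return length * math.log10(AMINO_ACIDS)


# Illustrative lengths only
for n in (70, 150, 300):
    print(f"length {n}: about 10^{log10_sequence_space(n):.0f} sequences")
# length 70: about 10^91 sequences
# length 150: about 10^195 sequences
# length 300: about 10^390 sequences
```

This only counts the possibilities; how many of them would actually work is a separate question.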
By the 1970s it was known that proteins fold into a specific 3D conformation (or a few) that is determined by and dependent on their amino acid sequence, and is essential for their function. This has some significant implications for an evolutionary origin of proteins:
- Because a typical polypeptide needs a minimum length of about 70 amino acids in order to fold, proteins (especially enzymes) could not have started out as short polypeptides (even though this is still widely believed and often presented in textbooks).
- For folding to work, the amino acids in the middle of the protein need to fit together closely (like a 3D jigsaw) while still connected by the polypeptide backbone, which explains why at least major parts of the amino acid sequence must be fairly specific.
- As we started to learn how enzymes function, we realised that they have active sites composed of specific amino acids, which must be in the right places in the linear sequence so that they are positioned correctly in relation to each other in the folded protein. This also constrains a protein’s amino acid sequence.
During this time we also began to learn about the structure of genes and how they work. At the very least, for a functioning protein there must not only be a sequence of DNA that codes for a useful protein, but that sequence must also lie downstream of a control sequence which will, e.g., bind the enzyme that transcribes the DNA into RNA. So, from an evolutionary perspective, not only must a potentially useful protein-coding sequence arise, but a viable control region must also arise at more or less the same time and place, which clearly compounds the odds against it occurring. And we have learned that most genes are much more complex than this basic arrangement, e.g. with multiple regulatory sequences in various positions in relation to the protein-coding sequence.
We have also learned that most proteins (whether structural or enzymes) do not function in isolation but necessarily in conjunction with others, often as components of a protein complex, e.g. ATP synthase, which has at least 9 types of protein components, some in multiple copies. Wherever multiple proteins are essential for a function (and the component proteins have no independent function), the protein-coding sequence for each and every component (and its control sequences, although these can be shared) must arise independently, and yet they must all arise or come together at the same time and place. Unfortunately, this compounded improbability facing an evolutionary origin of proteins is rarely acknowledged.
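The compounding can be written as simple arithmetic. This is only a sketch: it assumes the components arise independently, as argued above, and the per-component figure is a made-up placeholder rather than a measured value.

```python
def joint_log10_probability(log10_probs):
    """log10 of the chance that several independent events all occur.

    Multiplying probabilities is the same as adding their logarithms,
    which avoids underflow when the numbers are very small.
    """
    return sum(log10_probs)


# Hypothetical: 9 component types, each with a 1-in-10^20 chance (placeholder)
components = [-20.0] * 9
print(joint_log10_probability(components))  # -180.0, i.e. 1 in 10^180
```

Whatever the true per-component figure is, the exponents add, so each extra essential component makes the joint figure smaller.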
Some may wonder whether natural selection could help. The straightforward answer is ‘no’: natural selection operates on differential fitness, but most of a protein’s sequence needs to be right before it has any function, and hence any fitness. Dawkins’ METHINKSITISLIKEAWEASEL may illustrate the principle of cumulative selection, but it does not illustrate natural selection. Specifically, it is totally inapplicable to the evolution of proteins (although I think he intended it to be applicable), because getting small parts of an amino acid sequence correct (a few letters in his sequence) will not give you a protein with any function.
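For anyone who hasn’t seen the weasel program, here is a minimal sketch of that kind of cumulative selection. The brood size, mutation rate and seed are my own assumptions, not necessarily Dawkins’ settings. Note that `score` gives credit for every single matching letter, which is exactly the feature I am saying has no counterpart in a partly correct protein sequence.

```python
import random
import string

TARGET = "METHINKSITISLIKEAWEASEL"  # the phrase as written above, without spaces
ALPHABET = string.ascii_uppercase


def score(candidate: str) -> int:
    """Count the positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(parent: str, rate: float, rng: random.Random) -> str:
    """Copy the parent, replacing each letter with a random one at the given rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(offspring: int = 100, rate: float = 0.05, seed: int = 0,
           max_generations: int = 100_000) -> int:
    """Return the number of generations needed to reach the target.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = "".join(rng.choice(ALPHABET) for _ in TARGET)
    generations = 0
    while current != TARGET and generations < max_generations:
        brood = [mutate(current, rate, rng) for _ in range(offspring)]
        # Keep the parent in the pool so the score never goes down.
        current = max(brood + [current], key=score)
        generations += 1
    return generations


print(weasel(), "generations")
```

The search converges quickly only because every partial match is rewarded and retained at each step.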
In summary, the prima facie improbability of proteins was recognised in the 1960s as a challenge to an evolutionary origin. Since then, what we have learned about proteins, genes etc. has not explained how they might have evolved despite this improbability; rather, it has exposed an even greater challenge. That is, although gaps in our knowledge have decreased, this explanatory gap has increased.
How might future scientific discoveries overcome the challenges to an evolutionary explanation for the origin of proteins? There are two possibilities for how further research might conceivably produce knowledge that overcomes the prima facie case against an evolutionary origin of proteins:
- We know that some variation in amino acid sequence is permissible. What if we found that the number of permissible variants is so high that it constitutes a significant proportion of the total possible sequences (for a given length), such that there is a realistic chance that a random search will find one that works? This seems unlikely given the constraints on a protein’s amino acid sequence, but …
- What if we found some basis that could guide an evolutionary search through sequence space, so that the search does not need to be random? It seems hard to envisage how this could work without at least something akin to fitness on which natural selection could operate, but …
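The first possibility can be put as a rough calculation: if a fraction f of all sequences of a given length works, the chance that N independent random tries hit at least one of them is 1 - (1 - f)^N. The sketch below assumes independent, uniformly random tries, and the values of f and N are placeholders I picked only to show how the result swings; they are not estimates.

```python
import math


def hit_probability(fraction: float, trials: float) -> float:
    """Chance that at least one of `trials` independent random draws works.

    Computes 1 - (1 - fraction) ** trials using log1p/expm1 so that
    very small fractions do not round to zero. Requires 0 <= fraction < 1.
    """
    return -math.expm1(trials * math.log1p(-fraction))


# Placeholder values, not estimates
print(hit_probability(1e-40, 1e30))  # tiny: far too few tries for this fraction
print(hit_probability(1e-10, 1e12))  # close to 1: tries far outnumber 1/fraction
```

So the question comes down to whether the working fraction is anywhere near the reciprocal of the number of tries realistically available.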
It seems to me that both of these are not only speculative, but that from what we do know there is a prima facie case against them. And if so, then what is the basis for thinking that proteins have arisen in an evolutionary way?
Although I introduced this post in the context of god-of-the-gaps, my primary interest in raising this issue is not to promote an argument for God but to reopen what I think is a fundamental flaw in the theory of evolution, and to question the presumption that explanatory gaps will always be closed, or even narrowed, by scientific discoveries. Sorry it’s such a long introduction, but hopefully it’ll prompt a useful discussion.