De novo evolution of Nylonase?

Thanks for the links… I had read through that study previously, but recognize I’m not adept enough to understand all the details.

I did a quick glance, and won’t have time to study it in detail for a bit… but the only discussion of “frameshift” that I could find was their acknowledgement of Ohno’s original 1984 frameshift “hypothesis” (as referenced in footnote 45).

If you have time and inclination, could you point me to the specific paragraph or sentence that affirms the frameshift mutation and/or de novo appearance of the entire enzyme family being discussed? I’m not qualified enough on reading biological items of this detail to fully know what I’m reading.

Dr. Gauger stated as much in her article on this topic…

If the nylonase enzyme did evolve from a frameshifted protein, it would genuinely be a demonstration that new proteins are easy to evolve. It would be proof positive that intelligent design advocates are wrong, that it’s not hard to get a new protein from random sequence.

From the discussion in the JBC paper linked in my previous post:

We have previously proposed that the nylon oligomer hydrolase (EII) evolved by gene duplication from the common antecedent of EII and cryptic EII′ proteins located on the same plasmid (8). However, the following two hypotheses have been proposed. (i) The EII enzyme is specified by an alternative open reading frame from a preexisting coding sequence that originally specified a 472-residue-long Arg-rich protein and a frameshift mutation in the ancestral gene, creating a gene responsible for nylon oligomer hydrolysis (45). (ii) There is a special mechanism for protecting a nonstop frame, namely a long stretch of sequence without chain-terminating base triplets, from mutations that generate the stop codons on the antisense strand, and such a mechanism enables the nonstop frame to evolve into a new functional gene (46).

So, I can’t spend a lot more time on this today but I am rediscovering the nylonase story and I think it’s pretty complicated. The JBC paper I have been discussing, linked above, argues persuasively, as near as I can tell, that the new enzyme arose by conventional modes of duplication and mutation. The authors of that paper describe the sequences that explain (for the most part) the new activity (breakdown of nylon). And they note that the new enzyme has the same overall structure (fold) as the ancestral enzyme. A straightforward frameshift, to a new protein sequence, is not the likely explanation. If that’s what Ann Gauger wrote, then she’s right.

In my post last year, I noted that the original idea by Ohno was not really a frameshift at all, but a very interesting phenomenon called overprinting. Linked in post above.

However, if ID needs de novo gene birth to be a fiction, then ID is dead. Nylonase may not be a good example (or a valid example at all), but new protein sequences are known to be birthed from non-coding sequence, and de novo gene birth is probably more common and likely than we once thought. I’ve written semi-recently about a new gene (protein) coming about by a frameshift, and there are whole excellent review articles on de novo gene birth. Links below.


When I tried digging into the nylonase story I found it highly confusing and eventually gave up.


I also have tried a few times, and I hit the same morass. I agree that it’s probably not a frameshift, but it gets muddled pretty quickly. When I wrote Adam and the Genome I was relying on the usual interpretation, which is now shown to be probably wrong. It’s almost a guarantee that a book on science will have something in it that is later shown to have a better alternate explanation, and it looks like nylonase is my example. Look for a “VENEMA ADMITS HE IS WRONG” headline coming soon to your favourite source of ID news. Though the irony here is of course that it’s just the usual new functional information arising through duplication and mutation, nothing to see here, folks…

If Adam and the Genome ever goes to a second edition I will replace the now-dubious nylonase example with another example where we have better evidence. There are many to choose from, but I’d probably use the yeast BSC4 example at present. The evidence there is very good, and not easily dismissed. Any de novo protein is a problem for Axe and ID.


Yeah but it’s still odd; the overprinting thing seems right (and interesting) and the weird conserved no-stop sequence on the antisense strand is also odd. Not a frameshift, but something strange going on.

Ask and ye shall receive. Here’s a nice quote from the end of the Intro:

Gadid AFGP presumably evolved very recently, in response to the cyclic northern hemisphere glaciation that commenced in the late Pliocene about 3 Mya. We reason it is unlikely that mutational processes could completely obscure even noncoding sequences within such a short evolutionary time such that the extant form of the AFGP nongenic ancestor should remain identifiable. We, therefore, decided to track the AFGP genotype and its homologs within the gadid phylogeny to pinpoint the ancestral DNA site of origin and reconstruct the gadid AFGP evolutionary path. Here, we report the identification of the noncoding founder sequence and the mechanism by which it gave rise to a new functional gadid AFGP gene. Our results also show that the gadid AFGP evolutionary process likely represents a rare example of the proto-ORF model of de novo gene birth (6, 17) where the noncoding founder ORF existed well before the novel gene arose.



Firstly, thank you for the link to your blog. Now I have more to enjoy reading.

Secondly, you are an extraordinarily talented writer. I laughed out loud at the broccoli GAG and the Dilbert GAA…! And it helped explain some relatively complicated items.

Thirdly, I very much appreciate you indulging my insatiable curiosity, but I realize we’re all busy. (I’m already spending far more time researching this than I should, amateur that I am.) Please don’t feel compelled to answer my inquiries right away. If you could indulge a few more points of interest, I’d be most appreciative, but I’m not in a rush. If it is after the 3 day limit I can post again if the discussion board gods permit, or continue in a private discussion. Please take your time… I love learning and being corrected where I’m misunderstanding.

Fourthly, I’ll have to beg your patience… I began my Biology/chemistry majors some 25 years ago before I dropped them for my Religious Philosophy concentration, so it is taking me some time to work through much of the specifics, even if the general concepts are pretty familiar to me.

Fifthly, I’m going to work through some of the various papers I found that argue for de novo gene formation, and would be interested in any others you can send my way… my willingness to learn is insatiable, so you can’t give me too much.



As an ID sympathizer, one clarification if I may be so bold…

Any de novo protein confirmed as arising from essentially random sequences is a problem for ID. That is why I found the nylonase case so interesting… the source of the code for that protein would have indisputably been the gobbledygook coming out of a frameshift, yet even an essentially random sequence like that could give rise to a very functional protein.
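For readers who want to see what "gobbledygook coming out of a frameshift" means mechanically, here is a minimal sketch. It assumes the standard genetic code and uses a short made-up DNA sequence (not the nylonase gene or any real gene); the function name `translate` and the example sequence are illustrative only. Inserting a single base shifts every downstream codon, so the downstream amino acids all change.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in reading frame 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy sequence, chosen only for illustration.
original = "ATGGCCAAAGAGGAATTTTGA"
# Insert one extra base after the first codon to shift the reading frame.
shifted = original[:3] + "C" + original[3:]

print(translate(original))
print(translate(shifted))
```

Note that in this toy case the original stop codon is also lost after the shift, so a real frameshifted reading frame would keep going until it happened to hit a new stop codon.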

de novo proteins, in and of themselves, are essentially predicted by ID theory… the idea being that small tinkering is insufficient to produce certain radical new functions for an organism, thus the designer would have to insert entirely new code in order to produce the new function.

It is assumed by ID, if I understand it rightly, that the intelligent designer has the capacity to design de novo proteins and insert the code for them into the genome.


Also, in hopes I’m not coming across too fastidious…

But I’m not following what the irony is to which you are referring here?

Interesting take, and not quite accurate (if I’m understanding you correctly). ID would be threatened by any de novo protein arising through natural means, whether or not the sequence was “essentially random”. So, nonrandom, noncoding sequences giving rise to de novo coding sequences that produce folded, functional proteins = problem for Axe et al. We have much evidence to support that this does indeed happen, and reasonably frequently, even if nylonase isn’t an example after all.

ID folks like Meyer say that new functional information can only arise by the action of an intelligent agent, and here they have to shoot down the nylonase story by showing evidence that the information used to provide this new function (degrading nylon) came about through natural processes (duplication and mutation of an existing gene).


Hi Daniel,

I greatly appreciate that you’re asking a lot of good questions and carefully weighing evidence. Well done, friend!

I think it’s important to be careful as we discuss and think about hypotheses, predictions and confirmations. To give an example: Suppose my scientific theory is that every human infant comes into existence through the union of a sperm and an ovum followed by implantation in a woman’s uterus, etc.

Now suppose I find a 3 month old infant crying alongside a street. There is no sign of parents, and passersby do not know how the child got there. Is that evidence against the scientific theory, in favor of a miraculous birth?

If this occurred in a village that had recently been attacked by terrorists, then retaken an hour ago by government forces after a pitched battle, you would probably think that the child was not born miraculously. Instead, the child had been put in the care of a relative who died, or the parents had died, or some other such tragedy.

In fact, you would probably expect many such orphans to be found in such a village. The carnage of history would have erased the specific origin of many orphans, although you might hope with careful investigation to learn more about the history of at least some of them.

The same may apply to de novo genes. The fact that scientists cannot necessarily identify the evolutionary path for all of them need not indicate that the ID hypothesis is to be preferred; it could simply be that scientists do not have enough information for that particular case to make a reasonable statement about the maximum likelihood estimation.

Note: I am not a biologist! I am speaking here from general knowledge about data and scientific procedures. If any biologists like @sfmatheson, @glipsnort, or @DennisVenema would like to correct or augment what I have written, please step in!

Chris Falter



I can’t speak for Axe et al, or their particular methods, but, if it is of interest, this ID sympathizer wouldn’t be particularly troubled by the nonrandom sequences you mention. The particular complex information in that case is already extant, simply untranscribed.

If I understand the scenario properly, the protein may have arisen “de novo,” but the information to produce said protein was already in the database, if this is what you mean by nonrandom. This is categorically different to me than a protein and its genetic formula literally arising from nowhere or from a truly random sequence in the gene.

In other words, they would be troubled, I imagine, if it could be demonstrated that the information in the genome that gave rise to the de novo protein had arisen “through natural” (i.e., unguided) means. If _that_ could be demonstrated, that would indeed prove detrimental.

I’ve done enough work in computer programming to consider an analogy… I’ve played computer games that have “secret hidden areas” that you cannot generally access by playing the game as finished, as the main program will never call upon those subroutines. You need to use some cheat code or find some glitch to access them, but they were there standing by, sometimes in anticipation of a future addition to the game. If some small bug or fluke lets me access those areas, I don’t ascribe the newly arising but previously unexecuted code as having arisen “de novo” in the sense being used here, as if they sprang literally from nowhere. This may or may not be exactly analogous to our newly coding gene, but I mention it simply to demonstrate the basic principle, that simply because the “end product” is new, it does not necessarily follow that the information to construct the said end product must also be “new.”

In short, though, I concur that if we could demonstrate that the information in the code arose randomly, or by unguided natural causes, then yes, that would seriously undermine the basic ID assumptions.

I completely agree with you here… and apologies if I made it sound otherwise. This particular data point gives no preference to the ID hypothesis. My only point is that their existence does not somehow rule out or otherwise disprove the ID hypothesis.

I simply observe that the idea of a designer designing de novo, novel genes and subsequent proteins is most certainly consistent with, and predicted by, the ID hypothesis. In and of itself, it cannot be proof against ID. I agree, though, that it is also entirely consistent with the larger evolutionary theory as you describe.

Unlike computer code, however, genetic information doesn’t stick around long (in evolutionary terms) if it isn’t used – it starts accumulating random changes. Front-loading doesn’t work in biology.


Quite correct, as I understand it. So to the degree that random changes are introduced before the expression of said gene, the new de novo protein would contain that degree of randomness.

Depending on the amount of original data remaining, though, it still may conceivably accomplish its intended purpose.

But I guess I am also curious why taking code from a part of the genome that is believed not to code for proteins, but then (accidentally?) transcribed, would not still be considered “essentially random.”

If I took the bits and bytes of data that encode, say, an audio recording, and “translated” them into their ASCII/text equivalent, I would get something like h^ŒxJ$£ç]dl7§’¢6>hR’6fSu, which, for my purposes, is essentially random. How would it not be the case for “transcribing” a section of the DNA that was never intended or used to code proteins? I’m not sure why we would not call that “essentially random”?
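The reinterpretation step described here is easy to try at home. The sketch below is only an illustration under stated assumptions: it fabricates a short 8-bit "audio" signal from a sine wave (a stand-in, not a real recording), then reads the same bytes as Latin-1 text. The function names and parameters are made up for the example. It shows only that the same bytes can be read under two different encodings; it makes no claim about how closely this mirrors DNA.

```python
import math


def fake_audio_bytes(n_samples=64, freq_hz=440.0, sample_rate=8000):
    """Build unsigned 8-bit samples of a sine tone (a stand-in for real audio data)."""
    samples = []
    for i in range(n_samples):
        value = 127.5 + 127.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        samples.append(min(255, int(value)))  # clamp to the valid byte range
    return bytes(samples)


def bytes_as_text(data):
    """Reinterpret raw bytes as text; Latin-1 maps every byte value to one character."""
    return data.decode("latin-1")


audio = fake_audio_bytes()
text = bytes_as_text(audio)

# Count how many of the resulting characters are ordinary letters or spaces.
readable = sum(1 for ch in text if ch.isascii() and (ch.isalpha() or ch == " "))
print(repr(text))
print(f"{readable} of {len(text)} characters are ASCII letters or spaces")
```

The text version is unreadable, yet nothing has been lost: encoding it back with Latin-1 returns exactly the original bytes.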

In his book “Only a Theory” Kenneth Miller discusses Nylonase and manages to shoot himself in the foot.
He said in the book that experiments have shown that bacteria will consistently and quickly (in a matter of weeks) develop the ability to process nylon under the right conditions. This shows that the adaptation is not particularly difficult to achieve, and it rules out a de novo appearance of a new gene, which should be rare and have, on average, a long waiting time.

Since then it has been found that many bacteria have enzymes with some limited effect on nylon and it is well within the edge of evolution for natural selection to fine tune these to have much greater specificity.

Mutation + Natural selection is a good tinkerer but a lousy innovator.

By the way, this also applies to the Cit+ trait in Lenski’s LTEE. It has now been shown that this trait will also develop consistently and quickly (in a matter of months) when the experimental conditions favour it. It took 15 years in the LTEE because the conditions were only moderately favourable for its appearance.

I concur. Examples like that of true evolution, which are observable, repeatable, and testable, I certainly and wholeheartedly embrace.

But yes, if some feature like that is set up in the gene sequence, such that given the right conditions the outcome is all but guaranteed, it must mean that it isn’t particularly difficult or improbable to achieve the said end state.



Respectfully, I think their argument is rather more nuanced. I recall Meyer’s frequent use of the qualified statement “significant amounts” regarding new functional information. I searched through “Darwin’s Doubt” and found 5 times he used the specific, qualified phrase “significant amounts” in relation to genetic information.

So, to be fair, let’s agree that he isn’t so unsophisticated as to make an argument that no information whatsoever can arise by random chance to be selected by nature.

And philosophically speaking, this seems natural, common sense, and almost self-evident to me. Of course an information system, already functional, may take on a new function by the adjustment, insertion, or deletion of small amounts of information that makes small or minor changes to the information already extant. But this is a far cry from suggesting the entire system could arise de novo or by some kind of radical change. For instance…

I recall reading the following “church bulletin blooper” when I was younger, it went something like:

  1. “The rose on the altar is for the birth of David Jones, the sin of Mr. and Mrs. Robert Jones.”

Clearly this sentence has taken on a “new function” of sorts (in this case, a comical - or terribly judgmental! - function) from the original intended, whose function was simply a statement of fact or observation:

  1. “The rose on the altar is for the birth of David Jones, the son of Mr. and Mrs. Robert Jones.”

But the “new information” needed was simply the substitution of an “i” for an “o”, something clearly not terribly unlikely and not particularly insurmountable.

I could plug this into a random letter generator, and let it mutate individual letters, and check any new words against the dictionary to find such “functional sequences”, and very conceivably get other new sentences that also had “new functions” from “new information”.

I might get one with a more macabre function:

  1. “The nose on the altar is from the birth of David Jones, the son of Mr. and Mrs. Robert Jones.”

Or one making a statement about gay marriage…

  1. “The rose on the altar is for the birth of David Jones, the son of Mr. and Mr. Robert Jones.”

Or one with a certain amount of poetry:

  1. “The rose on the altar is for the mirth of David Jones, the sun of Mr. and Mr. Robert Jones.”

In one sense, sure, all these sentences have a somewhat modified “function,” but really there isn’t a radical difference between any of the sentences themselves. They all have essentially the same structure (sentence diagram), with the small mutations making interesting changes of effect. But the minor adjustment of information merely modifies a pre-existing function, rather than in any sense making a function arise de novo.
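The "mutate a letter and check the dictionary" procedure described above can be sketched in a few lines. This is a rough illustration, not a claim about how mutation and selection work in cells: it assumes a tiny hand-picked vocabulary (`VOCAB` is hypothetical and far smaller than a real dictionary), ignores case and punctuation, and enumerates every single-letter substitution rather than sampling at random.

```python
import string

# The bulletin sentence, lowercased with punctuation removed for simplicity.
ORIGINAL = (
    "the rose on the altar is for the birth of david jones "
    "the son of mr and mrs robert jones"
)

# Hypothetical mini-dictionary: the sentence's own words plus a few neighbours.
VOCAB = {
    "the", "rose", "nose", "on", "altar", "is", "for", "birth", "mirth",
    "of", "david", "jones", "son", "sin", "sun", "mr", "mrs", "and", "robert",
}


def single_letter_variants(sentence, vocab):
    """Return every sentence reachable by one letter substitution in which
    the changed word is still found in the vocabulary."""
    words = sentence.split()
    variants = set()
    for i, word in enumerate(words):
        for pos in range(len(word)):
            for letter in string.ascii_lowercase:
                if letter == word[pos]:
                    continue  # not a change
                candidate = word[:pos] + letter + word[pos + 1:]
                if candidate in vocab:
                    new_words = words[:i] + [candidate] + words[i + 1:]
                    variants.add(" ".join(new_words))
    return variants


for variant in sorted(single_letter_variants(ORIGINAL, VOCAB)):
    print(variant)
```

With a fuller dictionary the list of one-step neighbours would be longer; the vocabulary here is deliberately small so the output stays readable.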

But this is a far cry from me getting an English Sentence of that length out of a random letter generator de novo, or anything that resembles one.

The first situation is likely. I could run it on my home computer and get similar sentences. The second would not happen de novo with the fastest computer over 15 billion years by random insertions, deletions, etc., without some kind of teleological information being inserted into the algorithm.
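To put a rough number on the second situation, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: a 27-symbol alphabet (26 letters plus space, ignoring case and punctuation), a hypothetical generator producing 10**18 strings per second, and each string drawn uniformly at random with no selection or retention of partial matches. It computes, on a log scale, the expected number of exact matches in 15 billion years.

```python
import math

SENTENCE = (
    "The rose on the altar is for the birth of David Jones, "
    "the son of Mr. and Mrs. Robert Jones."
)

ALPHABET_SIZE = 27          # assumption: 26 letters plus space
TRIALS_PER_SECOND = 1e18    # assumption: a hypothetical, very fast generator
SECONDS_PER_YEAR = 3.156e7  # approximate
YEARS = 15e9


def sentence_length(sentence):
    """Count only letters and spaces, matching the 27-symbol alphabet."""
    return sum(1 for ch in sentence if ch.isalpha() or ch == " ")


def log10_expected_hits(sentence):
    """log10 of (probability per trial) times (number of trials)."""
    n = sentence_length(sentence)
    log10_probability = -n * math.log10(ALPHABET_SIZE)
    log10_trials = math.log10(TRIALS_PER_SECOND * SECONDS_PER_YEAR * YEARS)
    return log10_probability + log10_trials


print("symbols counted:", sentence_length(SENTENCE))
print("log10(expected exact matches):", round(log10_expected_hits(SENTENCE), 1))
```

The result depends entirely on the uniform, no-selection model; it says nothing about processes that keep and build on partial matches.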

Hence, for this observer, Meyer’s contention that small amounts of information can adjust, tinker, or help organisms adapt seems perfectly reasonable, as does his contention that large, significant amounts of novel information simply don’t arise this way. The former is analogous to a few point mutations in a duplicated gene that already codes for a protein; the latter is analogous to an entirely new protein arising from entirely new de novo information that arose from a frameshift. Thus, respectfully, I don’t think the irony really exists as you see it.

No analogy is perfect, but to be useful an analogy’s conclusions should depend on the points of contact. The conclusions from this analogy seem to depend on what is different between English sentences and genetic sequences.

For instance, the example sentence doesn’t have a single wasted letter. Every word is meaningful. To more closely match DNA, we’d need to insert a few segments of apparent gibberish and a few long strings of one-letter sequences and short repeating patterns. Then include some copies of what look like meaningful sentences that have accumulated various levels of changes that render them more or less intelligible. We should also adjust to a language where there are only four letters, where most combinations of those four letters are valid words, and where most words have multiple valid spellings. This language would also need a flexible grammar in which order the word matters does not.

After making changes like this, it becomes far more likely that accumulated changes will surface new meaningful statements. In other words, once the analogy is adjusted to be more like what it pictures, its conclusion no longer follows.
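Two of the adjustments listed above (most combinations of the four letters are valid words, and most words have multiple spellings) can be checked directly against the standard genetic code. The sketch below assumes the standard codon table and, for the last calculation, a uniformly random sequence in which each codon is equally likely; real genomes have biased base composition, so the final figure is only indicative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# "Most combinations are valid words": codons that specify an amino acid.
sense_codons = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]
sense_fraction = len(sense_codons) / len(CODON_TABLE)

# "Multiple valid spellings": distinct amino acids spelled by those codons.
distinct_amino_acids = {aa for aa in CODON_TABLE.values() if aa != "*"}


def prob_no_stop(n_codons):
    """Chance that n uniformly random codons in a row contain no stop codon."""
    return sense_fraction ** n_codons


print(len(sense_codons), "of", len(CODON_TABLE), "codons are sense codons")
print(len(distinct_amino_acids), "amino acids share those spellings")
print("P(no stop in 100 random codons) =", round(prob_no_stop(100), 4))
```

By contrast, only a small fraction of random three-letter strings are English words, which is one of the places where the sentence analogy and the genetic case come apart.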
