"Devolution" and gene loss in evolution

This discussion was in the context of Behe’s review article several years ago in which he attempts to develop a “rule” of adaptive evolution based on experimental evolution results in bacteria and viruses. I thought the topic and the comments were worth a new thread.

Dr. Gauger writes that the PNAS paper describes yeast “deleting their own genes.” When I saw her comment, I was amazed that I had missed such a significant discovery: the deletion of genes in response to selection would be huge news (like, NY Times kind of news) for a few reasons, but the most notable to me is the fact that no known organism has machinery for deletion of genes. Perhaps Dr. Gauger did not mean that the yeast actively deleted genes, but that’s what the sentence implies. So I read the PNAS paper excitedly.

It does not report that yeast delete their own genes. It doesn’t even report the much less surprising process of selection acting on yeast strains that have undergone accidental deletion of genes. And its conclusion is very different from the implication that yeast will undergo gene removal when evolving under strong selection.

The paper is actually about an interesting and incompletely resolved question about trade-offs in evolution centered on the expression of genes. The premise is that making RNA and protein incurs a cost, and so unnecessary production of either or both should be subject to selective pressure whenever the cost/benefit ratio reaches some threshold. This seems obvious, and it is certainly true in principle, but efforts to demonstrate and measure a selective effect of deletion of “optional” genes, especially in eukaryotes like yeast, had been unsuccessful before the PNAS paper was published in 2009. The authors decided to look again, in the context of mating. You may know that most fungi can reproduce sexually (mating) and asexually (budding, spore-making, or simple division). Mating is a somewhat complex process that involves a dedicated set of genes, and so it’s a good scenario for the study of trade-offs. The experimental question is this: if a yeast population is forced to reproduce asexually (without mating), will it shut off its mating genes? This, presumably, would create a competitive advantage, due to the saving (not wasting) of resources.

Their approach was straightforward. First they did an experiment to select for sterile yeast. They then took a set of these sterile mutants and asked whether they enjoyed a growth advantage over their fertile relatives. They found many that did (and some that didn’t). Then they identified the mutations in several of the most successful mutants. This is one of the great advantages of working in yeast: even almost ten years ago, there were technologies and resources readily available to identify the exact mutation of interest in a yeast strain. This is the point at which they would have found genes deleted by yeast in order to gain a competitive advantage.

Not one of the mutants had any genes deleted. The mutations were tiny changes that made dysfunctional (or non-functional) proteins. In all but one of the strains, a single gene was mutated.

The mutations didn’t delete any genes. What they did was inactivate one gene, and that change led to reduced expression of 23 other genes, none of which was deleted or even modified. I will paste the authors’ conclusion below, but the lay version is this:

Loss of gene expression can be advantageous, but only when you can turn off a bunch of genes at the same time and probably only in a large population where the very small resulting advantage can be seen by selection. This is a far cry from yeast deleting genes they don’t need, and that’s why this interesting paper wasn’t in the New York Times in 2009.

Gene loss is interesting, and it’s undoubtedly a big part of evolution. We should discuss it accurately, in the context of the full scientific literature.

Conclusion of PNAS paper:

Here, we provide evidence for a general cost of gene expression and find that elimination of the expression of 23 genes results in a 2% growth-rate advantage. Assuming that each of these genes contributes equally, the growth-rate advantage attained by eliminating a single dispensable gene is <0.1%. The fate of mutations whose selection coefficient is >1/N is dominated by selection; therefore, for population sizes greater than ≈10^3, such as panmictic microbial populations, selection will oppose unnecessary gene expression, but for small or subdivided populations, drift will dominate for all but the small fraction of strongly expressed genes. Selection for sterile strains during long-term evolution and for the GPA1-G1406T allele supports the hypothesis that selection can optimize the level of gene expression to balance the cost of protein production and the demand for protein function, and argues that proteins that do not increase fitness will be lost.
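The arithmetic behind that conclusion can be checked in a few lines. This is only a sketch using the figures quoted above (2% advantage, 23 genes, the s > 1/N rule of thumb); the function name is mine, not the authors'.

```python
# Figures quoted in the PNAS conclusion above.
total_advantage = 0.02   # 2% growth-rate advantage
genes_silenced = 23      # genes whose expression was eliminated

# The authors' stated assumption: every gene contributes equally.
per_gene_advantage = total_advantage / genes_silenced


def min_population_for_selection(s):
    """Population size N above which a selection coefficient s exceeds 1/N."""
    return 1.0 / s


print(f"per-gene advantage: {per_gene_advantage:.4%}")
print(f"N threshold at s = 0.1%: {min_population_for_selection(0.001):.0f}")
print(f"N threshold at the per-gene value: "
      f"{min_population_for_selection(per_gene_advantage):.0f}")
```

The per-gene value comes out a little under 0.1%, which is why the threshold population size lands near 10^3.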


Sorry. I was wrong. I should have checked my memory of the paper. The yeast inactivated a gene that reduced expression of 23 other genes. And the point is valid–the inactivation was only of value if multiple genes were involved. But a 2% growth advantage makes a difference.
The point remains–the cost of expression has real consequences. In the case of the yeast, changes to a single gene affected the expression of 23 others, resulting in a 2% growth-rate advantage. The authors say, “…selection can optimize the level of gene expression to balance the cost of protein production and the demand for protein function, and argues that proteins that do not increase fitness will be lost.”

Try this one. In this system two plasmid-borne tryptophan (trp) genes were overexpressed in cells that contained the rest of the trp genes on the chromosome. Cells were grown for many generations in conditions of limiting tryptophan. One of the trp genes had 2 point mutations, one completely inactivating, and the other partially inactivating. Theoretically it should have been a simple process for the cells to evolve full tryptophan synthesis, since only those two mutations were preventing tryptophan biosynthesis. But only 1 in 10^12 cells did so.

The abstract:

New functions requiring multiple mutations are thought to be evolutionarily feasible if they can be achieved by means of adaptive paths—successions of simple adaptations each involving a single mutation. The presence or absence of these adaptive paths to new function therefore constrains what can evolve. But since emerging functions may require costly over-expression to improve fitness, it is also possible for reductive (i.e., cost-cutting) mutations that eliminate over-expression to be adaptive. Consequently, the relative abundance of these kinds of adaptive paths— constructive paths leading to new function versus reductive paths that increase metabolic efficiency—is an important evolutionary constraint. To study the impact of this constraint, we observed the paths actually taken during longterm laboratory evolution of an Escherichia coli strain carrying a doubly mutated trpA gene. The presence of these two mutations prevents tryptophan biosynthesis. One of the mutations is partially inactivating, while the other is fully inactivating, thus permitting a two-step adaptive path to full tryptophan biosynthesis. Despite the theoretical existence of this short adaptive path to high fitness, multiple independent lines grown in tryptophan-limiting liquid culture failed to take it. Instead, cells consistently acquired mutations that reduced expression of the double-mutant trpA gene. Our results show that competition between reductive and constructive paths may significantly decrease the likelihood that a particular constructive path will be taken. This finding has particular significance for models of gene recruitment, since weak new functions are likely to require costly over-expression in order to improve fitness. If reductive, cost-cutting mutations are more abundant than mutations that convert or improve function, recruitment may be unlikely even in cases where a short adaptive path to a new function exists.



Thanks for sharing this. You were looking for mutations in plasmids that were overexpressing non-functional proteins? I’m not sure I follow.

I’ll try to explain. Ralph Seelke originally set out to verify how long it would take to get a double reversion. He used a plasmid that had trpA and trpB on it, a multi-copy plasmid that expresses lots of gene product. (Not relevant now, but it will be.) He introduced two single-base mutations into trpA that had previously been shown to be null mutations, call them X and Y. He put the plasmid into cells that had all the trp genes except trpA and trpB. These cells could now only grow in the presence of tryptophan in the medium. He proceeded to grow them by serial dilution every day, like Lenski’s long-term evolution experiment, only in this case the limiting nutrient was tryptophan. The cells could double about 7 times and then run out of tryptophan. So any cell that reconstituted the ability to make tryptophan would sweep the population.
Midway through, Ralph found out that one of his mutations, X, was not a null. It had a very slight level of activity. But this posed an immediate puzzle. With X having slight activity, if the cell reverted Y first (which was a true null), the cell would gain slight tryptophan biosynthesis ability from X, which should allow it to spread in the population, and then rapidly revert X, restoring wild type trp function. A two step selectable path should have happened in just a few days–bacterial cultures have large population sizes. But it hadn’t happened over several years of daily transfer.

Here’s where we came in. We already knew that X and Y singly reverted just fine. Yet the two together did not follow the path. We had multiple independent populations that had been growing long term, so we began looking at what had happened to the genes. In every line, trpA had been inactivated, either by point mutation, rearrangement, transposition, deletion or insertion of an IS element. All the lines had inactivated expression of the gene because it saved energy–they were no longer expressing useless protein (see the last line of Botstein’s paper above). The first cells to do that took over the population. And the way to the two step pathway was sealed. None of them would make tryptophan again.

There’s lots more to the story–many controls, verification of doubling times, growth advantage etc. Our take home message–overexpression of useless protein leads to a selective advantage for any cell that inactivates that expression.

There is a cost to expression. And as Botstein’s group says, it is balanced by the utility of the product. The cost may be at the level of DNA, RNA, or protein, I don’t know. However, I do know that bacteria duplicate DNA and then remove duplications quite rapidly, on the order of 10^-4. It’s a good strategy. If a duplicate proves useful it is kept. If not it is lost, probably by homologous recombination. Mira et al 2001 (cited in my paper) report that parasitic and symbiotic bacteria rapidly lose unnecessary genes. Lenski’s group has reported the loss of genes as an adaptation to growing under restricted conditions.

It is interesting that when this cost is strong enough it will prevent evolution along a selectable path. The reason–there are many more ways to inactivate than to restore a point mutation.
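To make the "many more ways to inactivate" point concrete, here is a toy calculation. The target counts are hypothetical, picked only for illustration (they are not from the paper), and it assumes equal per-site mutation rates and equal fitness gains for both kinds of change.

```python
def fraction_constructive(constructive_targets, reductive_targets):
    """Share of first beneficial hits that restore function.

    Assumes every target mutates at the same rate and every beneficial
    hit is equally likely to take over (both are simplifications).
    """
    total = constructive_targets + reductive_targets
    return constructive_targets / total


# Hypothetical target counts, chosen only for illustration.
one_reverting_base = 1        # the single base change that reverts the null
inactivating_changes = 1000   # changes that would shut off expression

share = fraction_constructive(one_reverting_base, inactivating_changes)
print(f"share of first beneficial hits that restore function: {share:.4f}")
```

With any lopsided ratio like this, the reductive path is expected to be hit first in almost every line, which is the qualitative pattern described above.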

I guess we have differing opinions on what a “simple process” is. I would call a 1 in 10^12 emergence of tryptophan biosynthesis a simple process. After all, we are talking about an enzyme that (according to the standard model) probably took millions of years to evolve in the first place in populations that would dwarf 1E12.

There is also the case of a novel β-galactosidase that evolved in E. coli (Hall 2003). While the novel enzyme wasn’t nearly as active as the one that was deleted, it was still active enough to be selected for in the presence of lactose.

As to devolution, that is a meaningless term. That’s like saying you are detraveling if your car points south for a mile while trying to reach a destination to your north. You are travelling if you are moving, and the same applies to evolution. As long as populations are changing they are evolving since evolution has no goal to begin with. The only situation that I would define as devolution is if the sequence of a genome became more and more like the genomes of their ancestors which we don’t see happening with any regularity.


I’m sorry, but you need to read the paper more carefully. The entire trpA gene was there. It just has two point mutations that needed fixing to have a fully functional enzyme.

Bacteria have a mutation rate of about 10^-8 per bp per generation. With a genome size of 4.4 million bp, and a population size of about 10^8 cells per vial (in restricted medium), any mutant base is likely to be reverted overnight. This is assuming, however, that the trpA gene is still being expressed. Which, it turns out, it isn’t.
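A back-of-the-envelope version of that estimate, using only the two numbers given in the post. It is a rough sketch: it ignores plasmid copy number and the fact that only one of the three possible substitutions at a site is the exact revertant, and the function name is just for illustration.

```python
mutation_rate_per_bp = 1e-8   # per bp per generation, as stated in the post
population_size = 1e8         # cells per vial, as stated in the post


def expected_hits_per_generation(rate_per_bp, n_cells, n_sites=1):
    """Expected number of cells per generation carrying a new mutation
    at any of n_sites specific positions."""
    return rate_per_bp * n_cells * n_sites


# One specific base, e.g. the site of mutation X or of mutation Y.
single_site_hits = expected_hits_per_generation(
    mutation_rate_per_bp, population_size)

# Both specific bases changing in the same cell in the same generation.
double_rate = mutation_rate_per_bp ** 2
double_site_hits = double_rate * population_size

print(f"expected single-site mutants per generation: {single_site_hits:.2g}")
print(f"expected simultaneous double mutants per generation: {double_site_hits:.2g}")
```

On these numbers a given single base is hit about once per generation somewhere in the vial, while a simultaneous double hit is vanishingly rare, which is why a stepwise, selectable path matters.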

I made a mistake in what I told you earlier. The observed reversion was to only the first step, not a reversion of both steps. So that means the reversion rate was 10^-12 rather than 10^-8.

"In evaluating any evolutionary scenario, it is important to distinguish between what is theoretically possible and what is actually apt to occur. Because real adaptive landscapes (i.e., real mappings of fitness onto genotype space) tend to be complex, with many possible reductive paths as well as constructive paths, it is not enough to show that the individual steps of a particular path have a selective benefit. It is also necessary to demonstrate that each step is likely to be taken, given the whole set of competing paths open to the organism. A number of authors have recognized the complexity of adaptive landscapes and have examined the impact of this on evolutionary trajectories, focusing on the importance of genetic interactions (epistasis) as a limit to adaptation [29-35]. Here we point out another potential factor—the cost of gene expression—that can significantly affect the likelihood of a simple two-step adaptive path being taken. We also highlight the effect of selective conditions on the outcome of particular evolutionary scenarios."

[quote=“T_aquaticus, post:5, topic:36902”]
As to devolution, that is a meaningless term. That’s like saying you are detraveling if your car points south for a mile while trying to reach a destination to your north. You are travelling if you are moving, and the same applies to evolution. As long as populations are changing they are evolving since evolution has no goal to begin with. The only situation that I would define as devolution is if the sequence of a genome became more and more like the genomes of their ancestors which we don’t see happening with any regularity.

Fine. Not my term. Call it reductive evolution if you want to.

Are you disagreeing with neutral theory?

I do not see how your one trial with one selection protocol with one strain of one species is sufficient to show that.

How does your failure contradict successes like this one?


I see you are not a microbiologist. This is a standard number for E. coli, and is used in this case merely to indicate that reversions of point mutations happen easily in E. coli cultures.

[quote=“RHernandez, post:7, topic:36902”]
On that subject, I noticed that the word recombination does not appear anywhere in your paper. Is not recombination a major generator of variance?


Huh? I see once again that you are not a microbiologist and have not understood the paper. Recombination with what? Actually, the only place where the term recombination might have been used would be where all the ways the trpA gene was inactivated were described. Some of them were by recombination.

No. Neutral theory says “most evolutionary changes and most of the variation within and between species is not caused by natural selection but by genetic drift of mutant alleles that are neutral.” Wikipedia

Genetic drift no doubt continued in the bacteria. But they were under strong selection for the ability to grow in minimal media. Only the ones that hit on a way to grow faster were able to outcompete their neighbors. There were two ways to do it. Eliminate expression of the trp genes and reduce costs, or recover the ability to make tryptophan. They almost universally eliminated expression. None of them recovered the ability to make tryptophan.

It showed it. So did Lang GI, Murray AW, Botstein D (2009) The cost of gene expression underlies a fitness trade-off in yeast. Proc Natl Acad Sci U S A 106: 5755-5760. doi:10.1073/pnas.0901620106

They aren’t asking the same question. They are asking what genetic changes enabled the growth of E. coli on a new carbon source, not whether overexpression could block the evolution of that ability.

You know, I can be patient most of the time but it is really hard, when someone who clearly doesn’t understand the subject matter acts like he does, and asks snarky questions, not to be snarky in return. I am officially falling off the wagon now.

Read the paper, and if you don’t understand it, don’t pretend you do.

I get that some people here are out to prove that ID researchers can’t do a decent experiment, just on principle. This paper describes experiments that could be published in any microbiology journal. In fact one of the reviewers said he thought it should be in PNAS. But given the reaction this paper faced here, you can see why we didn’t try.

Let those of you who are scientists and actually understand the paper give a valid critique and I will be happy to respond. I would be grateful, if you understand the paper, if you would also acknowledge its merits.

Pardon the interruption, but I think part of the communication problem between you and Roberto is that he is not a native English speaker.


Assuming that you are talking about this paper (which you are first author on, no need to be modest about it ;)), I have a few comments.

  1. Expression of the trp genes was way out of whack compared to native genes. “The parent plasmid pWS1 is known to direct substantial overexpression of the trpCBA genes, such that their products represent about 15% of the total soluble protein [21].” Yikes! Obviously, there is going to be very strong selection against expression of these genes in amino acid limiting conditions. If time and resources were infinite I am sure that you would have preferred to put these genes into the chromosome under native promoters that would more closely mimic the expression of genes in other de novo pathways, but a tunable or lower expression promoter on the plasmid would seem to have been a better choice than hitting it with this hammer.

  2. Tryptophan did not seem to be limiting in your liquid cultures. It is stated that liquid cultures saw the usual 6.64 generations from a 1% inoculum even in the 1 µg/ml tryptophan culture media which gets you to 100% saturation. This means that carbon is the limiting resource and not tryptophan. This would greatly reduce selection pressure for trp+ mutations and switch the selection pressure to lowering that massive trp expression (15% of total protein weight!).

With these conditions, it isn’t any wonder that selection tipped in favor of lowering expression of unneeded proteins or those that only offered a slight increase in fitness due to the fact that tryptophan was not limiting in the liquid culture.
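For anyone wondering where the 6.64 figure in point 2 comes from, it is just the number of doublings needed to regrow from a 1% inoculum to the pre-dilution density. A quick sketch (the function name is mine):

```python
import math


def generations_from_inoculum(inoculum_fraction):
    """Doublings needed to grow back to the pre-dilution density."""
    return math.log2(1.0 / inoculum_fraction)


gens_1_percent = generations_from_inoculum(0.01)
print(f"generations from a 1% inoculum: {gens_1_percent:.2f}")
```

So a culture that reaches the full 6.64 generations has regrown 100-fold, which is the basis for arguing that growth was not cut short by the tryptophan supply.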

As to the overall conclusion, I would fully agree that there are conditions under which the deletion of genes is going to be more favorable than the evolution of beneficial genes. However, this stands in contrast to the Lenski experiment where citrate was just sitting out there as a possible carbon source that no bacteria were taking advantage of. The presence of citrate did not favor gene deletion in and of itself due to a lack of substrates needed for protein production. When you are talking about amino acid de novo synthesis you are already battling two competing forces that include conservation of amino acid use in the translation of proteins.

Thanks for bringing this paper to my attention, and feel free to tear apart anything in this post if I got it wrong. Once again, you have my respect for doing the work, publishing it, and being willing to discuss it in forums like these.


It looks like they used fluctuation tests (i.e. Luria-Delbrück) to measure the mutation rates that converted the trp- single mutation clones to trp+. From what I can see, the mutation rates look fine and don’t affect the conclusions that much.

[quote=“RHernandez, post:7, topic:36902”]
I do not see how your one trial with one selection protocol with one strain of one species is sufficient to show that.

Let’s not forget that the 3 genes on the expression plasmid were under the control of such a strong promoter that they made up 15% of the total protein expression. In amino acid limiting conditions this would make the tet promoter on the plasmid the #1 target for beneficial mutations that lower protein expression. Other than ribosomal RNA, I doubt there is a gene in the E. coli genome that is more heavily expressed than these three genes in the strain they used.


I agree. If I could, I would want to see the study done from chromosomal genes with tunable promoters. The huge amount of overexpression definitely made a difference in doubling times among the different strains, and so inactivation was immediately beneficial.

My point about Lenski is that any time cells are under starvation conditions, cells that reduce expression of non-essential genes have an advantage.

BTW have you seen these? http://jb.asm.org/content/early/2016/02/10/JB.00110-16.full.pdf
Rapid Evolution of Citrate Utilization by Escherichia coli by Direct Selection Requires citT and dctA - PubMed

I would also be interested in studies where the concentration of tryptophan limited the liquid culture to an OD600 of 0.5-1.0, or thereabouts. Just a thought.

It is interesting that aerobic citrate utilization emerged after a recombination event that put a functional promoter in front of the citrate enzyme. This would seem to fit into your model.

On another note, from the Hofwegen article:

“We conclude that the rarity of the LTEE mutant was an artifact of the experimental conditions and not a unique evolutionary event. No new genetic information (novel gene function) evolved.”

Lucy pulls the football away from Charlie Brown yet again. As I stated before, if we knew every mutation that has occurred in the human lineage since the common vertebrate ancestor I doubt if some of these ID proponents would classify any of them as a gain of information. At some point you have to ask if evolution needs to produce new genetic information as defined by some ID proponents in order to get the biodiversity we see today from a common ancestor.


I do not think that “I see you are not a microbiologist” is relevant or an answer to my sincere question.

I do not think that 10E-8/bp/generation is a standard number, as everything I have seen has almost 100-fold lower frequencies. Some of these are per gene, which should be divided by 1000 to get a bp frequency.



The other point is that rates vary.

Recombination between the mutations of interest.

[quote=“agauger, post:8, topic:36902”]
No. Neutral theory says “most evolutionary changes and most of the variation within and between species is not caused by natural selection but by genetic drift of mutant alleles that are neutral.” Wikipedia

Then I do not understand why the first sentence of your paper, which I am reading carefully, is “According to standard evolutionary theory, every adaptation, even the most sophisticated, is the product of a series of simple adaptive steps.” because that does not include neutral evolution. Please explain.

I understand. I also understand that the failure to recover that ability could be caused by something you did not check or something that could be peculiar to the one system you chose or one of the mathematical errors we all make in the lab.

What I do not understand is why you think this single case is sufficient for a general conclusion. In the introduction, you say that “If these reductive (cost-cutting) mutations are sufficiently numerous, then adaptations that are theoretically possible (because the paths to construct them exist) may nonetheless be unattainable.” This seems obvious to me. Why is this a limitation of evolution?

I think it should be clear that if one failure does not establish a general conclusion, two do not either.

You referred to that question in your introduction.
“Typically, experimental studies of this process look for genes whose products are able to metabolize a new compound or replace a missing function [11-13]. But since the recruited gene product often performs its new function very poorly, it is likely to require over-expression to have selective benefit [7,11-13]. This means that the benefit comes with a metabolic cost.”

In response, I presented a successful case that is unlikely to have a high metabolic cost. I thought that you might find it of interest.

[quote=“agauger, post:8, topic:36902”]
Read the paper, and if you don’t understand it, don’t pretend you do.

I read carefully, I understand most, and for parts that are not clear, I ask questions. I do not see snark in the questions. I am not giving a critique. I am not pretending.

I agree. It is fine for the plasmid. Dr Gauger was claiming it as a general rate for bacteria.

I agree.

The paper lacks sufficient documentation to support the negative general conclusion also. For example, the paper says that mutagenesis was confirmed by sequencing. When I write that, I mean that I have sequenced the open reading frame. But I will be expressing protein from those plasmids, so I will know soon if there are other mutations that affect expression. So I do not think Dr Gauger will know if the mutagenesis made other mutations that would change expression. I also wonder about the assay they used to measure expression. Why not measure expression by northern or western blots?

Dr Gauger, those are critiques.

That is a very good point and would have been easy to do.

The Lee et al. (2012) paper, which you reference, has 1 mutation per 1,000 bacteria (1x10^-3 per genome per generation). The E. coli genome is approximately 4.5 million bases, so that would be 1 mutation per 4.5 x 10^-8 bases according to Lee et al. (2012) [added in edit: that should actually be 4.5x10^-9]. That seems to be in the same ballpark that Gauger et al. (2010) are reporting. They also seem to be using a lab strain that may have slightly higher fidelity than the wild type strain discussed in the Lee et al. paper.
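Converting a per-genome rate to a per-base rate is a division by genome size, so it is easy to check. This sketch takes the two figures in the post at face value (1x10^-3 per genome per generation, 4.5 million bases); the variable names are just for illustration.

```python
per_genome_rate = 1e-3   # mutations per genome per generation, from the post
genome_size_bp = 4.5e6   # approximate E. coli genome size, from the post

# Per-base rate is the per-genome rate spread over every base in the genome.
per_bp_rate = per_genome_rate / genome_size_bp

# Compare with the 10^-8 per bp per generation figure used earlier in the thread.
fold_difference = 1e-8 / per_bp_rate

print(f"per-bp rate: {per_bp_rate:.2e} per generation")
print(f"10^-8 is about {fold_difference:.0f}-fold higher")
```

Done this way the per-base rate comes out on the order of 10^-10, so the exponent in the post may be worth rechecking before comparing the two figures.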

Or measure expression by modern SYBR based qPCR or end point PCR. I think we need to allow for some leeway since their lab may be somewhat underfunded and underequipped. I can’t imagine it is easy setting up a modern molecular biology lab without federal funding and instead trying to get by with limited charitable donations they may have access to.

I don’t follow your last statement. Does evolution have to produce new information? Yes. But it’s a legitimate question as to what counts as new information. They qualify their use of the term to mean novel gene function, and that is correct. What was new was when the cit gene was expressed. The gene itself was not new. The promoter also was not new. It was moved into proximity of the cit gene and that is what allowed the new phenotype. It’s rearranging existing DNA.

Now if you want to consider information based on phenotypic change, then yes, there is new information.

It’s a legitimate question: what do we mean by gain of information?

The ID people I think you are referring to, who will not credit a gain of information even from the foundation of the vertebrate lineage (slight hyperbole) are probably of the genic kind. And extreme. But is a rearrangement a gain of information? In the citrate case it would definitely be a gain of function mutation IMHO. But new genetic information? Don’t know.

Thanks. You have at least some idea.

The cultures reached saturation at about 10^8 per ml as opposed to 10^9 per ml in rich medium. I don’t recall what the OD was.