Information = Entropy

You appear to be correct.

@NonlinOrg you have a very intuitive understanding of information. It just does not match the technical definition.

And if we are talking about communicating “across channels” messages that will be “understood by parties”, we are certainly not talking about DNA or any of the machinery in a cell. Nothing at this level “understands” anything, because molecules do not have minds. The definition you have bears no resemblance to biology at a molecular level. I thought we were talking about biological information, but clearly you are not.

Are you saying that any noise is information? Why don’t you look up any definition of ‘information’ or ‘to inform’? Yes, information must have a meaning to the receiver to be more than noise.
Give the Gettysburg Address to a village in China and ask them what information they got out of it. Also what do you think “it’s Greek to me” means?
At a minimum you should have some doubt about who is confused here.

Not my definition. If you have a different one I would like to see it.

In the technical sense, yes, the sending and receiving machines need not understand the message, and many times one would send a "random" message to test the losses in the comm channel. But then the received message is compared against a standard to check for accuracy. So yes, even that "random" message has a meaning: a 1-bit error means something and a 10-bit error means something else.
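To make that concrete, here is a minimal sketch of such a channel test (the 1000-bit pattern and the simulated 1% bit-flip channel are my own illustrative choices, not anyone's actual setup):

```python
import random

# Known "random" test pattern that both ends agree on in advance.
reference = [random.getrandbits(1) for _ in range(1000)]

# Simulated noisy channel that flips roughly 1% of the bits in transit.
received = [bit ^ (random.random() < 0.01) for bit in reference]

# Compare against the reference copy: the error count is the meaningful result.
bit_errors = sum(r != b for r, b in zip(received, reference))
print(f"{bit_errors} bit errors in {len(reference)} bits")
```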

Without getting into deep philosophical discussions about intelligence, mind, and understanding, let’s just say DNA “understands” the same way your TV “understands” the broadcast signal.

We most definitely are. In fact it’s striking how much biology has in common with human programming and telecommunication (you’re quoting Shannon, aren’t you?)

So if an army intercepts an encrypted message from the enemy they discard it, right?


Glad you asked for a clarification. I am saying that it is impossible to distinguish between information and noise. I could write a program that uses the interval between Geiger counter particle detections as an index into a dictionary of words, and then write those words continuously into tweets every time the program accumulates 120 characters. My Twitter followers might be amazed at some of the fascinating insights my account spouts from time to time! But it’s really noise, even if it looks like information.
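Roughly like this, if anyone wants the flavor of it (a toy sketch; the word list is made up, and a pseudo-random generator stands in for the Geiger counter):

```python
import random

WORDS = ["nature", "accepts", "total", "joy", "species", "observes",
         "unfolds", "beyond", "the", "of", "is", "future"]

def fake_geiger_interval() -> float:
    # Stand-in for the time between particle detections (exponentially distributed).
    return random.expovariate(1.0)

def noise_tweet(limit: int = 120) -> str:
    # Use each interval as an index into the word list until 120 characters accumulate.
    words = []
    while len(" ".join(words)) < limit:
        index = int(fake_geiger_interval() * 1000) % len(WORDS)
        words.append(WORDS[index])
    return " ".join(words)[:limit]

print(noise_tweet())  # reads vaguely profound; it is pure noise
```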

Conversely, the Gettysburg Address is really information, even if Chinese villagers think it is noise.

This peer-reviewed article (“On the Reception and Detection of Pseudo-Profound Bullshit”) illustrates the point quite amusingly.

This was just reported by the Wisdom of Chopra site:

“Nature undertakes total acceptance of observations”

And now this:

“The future is beyond species specific joy”

Great information, dontchaknow? :wink:

Which is to say, then, that understanding has nothing to do with the information we find in DNA.

DNA certainly does not understand the information encoded in it. Neither does a TV understand anything of the broadcast signal. In the same way, a piece of paper understands nothing of the grocery list written on it. Neither does the pen that was used to write the list. Neither does a hard drive or an ethernet port understand this conversation we are having.

Yet all these things convey information. Understanding has nothing to do with information.

The meaning is also called “semantics”.

This "random" data has a meaning assigned by you. It is a piece of data you want to see reliably transmitted to check the accuracy of a communication channel.

There is some irony here in your prior statement. Remember this exchange?

So, apparently one can assign semantic meaning to random data at will. And it can be transmitted even if you are the only party that understands its significance and how it should be interpreted.

This is the issue though, @NonlinOrg. Who can actually quantify the amount of meaning or information there is in a message without understanding it? This is impossible. We never know if what looks like random noise to us is actually meaningful and important information to someone else.

DNA is certainly not designed to communicate with human scientists. We cannot even process short stretches of it in our brain, and have to rely on computer software at every step even to begin to think about it.

So, here is the real question, the real task we are faced with. I can give you two sequences:

  1. a sequence of DNA that is totally random (and I will not tell you how I generated it)
  2. a sequence of DNA of the same length that encodes a biologically important function.

We can extend this further. I can give you as many pairs of examples as you like (though not infinite =).

Please tell me how you will determine the amount of information in each of these DNA sequences. Do you expect to be able to easily tell the difference between the two? How would you quantify the amount of information in them? Can you quantify the amount of order? Would you be able to determine which one was random vs. functional? What mathematical formula or algorithm would you use?

My point is that it is trivial for me to give you sequences where:

  1. Neither you nor experts in the field would be able to tell the difference between the random and functional sequences.
  2. Neither could anyone write a piece of software that could tell the difference.
  3. Neither could anyone design a biological experiment to discriminate the two sequences.
  4. Neither could anyone even compute the true entropy of these sequences. Only because I generated the random sequence can I tell you its true entropy.
  5. Neither can anyone even compute the true information content of these sequences. Because I know which one encodes the biological function and which one is random, you would have to give me different numbers for each sequence.

If you think I am wrong, you can always take me up on the challenge. We can see how far you get.
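To make the point concrete, here is a minimal sketch (with two made-up, stand-in sequences of equal length, not the actual challenge pairs) of why a naive Shannon-entropy formula does not settle the question:

```python
import math
import random
from collections import Counter

def shannon_entropy_per_base(seq: str) -> float:
    # Per-base Shannon entropy estimated from symbol frequencies.
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A uniformly random sequence and an illustrative coding-like sequence, both 120 bases.
random_seq = "".join(random.choice("ACGT") for _ in range(120))
coding_like = ("ATGGCTAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGAC"
               "GGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTAC")

print(round(shannon_entropy_per_base(random_seq), 2))   # close to 2 bits/base
print(round(shannon_entropy_per_base(coding_like), 2))  # also close to 2 bits/base: the formula cannot tell them apart
```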

This is the core of the problem. If you cannot understand the data entirely, you have absolutely no way to confidently answer the important questions about it. Applying a formula to it or qualitatively reasoning about it gets you nowhere of consequence. You might as well just be staring at static on a TV screen. The fact that it looks like static to you tells you absolutely nothing about what it really is.

If this is true, and it is, what exactly is the information theory argument for Intelligent Design?

Just a heads up regarding the wave equation (total energy of a specific system/molecule) - the Gibbs equation relates the energy at a specified state/temperature (Gibbs free energy) with the enthalpy and entropy:

ΔG = ΔH - TΔS

The QM computation normally provides ΔH (enthalpy) at standard states, and we would need to compute the change in entropy for a real system at the given temperature. For reactions and structures to be energetically favoured, ΔH and ΔS would both contribute to the added stability.
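As a quick back-of-the-envelope illustration of how the terms combine (rough textbook-style values for 2H2O => 2H2 + O2 at 298 K; the exact numbers are only indicative):

```python
# dG = dH - T*dS for splitting liquid water into hydrogen and oxygen gas
delta_H = 571.6e3   # J per mole of reaction: strongly endothermic (H-O bonds broken)
delta_S = 326.0     # J/(mol*K): entropy rises (liquid -> two gases)
T = 298.15          # K

delta_G = delta_H - T * delta_S
print(f"dG = {delta_G / 1000:.0f} kJ/mol")  # positive (~474 kJ/mol), so not spontaneous at 298 K
```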

I think the discussion(s) on information may not be all that relevant to these calculations (but please correct me if I have missed something).

Originally I thought he was talking about QM too, and maybe he was. My original text also emphasized that QM computed enthalpies, not total energy. But the problem is deeper, because eigenvectors in the QM simulation do not correspond to states exactly (I got turned around there); rather, they correspond to molecular orbitals.

To get the states to which entropy refers, we really have to be dealing with dynamics, and considering the full partition function, where states refer to the coordinate positions of all the nuclei (and some details like total charge). The energy here usually corresponds to the sum of the eigenvalues of the occupied orbitals. And we compute the entropy over that partition function.

I think the challenge of the original question is that the way it is phrased (with reference to an eigenvalue of a Hamiltonian) gets one thinking about QM simulations. And because states are connected to different eigenvectors (by way of the same eigenvalue), this wrongly maps the entropy states onto QM wave solutions. There is another nuance regarding the difference between closed and isolated systems. Really, the key question has nothing to do with QM, Hamiltonians, and eigenvalues. Instead, the question is: if total energy is constant, how can entropy increase?

The answer is very easy. Total energy includes kinetic and potential energy, not entropy. So even if total energy is fixed, entropy can increase.
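A standard textbook case makes this concrete (free expansion of an ideal gas into a vacuum; my own example for illustration):

```latex
% Free expansion of an ideal gas in an isolated container:
% no heat flows (q = 0) and no work is done (w = 0), so the total energy is unchanged,
\Delta U = q + w = 0 ,
% yet entropy increases, because more volume (more microstates) becomes accessible:
\Delta S = n R \ln\frac{V_2}{V_1} > 0 \qquad (V_2 > V_1).
```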

I am not sure what you mean. Separating information from noise has been a very important human activity for thousands of years. I bet you will be very angry when your phone starts being noisy and your service provider tells you that "really there's no difference".

Absolutely not. I explained very clearly that information is between A and B. Hence encryption, which tries to control the flow of information between only two specific parties.

Looks like we have a standoff.

What do you think? And why do they do that?

It is strange that you quote and then proceed to completely ignore this statement: “You can just assign semantic value at will, but that is not information as it lacks common understanding to both A and B.”

Information takes at least two parties A and B. Sometimes the same message can carry a piece of information for B and a different piece for C, D, etc. depending on what pertains to each one of them. Encryption seeks to limit the information only to B and not to C etc.

Information is not intrinsic to the data (the DNA) - the same exact data sequence carries one piece of information in this context and a different one in another context.

Don’t know but I strongly disagree with Dembski saying: "The fundamental intuition underlying information is not, as is sometimes thought, the transmission of signals across a communication channel, but rather, the actualization of one possibility to the exclusion of others. " - Intelligent Design as a Theory of Information | Discovery Institute
What he calls “actualization” seems to be observation which means he mixes information (transmission) with knowledge (observation). You might be doing the same.

Btw, I have yet to figure out the exact disagreements between BioLogos and ID.

Yes, all these media (DNA, TV, paper, pen, hard drive, ethernet, etc.) convey information. They do not understand but we, their users, do.

On a tangent note, DNA is overrated. 1.5 GB of data (fits easily on a thumb drive) can hardly specify a dishwasher let alone the human organism and its development process. Looks like we don’t know yet where the information for organism development is held.

I was joking. Of course they will try to decrypt it.

Yes Joshua, QM deals with the energies of molecular orbitals and it can also give us the total energy of a molecule at given conditions (and other information depending on the rigour of the methodology).

I think information theory uses entropy as disorder, and noise is easily associated with this notion. I have only read a few papers in this interesting area, and I think it is a field of its own requiring specialist understanding. I will indulge in a vague analogy if it helps: a Fourier transform can be seen as turning a "jumbled" (or noisy) information-containing bundle into a comprehensible spectrum, and as the information content increases (or becomes clear) the noise is reduced. I will leave it to you whether this helps with the entropy/noise discussion.
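If a toy example helps, here is a minimal sketch of that analogy (the 50 Hz tone and the noise level are arbitrary choices): a tone that is invisible in the time-domain jumble shows up as a clear peak in the spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 1000                                   # sampling rate, Hz
t = np.arange(0, 1, 1 / fs)                 # one second of samples
signal = np.sin(2 * np.pi * 50 * t)         # a 50 Hz tone
noisy = signal + 2.0 * rng.standard_normal(t.size)   # buried in noise

spectrum = np.abs(np.fft.rfft(noisy))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
print("strongest frequency:", peak, "Hz")   # ~50 Hz stands out from the noise floor
```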

I think the claim being made by Swamidass and Chris [or at least a necessary consequence that follows] is that there is no mathematical algorithm or formula you could apply (no possible device you could build or experiment you could conduct) that would determine in some empirical manner whether or not some string of random-looking characters was in fact really information to somebody or something somewhere. Yes, in obvious cases such as all of these characters I'm typing right now, you could program a device to look for English words, and upon finding them it could dutifully report that it found information. But that is because some English-speaking programmer decided to look for English words. If it were looking for any kind of generic information from any known (or perhaps unknown — cue SETI) source, the device would be helpless, and in fact its task would be an impossible one, if I understand Joshua and Chris correctly.

Would it be fair to give this generalization: If information (of any kind – known or unknown) were truly lacking from a string of “data” (i.e. it was truly a string of ontological chance as if such a thing were even possible), then understanding of it (by anybody anywhere) is impossible. But the inverse situation is not true: lack of understanding by one or any number of parties does not prove lack of information as there may be (or have been in the past) somebody or something to whom it was information. To summarize more succinctly: a true lack of information necessarily implies absence of understanding, but absence of understanding can never prove lack of information.

If the last native speaker of some (about to be lost) language writes his last diary entry in his native tongue, and then dies – the mere fact that there is no entity left in the universe that can understand his diary entry would not make it cease to be information. It is simply information that we’ve lost the capacity to decipher, but it would remain information just the same. So I don’t think your statement is correct that information must always have an understanding receiver. Just as a tree in the forest does not require somebody to be there to hear it before it can make any noise.

Joshua or Chris, I trust you will correct me if I’ve misrepresented or misunderstood.

[edits made to hopefully add clarity]


Thanks for the thoughts @Mervin_Bitikofer.

We are using information in two ways. One is the definition of information as measurable entropy. The other is the definition of information as semantics or meaning (which is the sense @NonlinOrg seems to be using). We have to keep these things separate, because they are different, or contradictions arise.


Regarding semantic information, this is true:


This is not true: “Any Kind” is too broad:

A better way to put this is that if the entropy information of a string truly is maximal (i.e. the data is totally unordered, as in quantum noise or a one-time pad), then understanding of it by anyone is impossible. This has actually been proven (One-time pad - Wikipedia).

However, at times it might produce output that has the mirage of meaning, resembling things that are meaningful to us. We can also assign meaning to it ourselves, just because we want to. For example, I can take a chunk of it and use it in my email signature as a geeky decoration. Or I could use part of it as a private key for secure communication (then it semantically becomes “my key”).
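Since the one-time pad keeps coming up, here is a minimal sketch of why its output is provably uninformative without the key (the message and the decoy plaintext below are invented for illustration):

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

message = b"FOUR SCORE AND SEVEN YEARS AGO"
key = secrets.token_bytes(len(message))        # truly random, used once, kept secret
ciphertext = xor_bytes(message, key)           # statistically indistinguishable from random bytes

# Without the key, ANY plaintext of the same length is an equally valid "decryption":
decoy = b"ATTACK THE BRIDGE AT DAWN XXXX"
decoy_key = xor_bytes(ciphertext, decoy)
assert xor_bytes(ciphertext, decoy_key) == decoy

print(xor_bytes(ciphertext, key))              # only the real key recovers the real message
```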


Not exactly…

SETI does not just model known processes and then look for “unexplained information” left over in the data. That would be a grand failure, because we see noise everywhere. One could argue from this, for example, that discrepancies in galaxy rotation speeds (which are seen as evidence of dark matter) are actually evidence of intelligence. This is “unexplained information” after all. That would be absurd, of course, because there is no alien intelligence we can imagine (short of God) capable of producing that signal in the data.

Rather, SETI models specific theories of signal generation and intelligence, with the assumption that we are looking for finite beings (not God). These models can be tested directly against data to see if they explain the information better than models without it. And then there is intense scrutiny to find natural phenomena that might explain the signal.

In this approach, it is possible that SETI could identify intelligence if they can conceive of a model that produces the right signal in the data, and this model does not infer God-like powers.

Now, being able to interpret and understand an intercepted message between aliens? Well, my baseline belief is that this would be impossible. About as difficult as decrypting a one-time pad (One-time pad - Wikipedia). It would look just like noise to us until we understood much, much more about our alien friends. And unless they wanted us to understand, I am sure we would not.


So this is a bit too vague…

I would put it as…

Inability to identify semantic information only implies lack of understanding; this cannot prove that the semantic information does not actually exist.


@NonlinOrg

Perhaps this is the perfect evidence to suggest your sense of the word Information is irrelevant.

From the 1.5 GB of data in the human genome, a human can be created. And yet, ironically, nobody knows how to read the genome fluently.

Looks like Biological Information works (or doesn’t work) regardless of anyone being around to understand it.

You should stop playing word games, and just get on with trying to make a point.

I am puzzled by (what may not be intended in these discussions) the implication that noise=entropy=information, for the following reason:

If we measure noise from a particular source, and accumulate it over time, the signals tend to cancel each other (or we can eliminate them by using the appropriate instrumentation), whereas if a signal is present within that noise, we may identify it (as meaningful) by eliminating the noise.

If I am correct, doesn’t this contradict the notion information=entropy=noise?

Otoh can we argue the “noise” contains so much information (ie it is the product of many sources) that we are unable to process it?

Thanks for your corrections and clarifications, Joshua. I had to look up what a “one-time pad” was, so my education continues with these posts.

Like looking at clouds (noise of a sort?) and finding resemblances in them.

I don’t fully understand your clarification here. It isn’t like we would expect to find aliens speaking languages that we know … that is except perhaps mathematics, or also if the aliens had spied and had prior knowledge of us. I’m remembering the movie “Contact” in which the first clue that a signal was not noise was that it came in prime numbered pulses – which was taken to be one universal way for intelligent species to say “hello” to each other. So it was a lowered-entropy situation that had to function as the attention-getter to set it apart from noise. Then of course, deeper communications were found embedded in the same signal. And those more detailed communications by themselves would certainly look like noise. In that situation, though, the communicating aliens had already spied on us and so had already designed their communications for our reception.

This is mostly me just thinking aloud, to see if these popular contexts can be useful vehicles to help make some of these concepts more accessible (to me, anyway).

I agree with you, Nonlin, that my statement is phrased too extremely. Using your helpful feedback, here’s how I would reformulate my statement:

It is impossible to provably distinguish between information and noise in any particular signal.

One example of this would be a code in which every 11th word is significant, and the rest is made up as filler. If the insignificant fill is done with any skill, a third party that intercepts the communication would not understand the information that is embedded in what is predominantly noise.
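A minimal sketch of that scheme, for concreteness (the cover text and the stride are invented):

```python
def reveal(text: str, stride: int = 11) -> str:
    # Whoever knows the stride keeps every stride-th word; everyone else sees filler.
    words = text.split()
    return " ".join(words[stride - 1::stride])

cover = ("now is the time for all good men to come MEET "
         "a quick brown fox jumps over the lazy dog and AT "
         "she sells sea shells down by the sea shore so NOON")
print(reveal(cover))  # -> "MEET AT NOON"
```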

Another example is the one-time pad, as @Swamidass mentioned.

Here is another example which you inadvertently overlooked:

Best regards,
Chris

In considering the elusive “information with meaning”, which certainly exists because this post itself is an example of it, it seems as if it’s impossible to distinguish signal from noise without considering the sender and the receiver: in some way a message must be seen to convey a communication between a sender and a receiver.

We watch a telegraph operator (or a spy, or an alien) receiving messages which appear as gobbledygook to us. But if certain messages appear to provoke some significant (maybe repeated) response (the operator raises a flag, or sends a reply), we suspect that message to contain meaning, even if we can’t translate it.

Similarly a recording of the last speaker of a language may be an actor speaking gibberish, but if that last speaker is still alive, and mimes the meaning of his words, we know it’s speech and can even learn the language. We can, and do, work out the significance of animal signals in this way, though they are not trying to teach us, and though there is a qualitative difference between language and mere signals.

Merv’s prime number example is based on a shared assumption by the receivers and the aliens (or at least the scriptwriter!) that maths is a universal language: the message itself is statistically indistinguishable from noise, but its effect on us, producing a conviction of a similar effect on the sender, tells us there are folks out there.

A “black box” machine, reading code, is surely accessible on the same criteria, being an active, if not conscious, receiver of information. If it sits there doing nothing until a certain combination of characters appears, when it whirrs and spits out a widget, then the response is the evidence for something other than noise. The response might be complicated: for example, like Chris’s example, only every 11th character or message might be heeded, but further understanding would be gained not from studying the code alone, but the machine’s responses to the code.

As far as I’m aware, information theory does not have much to say about receiver responses, and it’s in those that “meaning” resides (whether a simple instruction or all the nuances of a human poem). If there’s an application to DNA, surely it’s in matching coding to the organism’s responses.

Beyond that there must come a “universal language” inference like that of Merv’s prime numbers: in this case the inference that the messaging between generations of organisms involves teleology at the biological level, or at a higher level of creation above that.

Without that, we are indeed talking about faces in the clouds, pareidolia: the mere illusion of meaning. But then we have to explain why nature’s “receiver” responds to an illusion as if to real information. I’m not sure even Shannon envisaged information being transmitted in a medium apart from teleology.

George, my main use of noise reduction is in recording music, when the noise is usually (strictly speaking) an unwanted part of the signal I’ve put in myself, such as mains hum or amplifier hiss. So I will be distinguishing noise from signal by selecting for sampling a section where my repetitive but creative bass is not included to be considered noise. Too much noise reduction destroys signal too.

I assume that all noise reduction algorithms will have some similarly intelligent criteria for deciding in advance what will qualify as signal. And that underlines to me that “signal” cannot be considered apart from some assumptions about both its sender and the receiver.

Jon, the discussion seems to focus on entropy, and this seems to be equated with noise. Entropy in my field is a property of a system, and the energetics of reactions are understood in terms of making/breaking chemical bonds and the entropy of the resulting products. E.g., let us say we break up H2O into H2 and O2 (2H2O => 2H2 + O2). We need energy to break the H-O bonds, but the entropy increases because we have gases instead of a liquid. In this case, we understand what entropy means.

I have frequently dealt with instruments that have a given signal-to-noise characteristic, and we often reduce noise by accumulating the signal of interest over a number of scans. In this way, the signal strength grows and the noise is reduced as a consequence. In this case, I cannot see how entropy would be relevant.
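For what it is worth, a toy sketch of that scan-accumulation effect (the peak shape, noise level, and scan count are arbitrary): the repeatable peak adds up coherently while random noise averages out, so the signal-to-noise ratio improves roughly as the square root of the number of scans.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 500)
true_signal = 3.0 * np.exp(-((t - 0.5) ** 2) / 0.002)   # one spectral peak

def snr(trace):
    # Peak height relative to the baseline noise (first 100 points are baseline only).
    return trace.max() / trace[:100].std()

one_scan = true_signal + rng.standard_normal(t.size)
hundred_scans = np.mean(
    [true_signal + rng.standard_normal(t.size) for _ in range(100)], axis=0)

print("SNR after   1 scan:", round(snr(one_scan), 1))
print("SNR after 100 scans:", round(snr(hundred_scans), 1))  # roughly 10x better
```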

If we consider a receiver that simply responds to every audio signal (and perhaps other parts of the spectrum), we would have a jumble, but that jumble would contain information on every source that emits the various frequencies - the jumble (if I understand Joshua correctly) would be very highly disordered (the meaning of disorder may also be entropy for this example), and have high information content. It may be extremely difficult for us to extract all of the information, although experts can identify some portions of such a jumble (e.g. a car horn)…

I agree that we must accept a source(s) for these matters, and a receiver. In terms of digital signals that are sent across fibres or wires, the technical details would be more complicated, but I think we still must think of sender, medium, receiver as the overall context.

When information is discussed for DNA, I tend to retreat, as the context is so vastly different.

Perhaps @Swamidass Joshua may wish to elucidate further.