What Is Information?

Just for the record, the proper pronunciation of π is “pee,” and I have commemorated the day accordingly.

3 Likes

I think you’re right about what Swamidass et al. are saying, and they’re wrong.
The first point is that there is no such thing as Shannon Information. If you go back to Shannon’s 1948 paper, he did not define the information in the message but the Entropy of the message. Shannon was dealing with the specific technical problem of how to faithfully transmit a message from sender to receiver. The Shannon Entropy provides useful information ABOUT the message but it has nothing to do with information IN the message, something Shannon acknowledged in his paper.
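For reference, the quantity Shannon defines in the 1948 paper (my notation, not a quotation from it) is the entropy of the source,

$$H = -\sum_{i} p_i \log_2 p_i \ \text{ bits per symbol},$$

where $p_i$ is the probability that the $i$-th symbol is selected. Nothing in the formula refers to what the symbols mean.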

If you look at my example above of the Gettysburg Address, you will see how overwriting it with random noise increases the Shannon Entropy but destroys the information in the message.
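A minimal sketch of that comparison (my own code, using an assumed snippet of the Address), estimating character-level Shannon entropy before and after overwriting with random bytes:

```python
# Estimate character-level Shannon entropy of readable text versus random bytes.
import math, os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per symbol, estimated from the symbol frequencies in data."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

text = b"Four score and seven years ago our fathers brought forth on this continent a new nation"
noise = os.urandom(len(text))   # "overwrite" the message with random bytes

print(shannon_entropy(text))    # roughly 4 bits per character for English text
print(shannon_entropy(noise))   # higher, approaching 8 bits per byte for longer samples
```

The number goes up, but the overwritten version no longer decodes to anything a reader would recognise as the Gettysburg Address.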

In fact this is what you do to safely delete a computer file: overwrite it with random data (several times) so that the information in the file is destroyed. The file can still be read, but the information in it is gone.
(I just thought of this example; it’s one IT professionals could probably relate to.)

This is not quite right. Originally files were deleted by removing the information that pointed to where they were stored on the disk. It was possible to still read the information in the deleted file if you could find where it was on the disk. To prevent this, the area of the disk used for a file is overwritten with any data; all 0’s or all 1’s actually works. I have no idea how this is supposed to apply to this discussion. Replacing information with other information is what is actually happening in your example.

From a computer science perspective Shannon’s information entropy tells you the minimum bandwidth needed to transmit a signal with a given maximum frequency. It is also related to how much a file can be compressed and still recovered completely. Both of these are examples of information from my perspective at least.
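As a rough illustration of the compression point (the file contents and sizes here are my own example, not from the thread): data that is already near maximum entropy barely compresses, while redundant English text compresses a great deal.

```python
# Redundant text shrinks dramatically under lossless compression;
# random bytes do not shrink at all.
import os, zlib

english = b"Four score and seven years ago our fathers brought forth " * 100
noise = os.urandom(len(english))

print("english:", len(english), "->", len(zlib.compress(english, 9)), "bytes")
print("noise:  ", len(noise), "->", len(zlib.compress(noise, 9)), "bytes")
```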

1 Like

What Does File Shredding Do?

To shred a file, you run it through a program that overwrites it several times with other data. It doesn’t actually get “shredded” in the sense that paper documents do.

To use that analogy, it’s more like taking a paper document, erasing all the words, and writing a bunch of nonsensical words over the top of them. And just as erasing a word from a page will leave a trace behind, it’s technically possible that overwriting digital data will too. So you do it again and again until you can no longer see the original data underneath.
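A rough sketch of that process (the function name, pass count, and choice of random data are mine; real shredding tools are more careful):

```python
import os

def shred(path: str, passes: int = 3) -> None:
    """Overwrite a file's contents in place several times, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # replace the old bytes with random data
            f.flush()
            os.fsync(f.fileno())        # force this pass onto the disk before the next
    os.remove(path)                     # finally drop the directory entry
```

Note that on SSDs and copy-on-write filesystems the original blocks may survive an in-place overwrite anyway, which is one reason dedicated tools exist.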

So tell me, Bill, what IS the information in all 0’s or 1’s, or in completely random data? All 0’s or 1’s will give low entropy but will have no meaning, no semantic information. Random data will have high entropy but will have no more semantic information than all 0’s; it will be meaningless. On the other hand, all the information in the original file is gone. Can you show that the overwriting data has more information, instead of just showing it has more entropy?

You can only define meaning when you know what the information represents. A file containing all 1’s could represent a graphics file that only contains white, while the file with all 0’s would represent black. Those files convey information even given the very low entropy. A file with completely random data can also represent information if it contains a high resolution photographic image. That is why I said you are only changing the information in your example.

I didn’t say it has more information. You are just changing the information content of the sectors on the disk. Which brings us back to the point James was making.

Edit to add:

I should kick myself for forgetting my cardinal rule: always check the sources. And in looking at Shannon’s paper I found this just a couple of lines down from the quote you used.

Bolding by me of course.

If it contains a high resolution photographic image, then the data is not random, even if it has high entropy.

If indeed the information the sender intended to transmit was a totally black or white image, then that would be information.
However, if the file was overwritten to destroy the information, then it is only you projecting a meaning onto the contents. This is akin to finding a picture in the clouds or a word in a bowl of alphabet soup. In a random text file there will probably be an occasional string that can be recognised as a word, but again that is you imposing a meaning on that particular string.
However, if a file “Lincoln-Gettysburg.txt” containing a copy of Lincoln’s Gettysburg Address is edited to contain a selection from Mein Kampf, then it will contain information, albeit false information if the intention is to attribute those words to Lincoln.

First, this is not referring to the calculated value of entropy, which is only defined later in the paper. Second, this is information about the message rather than the information in the message.

I will say again that Shannon was addressing “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.”

This actually misses both ends of the communication of information. When I have a thought about what we will have for dinner, I express that thought in words and then represent those words as strings of characters: “Bangers and mash for dinner”. Only then does Shannon take over to ensure the accurate transmission of that message. We do this so naturally we don’t even notice what we are doing. But you can see that it is a coding of information, because a German speaker would produce something completely different to convey the same information, as would a Chinese speaker.

Also Shannon is concerned with transmitting this message through an electrical communications system. I could simply speak. Abraham Lincoln’s address was delivered verbally to multiple receivers simultaneously.

Then the receiver must perform this process in reverse to get the information contained in the message. The sender and receiver must use the same coding conventions to encode and decode the message. I cannot decode a message sent to me in Chinese. Similarly, you can’t sensibly open a text file in a graphics program; you might get something, but it won’t be the information the sender put in the message. This is why it is nonsensical to say that a file that originally contained the Gettysburg Address could, after being overwritten with 1’s, represent a white graphics image.
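A small sketch (the bytes and the two interpretations are chosen by me) of how the same bytes read completely differently under two decoding conventions:

```python
data = b"Four score and seven years ago"

# Decoded with the sender's convention (ASCII text), the message comes back.
print(data.decode("ascii"))      # Four score and seven years ago

# Treated as 8-bit grayscale pixel values, the same bytes are just a strip of
# grey with no connection to what the sender encoded.
print(list(data)[:8])            # [70, 111, 117, 114, 32, 115, 99, 111]
```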

The quote is from the Introduction where Shannon is discussing the purpose of the paper. The number he is talking about will be defined as entropy later in the paper.

All through the paper Shannon is talking about information. In fact he defines information as the selection of one message from the set of possible messages. He isn’t concerned with meaning but with information. Which is why people refer to information as they do even if it isn’t the way you want to define it.

1 Like

Does that mean you can have information without meaning? Isn’t meaningless information an oxymoron?

No Chris, it simply means that information has a rigorous mathematical definition. Meaning, on the other hand, does not. This is the point that you still have not addressed.

Yes. That was the point of Shannon’s paper, which you refuse to see. He used “information” 61 times in the paper and specifically said he wasn’t talking about meaning.

No. Not when information is used in the technical sense. I can specify the information capacity of a communication channel which is carrying nothing but noise.
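For a concrete version of that claim (the bandwidth and signal-to-noise figures here are made-up examples), the Shannon–Hartley theorem gives a channel’s capacity from its bandwidth and noise level alone, with no reference to whether anything meaningful is being carried:

```python
import math

def capacity_bits_per_second(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley channel capacity: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# e.g. a 3 kHz voice channel with a signal-to-noise ratio of 1000 (30 dB)
print(capacity_bits_per_second(3_000, 1_000))   # ~29,900 bits per second
```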

I suppose the roll of a die would also be an example of information without meaning. But it takes on the meaning we may assign to it, such as in determining how many places my piece can move in a game. But prior to that external assignment of meaning to the result, it did not carry any inherent meaning, any more than the exact coordinates of where some specific rain drop lands. “Meaning” seems to me to be a significant step beyond mere information mechanics (and beyond science generally).
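In Shannon’s terms (the calculation is mine, not from the thread), a single roll of a fair die carries a definite quantity of information whether or not anyone assigns it a meaning:

$$H = -\sum_{i=1}^{6} \tfrac{1}{6}\log_2\tfrac{1}{6} = \log_2 6 \approx 2.585 \text{ bits per roll.}$$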

1 Like

@jammycakes, @Bill_II, @Mervin_Bitikofer
Noise, Data, and Information

In signal processing, noise is a general term for unwanted (and, in general, unknown) modifications that a signal may suffer during capture, storage, transmission, processing, or conversion.
Sometimes the word is also used to mean signals that are random (unpredictable) and carry no useful information; even if they are not interfering with other signals or may have been introduced intentionally, as in comfort noise.
Noise (signal processing) - Wikipedia

Data are simply facts or figures — bits of information, but not information itself. When data are processed, interpreted, organized, structured or presented so as to make them meaningful or useful, they are called information. Information provides context for data.
Data vs Information - Difference and Comparison | Diffen

Meaningless information is an oxymoron! When I searched for “meaningless information” the only place I found it defined was as a crossword clue, where one of the suggested answers was NOISE.

Chris, once again this does not address my central point. Your understanding of the word “information” does not have a rigorous mathematical definition.

You are not using information in the sense of Shannon’s paper, which you don’t seem to realize. You are using information more in the sense of knowledge. And yes, meaningless knowledge would be an oxymoron.

Can you provide a rigorous mathematical definition of information? Remember that what Shannon calculated was the Entropy of a message and not a mathematical definition of the information in the message.

Can you explain to me information in the sense of Shannon’s paper?

From the paper

This one bit of information can be a 0 or a 1. That is the information. The meaning of that 0 or 1 is not what makes it information. Does that make it clear?
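For a single equiprobable binary choice (the arithmetic here is mine, not a quotation from the paper), the entropy comes out to exactly one bit:

$$H = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit.}$$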

Have you actually read the paper?

No I can’t Chris. Not if we’re going by your definition of “information.” And neither can you. That is the whole point.

I was wrong.
In information theory there can be meaningless information. I was using the classical and common definition of information.
To answer the forum question “What is Information?” we would first have to ask the context and meaning of the word. This is like the word “stress”, which has a particular technical meaning in engineering and a different meaning in everyday use.
Interestingly, I find there is no firm definition for “Shannon Information”.

If Shannon information is what is thematized by Shannon’s theory, the agreement about all these features of the theory might suggest that there is a clear interpretation of the concept of Shannon information shared by the whole information community. But this is not the case at all: the concept of Shannon information is still a focus of much debate. [What is Shannon information? Lombardi, Holik, Vanni. 2014]

You will notice from my post above that in IT, information can have a different meaning to the one used in Information Theory. This is the classical meaning of information…

Perhaps I can interject into this discussion the so-called information pyramid: data, information, knowledge, application. What is knowledge? And does an organism incorporate knowledge into its function?