I think I actually covered this in my previous post, but I will risk repeating myself.
In 1948 Claude Shannon was addressing a technical problem in communications, and he used the word "information" in that context and within the conventions of his time. He was not trying to measure information itself, but only the size of the message that needed to be sent via the communication system. He said specifically that "These semantic aspects of communication are irrelevant to the engineering problem."
In his paper Shannon referred specifically to “7. THE ENTROPY OF AN INFORMATION SOURCE”, but unfortunately over time this quantity has come to be called “Shannon Information”, a misnomer.
Consider the analogy of sending a parcel by a courier company that charges by volume, the size of the parcel. They measure the length, width, and depth and calculate the volume. The calculated volume provides information about the size of the parcel you are sending, but the volume is neither the parcel nor the contents of the parcel. Nor is volume in general "information about the parcel"; only the specific calculated value is.
Similarly, Shannon entropy is a measure of the size of the message. It provides information about the message, but it is not the message, nor is it the information within the message. Hence it is invalid to equate Shannon entropy with information, as Swamidass does in his post.
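A minimal Python sketch (the function name is mine) makes this concrete: the entropy formula sees only symbol frequencies, never meaning. Two messages that are anagrams of each other say different things but get exactly the same entropy value.

```python
from collections import Counter
from math import log2

def shannon_entropy(message: str) -> float:
    """Entropy in bits per symbol, computed from symbol frequencies alone."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Same letters, different meanings, identical entropy:
a = "dog bites man"
b = "man bites dog"
print(shannon_entropy(a) == shannon_entropy(b))  # True
```

The measure can tell you how many bits you need to transmit the message efficiently, which is exactly the engineering problem Shannon was solving; it cannot tell you what the message means.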
Now if you want to talk about what information actually is that is a whole other subject and beyond my pay grade. I did find Werner Gitt’s book “In the Beginning Was Information” helpful in in this regard.
Unfortunately @Swamidass has advised me that he is not able to respond in this forum.
[edit]
The message itself is not information either. The information is coded in the message. I might start with the thought about what we will have for dinner tonight. That thought is coded into English words which are represented by strings of characters which make the message, “Bangers and mash for dinner tonight”. My wife gets the message and decodes the letters to words to information. She might then reply “What, again?”
The encoding and decoding steps can go wrong if the sender and receiver use different coding systems, as when I downloaded some product information and found it was in German. The information had been encoded into the message and transmitted successfully, but I didn't have the correct system to decode it back into information. I had the same problem in Germany last year when trying to order a gluten-free meal in a restaurant. I could encode the request into English and transmit it successfully; the hearer received (heard) the message but could not decode it. In the reverse direction I could receive but not decode their messages. (We ended up with a meal of Spargel, that is, asparagus.)
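The same mismatch happens, quite literally, with character encodings. A quick sketch (the German word is just a stand-in example): the bytes of the message arrive perfectly intact, but decoding them with the wrong system garbles the information, while the right system recovers it.

```python
text = "Gemüse"                      # German for "vegetables"
encoded = text.encode("utf-8")       # sender encodes with one coding system
garbled = encoded.decode("latin-1")  # receiver decodes with a different one
print(garbled)                       # GemÃ¼se - message received, information lost

recovered = encoded.decode("utf-8")  # with the matching system the
print(recovered)                     # information survives: Gemüse
```

Transmission succeeded in both cases; only the shared coding convention between sender and receiver determines whether the information is recovered.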
A message can be encrypted, and it will then appear as gibberish or random data because we do not have the correct decoding system. This is not the case with random noise, which has no information encoded by any system. This is another reason why Swamidass is wrong when he says "Another surprising result is that the highest information content entity is noise, exactly the opposite of our intuition of what “information” actually is."
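It is easy to see why the entropy number comes out highest for noise: the measure rewards unpredictability of symbols, and knows nothing about whether any system encoded meaning into them. A small sketch (function name mine) comparing repetitive English text with random bytes:

```python
import os
from collections import Counter
from math import log2

def entropy_per_byte(data: bytes) -> float:
    """Bits per byte, from byte frequencies alone - blind to meaning."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * log2(c / n) for c in counts.values())

english = ("bangers and mash for dinner tonight " * 300).encode("ascii")
noise = os.urandom(len(english))  # random bytes with no encoded information

print(entropy_per_byte(english))  # low: few distinct, predictable symbols
print(entropy_per_byte(noise))    # near the 8 bits/byte maximum
```

The noise scores highest on the measure precisely because the measure is not measuring information in the everyday sense; it is measuring how many bits a channel would need to carry the symbols.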