Information Theory X Psychology

A tool for understanding and measuring a complex world driven by information and blurred by artificial intelligence.

Information theory, a field of study that emerged from the need to improve communication systems, has found applications far beyond its original scope. With its focus on probabilities, statistics, and the quantification of information, it shows potential for shedding new light on age-old questions in psychology and philosophy. In this essay, I will cover some of my favorite re-interpretations of psychology and philosophy in the language of information theory.

Intro to Information Theory

Information theory quantifies the number of symbols needed to represent any outcome of a probability distribution. For instance, tossing a fair coin has 2 possible outcomes: a 50% chance of heads and a 50% chance of tails. To communicate the outcome of a fair coin toss, you need 2 symbols, one for "heads" and one for "tails". This can be achieved with a "bit", which has two symbols, "1" and "0", that we map to "heads" and "tails". If we wanted to record and save the outcome "heads, tails, heads", we could write down "101" using whatever tools are available. To communicate an event from a different distribution, such as a fair die roll (6 outcomes), you would need 6 different symbols. Since 2 bits only provide 4 symbols, we need 3 bits: "001" could be 1, "010" could be 2, "011" could be 3, "100" could be 4, "101" could be 5, and "110" could be 6. This is a very simple example, but it shows how we can use bits to communicate events from a probability distribution.
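The idea above can be sketched in a few lines of Python. This is a minimal illustration, not anything from the essay itself: `bits_needed` is a helper name I am introducing, and the die-to-symbol mapping just reproduces the fixed-width code described in the text.

```python
import math

def bits_needed(num_outcomes: int) -> int:
    """Minimum whole number of bits that gives each outcome its own symbol."""
    return math.ceil(math.log2(num_outcomes))

# A fair coin has 2 outcomes -> 1 bit suffices.
print(bits_needed(2))  # 1

# A fair six-sided die has 6 outcomes -> 3 bits (2 bits only cover 4 symbols).
print(bits_needed(6))  # 3

# The fixed-width code from the text: die face -> 3-bit symbol ("001" .. "110").
die_code = {face: format(face, "03b") for face in range(1, 7)}
print(die_code[3])  # '011'
```

Note that 3 bits give 8 symbols while only 6 are needed, which leaves "000" and "111" unused.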

There are also ways to encode events from a probability distribution that are more efficient than the naive approach above. For instance, the symbols "111" and "000" are unused in the above example. We could use these symbols as a shortcut when communicating a large number of die rolls. "111" could mean "the next two are the same as the last". So if we are lucky enough to roll 3, 3, 3, we can encode that as "011", "111", which is shorter than "011", "011", "011". This is a very simple example of compression, of which there are two types: lossless and lossy. For the purposes of this essay, I will not be explaining the difference between the two.
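The repeat shortcut can be written out as a small encoder. This is a sketch of exactly the scheme described above, assuming the same 3-bit die code; the function name `encode` is my own.

```python
def encode(rolls):
    """Encode die rolls as 3-bit symbols, using the unused symbol '111'
    as a shortcut meaning 'the next two rolls repeat the last one'."""
    out = []
    i = 0
    while i < len(rolls):
        symbol = format(rolls[i], "03b")
        # If the next two rolls repeat this one, emit the shortcut instead.
        if i + 2 < len(rolls) + 0 and rolls[i] == rolls[i + 1] == rolls[i + 2]:
            out += [symbol, "111"]
            i += 3
        else:
            out.append(symbol)
            i += 1
    return out

print(encode([3, 3, 3]))  # ['011', '111'] -- two symbols instead of three
print(encode([1, 2]))     # ['001', '010'] -- no run, no savings
```

Since the shortcut can always be expanded back into the original rolls, this is a (toy) lossless code: runs get shorter, and everything else stays the same length.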

Psychology with Information Theory

While information theory is used heavily by computer scientists to compress data and transmit information, I think it is also fascinating to consider how it relates to psychology. Our brains are information-processing machines; they just use different techniques than computers. Instead of tiny transistors, they use neurons. Instead of communicating with bits over a wire, we generally communicate with words. Fundamentally, our brains do a ton of pattern recognition. They are constantly seeking familiar patterns to orient themselves, and constantly looking for new patterns to learn from. Consider, for example, the task of finding your car in a parking lot after grocery shopping. You would first orient yourself based on the door you entered and exited; you might even use some large landmarks to judge orientation and distance. If you came out of a completely random exit, you would be extremely confused. Once oriented, you start the search for your exact car, with color and shape being the most salient features. If someone were to paint your car while you were gone, you would find it based on its model and the personal items inside. Here, you are using a number of patterns to determine which car is yours, most of which you do not expect to change while you are in the store.

Patterns determine what is real

If someone were to take your old car and replace it with one identical in all the ways you remember it, you would not notice. Maybe the undercarriage is new and shiny, but since you generally do not look at the undercarriage, you would never know. This example shows that answering "what is real?" is not as simple as it seems. It isn't "really" your car, but since it is indistinguishable from your car, it is effectively your car.

When you apply this same concept to friends and family, you quickly enter a Black Mirror episode. Some of the most painful parts of loss come not when a person actually passes on, but when you can no longer recognize them: when the brightness of their personality fades, and the patterns that made them your companion are no longer there. The phrase "you have changed" reflects that a person's pattern of behavior is central to how we relate to each other.

While information theory cannot undo loss, I believe that information-first thinking can help us better communicate what we are feeling and experiencing.

Clear channels of information extend our senses

"Media" is our term for "mass communication", or as viewed through information theory "processed, packaged and reproducible information" and is used to influence others, which changes what they experience. Text, images, sound, and video are used pervasively to augment our experience. We can get so invested in a TV show or Movie that we forget the world around us, and our emotions are controlled by a machine in our living rooms. I mention this phenomenon to explicitly recognizing that we can deeply experience something that is highly processed. However the benefit of this is that a lightly processed and high bandwidth channel of information can effectively extend our senses.

When viewing a live broadcast, such as a surfing competition or soccer match, we can experience events happening far beyond the reach of our senses. The better the channel of information, the deeper the experience: a very grainy webcam will not be very immersive, but a 4K stream will be. I consider this the "clear channel" theorem of experience.

Associations separate "seeing" from "experiencing and understanding"

However, just because a channel has high information content does not mean it can be understood or experienced. For me, this is most evident with languages. Reading a cookbook in your native language is efficient and instructive, but one in a foreign language without translation may as well be a paperweight. The problem lies with decoding the information, not with a lack of useful information in the foreign cookbook: your brain simply hasn't developed the pattern recognition and associations to understand it.

The term "associations" used above accounts for the difference between the experience of "seeing" and the experience of "understanding". By focusing on the associations made in a information processing system, you can better understand what it would feel like to be that system.

Personally, I find associating real-world phenomena with mathematical explanations to be deeply enlightening and exciting, and I hope I have helped others like me. You may soon find yourself asking questions like these on your own.

Cheers,
David Bernadett

Published: 05/07/2023