Here’s an old joke about a thermos salesman explaining his product to a potential customer:
Salesman: It keeps hot things hot and cool things cool!
Customer: But … HOW do it KNOW ?!?
I often feel like this customer. Like when I’m browsing a news blog and see sidebar ads for diapers and footie jammies and talking pony dolls with long silky pink hair. “HOW DO IT KNOW?!?” I shout, thumping my fists on my forehead in amazement. Then Andy says, “Please stop. You’re scaring the kids ….” and I go for a walk to calm down.
Commercial Web sites know you’re browsing for talking pink pony dolls—for your daughter, you’d like to clarify, even though there’s absolutely nothing wrong with a grown man wanting to brush some soft silky pony hair once in a while—because these sites are collecting data from your internet connection, browser, and device. These data points are then analyzed by enchanted computer elves (which industry insiders call “algorithms” when our wife is listening and we want to impress her with fancy computer talk) to construct information, which indicates something useful about what some weirdo Daddy has been shopping for online.
My point is that there’s a difference between data and information (and also that brushing pony hair can be soothing, so maybe give it a try before you judge, okay?). Data are raw, unorganized facts, whereas information is the result of organizing and analyzing data for some useful purpose. For example, an order of diapers is a piece of data. Age and relationship status are pieces of data from the corresponding account profile. Linked together, these data points indicate that the purchaser likely has kids. A single order of three different sizes of diapers suggests at least three different kids. The elves take stuff like this and make information, which they use to target their uncanny sidebar ads.
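If you want to peek over an elf's shoulder, here is a toy Python sketch of the diaper example. Everything in it is made up for illustration: the `Order` record, its fields, and the `infer_min_kids` rule are assumptions, and real ad-targeting systems are surely far more elaborate. The result is a guess drawn from the data, not proof of anything.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """One line of a shopping order (hypothetical record, for illustration)."""
    item: str
    size: str


def infer_min_kids(orders):
    """Guess a lower bound on the number of kids from one order.

    Assumption: each distinct diaper size in a single order belongs to a
    different kid. This is a heuristic, not a fact about the shopper.
    """
    sizes = {order.size for order in orders if order.item == "diapers"}
    return len(sizes)


# Raw data: four unrelated-looking facts about one order.
orders = [
    Order("diapers", "newborn"),
    Order("diapers", "size 3"),
    Order("diapers", "size 5"),
    Order("talking pony doll", "n/a"),
]

# Information: the data organized to answer a useful question.
print(infer_min_kids(orders))  # 3
```

The list of orders is the data. The number that comes out the other end is the information.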
Here’s another (simple) example: The random collection of letters …
A I K Y P S Y I R L H O N
… is mere data, but if you rearrange those letters to spell …
SILKY PONY HAIR
… you have created information. However, the line between information and random data is often not so clear, and clarifying that boundary is the subject of Information Science and related disciplines like Computer Science. It’s possible for something that looks like information to actually be just a random data event, and the difference sometimes must be proven (like in a court of law).
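For the skeptics, here is a quick Python check that the jumble and the phrase really are the same data in a different arrangement. It is only a sketch: the `letters` helper is a name invented for this example, and it simply counts letters while ignoring spaces.

```python
from collections import Counter

data = "A I K Y P S Y I R L H O N"
information = "SILKY PONY HAIR"


def letters(text):
    """Count each letter in the text, ignoring spaces and case."""
    return Counter(ch for ch in text.upper() if ch.isalpha())


# Same letters, same counts: only the arrangement differs.
print(letters(data) == letters(information))  # True
```

The letter counts are identical, so whatever separates the two lines lives entirely in how the letters are ordered.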

One early pioneer of Information Science was Gottfried Leibniz, who had some amazing silky pony hair. Just … WOW.
Here’s an illustration. It’s flu season, so imagine that you have the flu and you’re staying home alone because no one wants to be around you and your disgusting crusty booger nose, not even your favorite pony family. You get hungry, but the only thing in the pantry is alphabet soup, so you prepare some in the microwave. As you’re carrying it to the kitchen table, you look down and see this:
Is this information or a random data event? You can clearly see the word PONY in there, but the letters are all jumbled around amongst other letters, and maybe you’re just noticing that particular word because you’ve been writing about ponies so much on your brony blog lately and the flu is making you hallucinate. So it’s probably just a random coincidence. Just eat your soup and go take a nap, right? But what if you looked down and saw this?
Intuitively, you recognize that this is not a random data event. You could freak out and yell “HOW DO IT KNOW?!?” until you pass out in the kitchen and wake up hours later in a puddle of cold soup. But that wouldn’t help you recover from the flu, so just take some deep breaths and analyze the situation. You might ponder some questions, like “What’s the probability of this happening?” or “Is my soup really talking to me?” or “Maybe I should lay off the pony jokes?”
If you ask that first question, you’re on the right track (and if you ask the last question, the answer is “NEVER! I’LL DIE FIRST!”) because probability and complexity are closely related mathematical concepts, and one of the two distinguishing characteristics of information is that it is complex ….
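To get a rough feel for that first question, here is a back-of-the-envelope Python sketch. It rests on a big simplifying assumption that is mine, not the soup company's: every noodle is one of 26 letters, each equally likely and independent of its neighbors. The `chance_of_exact_sequence` function and the two sample phrases are likewise just stand-ins for illustration.

```python
from fractions import Fraction

ALPHABET_SIZE = 26  # assumption: 26 equally likely letters, no digits or blanks


def chance_of_exact_sequence(text):
    """Probability that random letters spell `text` exactly, in order.

    Assumes each letter is drawn independently and uniformly, so the
    chance is (1/26) raised to the number of letters.
    """
    n_letters = sum(1 for ch in text if ch.isalpha())
    return Fraction(1, ALPHABET_SIZE) ** n_letters


for phrase in ("PONY", "SILKY PONY HAIR"):
    p = chance_of_exact_sequence(phrase)
    print(f"{phrase}: 1 in {p.denominator:,}")
```

Under this model a four-letter word at a given spot is a 1 in 456,976 shot, and every extra letter multiplies the odds against by another 26. The longer the exact message, the harder it is to blame on chance.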