How it know, part 2/7 ….

Let’s recap:

1. Elves watch what we do so they can better sell us things online.
2. Probability and complexity are closely related concepts.
3. Sometimes it seems like our food is trying to talk to us.
4. Information differs from random data in part because it’s more complex than random data.

(Also, Leibniz was a masculine manly man’s man … with shiny, full-bodied pony hair. And you can go ahead and admit that you soooo want to brush it right now. I won’t judge you.)

Moving on: Andy and I were watching our neighbors’ house, and they had given us the 6-digit access code to their garage door. Because we are such diligent and caring neighbors, we immediately forgot the code (I mean, kept the code in a super-safe place), and Andy wondered in passing how feasible it would be for a burglar to randomly guess the code. I explained that there were 1,000,000 possible access codes (digits 0-9 in each of 6 positions = 10^6 possible codes). The odds of a potential burglar randomly guessing the access code on the first attempt are 1×10^-6, or 0.000001, which are very remote.

We then wondered how long it would take to arrive at the correct code by just entering 000000, 000001, 000002, etc., until the door unlocked, which is what’s known in computer security as a “brute force” hack. I can’t do that kind of math in my head, but I later determined (okay, looked it up on the Internet) that if the potential burglar had one second to enter each digit, one second to press the Enter key, and one second to evaluate whether the combination was successful, it would take up to about 93 days of non-stop data entry to stumble on the correct combination: 8 seconds per attempt × 1,000,000 attempts = 8,000,000 seconds, and 8×10^6/(60×60×24) ≈ 92.6 days. Our neighbors would only be away for one week, during which time someone would surely notice a terribly sleep-deprived burglar trying to enter illegally through the garage. We concluded that our neighbors’ property (that is, their vast collection of pony brushes) was therefore safe.
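
(If you’d rather make a computer do that arithmetic, here’s a minimal Python sketch of the same back-of-the-envelope calculation. The numbers baked into it, 6 digits, 10 symbols, and 8 seconds per attempt, are just the assumptions from the paragraph above.)

```python
# Back-of-the-envelope check of the garage-door math.
# Assumptions from the story: 6-digit code, symbols 0-9, and
# 8 seconds per attempt (6 to type the digits, 1 for Enter, 1 to check).
digits = 6
symbols = 10
seconds_per_attempt = 6 + 1 + 1

possible_codes = symbols ** digits            # 10**6 = 1,000,000
p_first_guess = 1 / possible_codes            # 1e-06

worst_case_seconds = possible_codes * seconds_per_attempt
worst_case_days = worst_case_seconds / (60 * 60 * 24)

print(possible_codes)                # 1000000
print(p_first_guess)                 # 1e-06
print(round(worst_case_days, 3))     # 92.593
```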

A walking 17th-century shampoo commercial: “Don’t hate me because I’m beautiful ….”

The length of the garage access code and the set of symbols to choose from are matters of complexity; the odds of guessing that access code at random are a matter of probability. But these are like two sides of the same coin: The high complexity of the access code is also an indication of the low probability that it can be randomly guessed, and the high complexity of the soup message about Leibniz’s magnificent shimmering pony hair is also an indication of the low probability of its occurring at random. (In fact, they’re not just indications of each other; the two measures are directly mathematically related, but I won’t go into detail about that here because that’s a whole other discussion, and also because I’m way too lazy.)
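
(For the curious, the standard information-theoretic way to connect the two measures is to express complexity in bits as the negative base-2 logarithm of the probability. Here’s a tiny Python illustration using the garage code from above; the framing is mine, not a formal treatment.)

```python
import math

# Standard information-theoretic bridge between the two measures:
# complexity in bits = -log2(probability of the event).
def bits(probability):
    return -math.log2(probability)

p_garage_code = 1 / 10**6              # guessing the 6-digit code on the first try
print(round(bits(p_garage_code), 2))   # 19.93 bits: low probability, high complexity
```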

It turns out that complexity, while being a necessary condition for distinguishing information from random data, is not sufficient to do so. To demonstrate why, pretend you’re merry-go-lucky Gottfried Leibniz, and you have a decision to make: You slept in too late this morning, and now you have a bad case of bed-head, but you need to update your notes so that Isaac Newton can’t convince everyone that he invented differential calculus before you did. So you decide to toss a coin. If it lands heads-up, you tend to your magnificent pony hair; if it lands tails-up, you update your notes and stay away from the mirror. You’re assuming that the chance of the coin landing heads-up is 50% (1 in 2) because there are only two possible outcomes: heads or tails (H or T). You flip the coin, and it lands tails-up. Crestfallen, you decide to give the coin a chance to reconsider. You flip the coin again, and it lands heads-up. Now you’re encouraged, so you decide to flip again for the best of three. Bad news: it’s tails. So you flip again ….

Consider that third coin flip. What’s the probability that three consecutive coin flips would result in THT (tails/heads/tails)? Again, it’s based on the number of possible outcomes, and there are eight: HHH, HHT, HTH, HTT, THH, THT, TTH, and TTT. So the probability of THT is 1/8. You can greatly simplify the math here by noting that each coin toss has a probability of 1/2 and multiplying those probabilities together: (1/2)^3 = 1/8. But what you really want to do is brush your shiny pony curls, so after 20 coin tosses (result: THTHTTTHHHTTHHTTHTHH) you give in to vanity.
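
(If you don’t feel like writing out all eight outcomes by hand, here’s a short Python check that enumerates them and confirms both the 1/8 figure and the (1/2)^3 shortcut.)

```python
from itertools import product

# List every possible outcome of three coin tosses and count the THTs.
outcomes = [''.join(flips) for flips in product('HT', repeat=3)]
print(outcomes)       # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(len(outcomes))  # 8

print(outcomes.count('THT') / len(outcomes))  # 0.125, i.e. 1/8
print((1 / 2) ** 3)                           # 0.125, the shortcut agrees
```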

The probability of 20 coin tosses giving an outcome sequence of THTHTTTHHHTTHHTTHTHH is (1/2)^20, or 1/1,048,576. That roughly corresponds to the probability of guessing our neighbors’ garage-door access code on the first try, and also to the complexity of that access code. The point here is that random data events can have high complexity (even though chance is not the causal agent in a coin toss; it’s actually a mind connected to a thumb, but that’s a distinction I won’t pursue here for the sake of brevity, and because I’m still too lazy). Some additional criterion is needed to fully distinguish information from a random data event, and that criterion is specification ….
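
(And for the record, here are the two probabilities side by side, computed with exact fractions in Python; “roughly corresponds” means they differ by about 5% here.)

```python
from fractions import Fraction

# One specific 20-toss sequence vs. guessing the neighbors' 6-digit
# garage code on the first try.
p_toss_sequence = Fraction(1, 2) ** 20
p_garage_code = Fraction(1, 10) ** 6

print(p_toss_sequence)          # 1/1048576
print(p_garage_code)            # 1/1000000
print(float(p_toss_sequence))   # ~9.54e-07
print(float(p_garage_code))     # 1e-06
```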
