As of June 12, 2009, analog television transmissions will be a thing of the past, replaced by digital broadcast television (DTV). (Some stations have begun the transition earlier.) This transition will allow more efficient use of the broadcast spectrum, permitting more kinds of simultaneous uses with higher average quality.
As all of this is going on, I find myself pondering the problem facing the SETI project as it listens to the skies for signs of extraterrestrial life.
The ordinary assumption is that once a civilization reaches a certain level of sophistication, it begins to pump out signals that would identify the civilization as technologically capable. The popular movie Contact brings this concept to life by showing how television broadcasts from the Earth might be received by other civilizations and responded to.
But the odds may be much worse than conventionally acknowledged because if other civilizations follow our lead, immediately after the development of transmission capabilities, they'll probably move from analog to digital, and probably compress that digital signal in the process or shortly thereafter. You might think that a compressed signal would be harder to notice, and you'd be right. Obviously, if there's less data being transmitted, you have to look harder for it. But it's worse than that.
The way compression works is that it removes redundant information. For example, let's consider a simple picture:

One way to represent this picture is to divide it into a rectangular set of dots and to call out the colors of each moving from left to right, top to bottom. This is how the BMP file format works.

So, absent compression, this picture would be stored as WHITE, WHITE, WHITE, WHITE, ... for a long way because the first eight lines of 52 dots are white. Then finally there is a line that has some blue in it, starting with the eighth dot: WHITE, WHITE, WHITE, WHITE, WHITE, WHITE, WHITE, BLUE, BLUE, BLUE, ... And after 11 blue dots, it goes back to white again. When we apply compression, we use techniques that consolidate things that don't need to be said multiple times. There's quite a lot of variation in that, but a simple compression scheme might just consolidate series where the same thing occurs repeatedly. For example, we might store this in the file: 423*WHITE, 11*BLUE, etc. (meaning 423 occurrences in a row of WHITE, which is the 416 dots of the first eight lines plus the 7 that open the ninth, followed by 11 occurrences in a row of BLUE, and so on). You can see how it would take a lot less room to write the data this way.
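The run-consolidating scheme just described is usually called run-length encoding. Here is a minimal sketch of it in Python, assuming the picture is a flat list of color names read left to right, top to bottom. The function names and the 52-dot row width taken from the example are illustrative choices, not part of any real file format.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical colors into a (count, color) pair."""
    return [(len(list(run)), color) for color, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (count, color) pairs back into the flat list of colors."""
    return [color for count, color in pairs for _ in range(count)]

WIDTH = 52  # dots per line, as in the example picture

# Eight all-white lines, then a line whose eighth dot begins a run of 11 blue dots.
pixels = (["WHITE"] * (8 * WIDTH)
          + ["WHITE"] * 7 + ["BLUE"] * 11 + ["WHITE"] * (WIDTH - 18))

encoded = rle_encode(pixels)
print(encoded)  # [(423, 'WHITE'), (11, 'BLUE'), (34, 'WHITE')]
```

Nine lines of dots (468 values) shrink to three pairs, and `rle_decode(encoded)` gives back exactly the original list.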
In fact, compression techniques can be much more clever than this. They can, for example, notice small or intricate patterns that occur a lot and store whole dictionaries of them in the file. So a file describing a more complicated picture than the one shown above might contain an encoding like: “Define PATTERN1 as WHITE, BLACK, 2*GREEN, BLUE. Define PATTERN2 as 5*RED, 2*PATTERN1, 3*BLUE, WHITE. Now do 23*PATTERN1, 37*WHITE, 14*PATTERN2, 5*PATTERN1.” (This kind of sophisticated compression is sort of like what the GIF file format does.)
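To make the dictionary idea concrete, here is a small sketch of an unpacker for the toy notation in the paragraph above. It is not the actual GIF algorithm (GIF uses LZW, which builds its dictionary on the fly); the `expand` function and the `(count, name)` token layout are assumptions made for illustration only.

```python
def expand(tokens, patterns):
    """Recursively expand (count, name) tokens into a flat list of colors.

    A name found in `patterns` is replaced by its own expansion;
    any other name is treated as a plain color.
    """
    out = []
    for count, name in tokens:
        if name in patterns:
            unit = expand(patterns[name], patterns)
        else:
            unit = [name]
        out.extend(unit * count)
    return out

# The two definitions and the body from the example encoding.
patterns = {
    "PATTERN1": [(1, "WHITE"), (1, "BLACK"), (2, "GREEN"), (1, "BLUE")],
    "PATTERN2": [(5, "RED"), (2, "PATTERN1"), (3, "BLUE"), (1, "WHITE")],
}
body = [(23, "PATTERN1"), (37, "WHITE"), (14, "PATTERN2"), (5, "PATTERN1")]

pixels = expand(body, patterns)
print(len(pixels))  # 443
```

Thirteen short tokens (nine in the definitions, four in the body) unpack to 443 dots. Note that the unpacker only works because it knows the notation in advance, which is the point the rest of the post turns on.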
It's important to note, too, that human beings rely heavily on redundancy in order to recover from various kinds of difficulties they encounter. For example, if you almost hear something someone says, you can often pretty much fill in the details you miss from context. If there's a smudge on a document, you can usually figure out what was obscured from what was nearby. Redundant context helps you do that by essentially providing the same information more than once. For example, if you saw: “WHITE, WHITE, WHITE, ???, WHITE, WHITE, WHITE, WHITE, BLUE, BLUE, BLUE, BLUE, WHITE, WHITE, WHITE, ...” you would probably conclude that the missing element was “WHITE,” but if you saw “8*???, 4*BLUE, 3*WHITE” it's harder to make a guess since the adjacent elements are probably not a predictor of the color of the missing element.
Extreme compression, by contrast, seeks at all costs to avoid wasteful redundancy and instead goes for only the most concise way of expressing a pattern. In doing so, each detail of the description will be unlike the item next to it. That's because if the next thing in the file could be predicted from the previous, it would be an opportunity for further compression. So the process of compression is repeated until there is no way to predict the next thing from the previous. In effect, the contents of the file will be such that “each bit is a surprise.” This is, in effect, the very definition of randomness.
And randomness is exactly what the SETI folks fear. They assume they are looking for patterns. But since sophisticated data compression reveals no patterns unless carefully unpacked, there will be no patterns to be seen. And so the SETI people could be seeing the compressed data transmissions of alien television broadcasts and not even know it.
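The claim that well-compressed data looks random can be checked crudely on any computer. The sketch below measures how evenly the 256 possible byte values are used (Shannon entropy, in bits per byte, where 8.0 means perfectly even) before and after compression with Python's standard `zlib` module. The word list and the seeded random text are stand-ins invented for the example, and byte frequency is only a rough proxy for "each bit is a surprise," not a rigorous randomness test.

```python
import math
import random
import zlib
from collections import Counter

def bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution (0.0 to 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Stand-in "broadcast": 20,000 words drawn from a tiny vocabulary (assumed data).
rng = random.Random(0)
words = ["white", "blue", "signal", "pattern", "noise", "the", "of", "and"]
raw = " ".join(rng.choices(words, k=20000)).encode("ascii")

packed = zlib.compress(raw, 9)

print(len(raw), round(bits_per_byte(raw), 2))
print(len(packed), round(bits_per_byte(packed), 2))
```

The uncompressed text uses only a handful of byte values, so its figure is low; the compressed stream is much shorter and spreads far more evenly over all byte values. Only `zlib.decompress(packed)`, run by someone who knows the format, brings the pattern back.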
In my mind, all of this boils down to a simple but sad truth: The SETI people are probably up against an even worse problem than they thought because they aren't just trying to detect a rare extraterrestrial civilization beyond a certain point in its evolution (which might imply many thousands or millions of years). Rather, they are trying to detect such a rare civilization during a few decades of its existence, after it discovers radio or TV, and before it discovers compression—a much tighter time window, making it much more likely that we could have already missed our chance, or that if we merely blink at some point in the future, we'll have missed our chance.
Comments
The whole thing both thrills and terrifies me. Which usually makes me think we have to keep at it, despite the possibility of it never working.
But I thought I once read that the SETI group’s ‘listening’ algorithms were tweaked in an attempt to uncover purposefully compressed signals, no?
Rob, thanks—and thanks for reading a preview draft for me. I added some extra explanatory text about redundancy in response to your early comments. :)
Odette, yes, it's true that they might also try transmitting, and if they do that they might pick a better code. It's a curious question what the odds are that they'd do outreach like that. It might depend on the politics and economy/budget of such a civilization.
I know these sorts of communications ideas are hard to explain, because I've tried (and sometimes failed) myself, and I've seen professional story-tellers who should know better make basic mistakes. For example, Jack Vance, one of my favorite science fiction writers, describes a little communication device that compresses a message and sends it in a very short burst, to prevent interception--without realizing that without previous arrangements, it becomes equally hard for the intended recipient to realize that there's an incoming message.
One thing that helps, when I'm going over some very introductory issues in my classes, is to contrast tallies with Arabic numerals for tasks such as counting, comparison, addition, subtraction, and so forth. It's surprising how effective tallies can be under some assumptions (e.g., addition is just concatenation), but of course they break down when numbers get too big. Hmm, a position-based number system like Arabic numerals seems to provide a useful amount of compression...
Cool stuff.
I just threw this out for fun as a weekend thought exercise, and perhaps should have disclaimed which are and are not my specific areas of expertise. (Even digital signals presumably have to get an analog foothold in order to build up the digital abstraction, and maybe that's all the SETI people care about.)
But the general thing that I think is still true regardless is that we often make lists of all the things that could go wrong in our assumptions about what an alien civilization might or might not do, but it seems easy to have overlooked something major like this (even if not this specifically) that would throw our estimate of the probabilities way off... usually in the direction of making the problem harder, I fear, though maybe I'm overlooking something. :)
I was going to point out, but didn't want to get too longwinded, that with a couple hundred thousand words in English, we could get away with all words having four or fewer letters if we didn't mind the lack of redundancy. Allowing words to be longer leads to easier pronunciation and easier ability to notice typos; when the space gets too densely packed, it's hard to do that.
(In fact, spell checkers used to be hard to have in computers because of the space they took up, but there was a fascinating one that was used in the early '80s at Yale when I was there--I'm not sure who wrote it. It took every valid English word in a 50,000 word dictionary and computed a hash code in the range of 1 to 100,000, setting a bit to 1 if the word is present and 0 if it's not. Then if you take a given word and hash it and find a 1, it might be a valid word or it might not, since the hash loses information. But if it's a 0, it's definitely misspelled. This can be implemented in about 100K bits, or roughly 12.5K bytes, and the program itself is of negligible size. It was a brilliant use of compression to get a heuristic spell check that could reliably tell you certain words were misspelled even though it would miss other words. That funny little type-in mode on cell phones that just guesses as you type digits works on a similar theory and is also a cool use of compression.)
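The hashed-bitmap spell checker described in the comment above can be sketched in a few lines. This is a reconstruction from the description only: the original program's hash function is unknown, so SHA-256 from Python's `hashlib` is used as a stand-in, slots run from 0 to 99,999 rather than 1 to 100,000, and the four-word dictionary is a placeholder for the real 50,000-word list.

```python
import hashlib

NUM_BITS = 100_000  # one bit per hash slot, as in the description

def slot(word: str) -> int:
    """Map a word to a slot in [0, NUM_BITS). SHA-256 is an assumed stand-in hash."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BITS

def build_bitmap(words) -> bytearray:
    """Set the bit for every dictionary word."""
    bitmap = bytearray((NUM_BITS + 7) // 8)
    for word in words:
        i = slot(word)
        bitmap[i // 8] |= 1 << (i % 8)
    return bitmap

def maybe_valid(bitmap: bytearray, word: str) -> bool:
    """False means definitely not in the dictionary; True means only 'possibly'."""
    i = slot(word)
    return bool(bitmap[i // 8] & (1 << (i % 8)))

dictionary = ["white", "blue", "signal", "pattern"]  # placeholder word list
bitmap = build_bitmap(dictionary)

print(len(bitmap))                    # 12500 bytes for 100,000 bits
print(maybe_valid(bitmap, "signal"))  # True: every dictionary word hits a set bit
print(maybe_valid(bitmap, "signl"))   # a 0 bit here would prove a misspelling
```

With a full 50,000-word dictionary and a well-mixing hash, roughly 39% of the bits would end up set (about 1 - e^-0.5), so a similar share of misspellings would slip through as false "possibly valid" answers. That is the price of fitting the whole dictionary into 12,500 bytes.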
Oddly, what this: "Now do 23*PATTERN1, 37*WHITE, 14*PATTERN2, 5*PATTERN1" reminds me of is crochet patterns. They operate in a similar way.
So -- now that we're going all to digital, are we continuing any kind of sending of analog-type signals into space? Is it too much to hope that any civilization that does make a switch from analog to digital would also consider these problems of transmission and therefore change their sending (and looking) habits?
Thanks for this post Kent, it was interesting to think about!
Stewie, glad I could make the basic concept intelligible. I was a little worried people would just glaze over. There's a bit of art in the compression programs for finding the patterns to use, but the actual storage and unpacking works almost exactly like I described. It's remarkably simple and elegant, actually.
http://open.salon.com/blog/catamitebastard/2008/12/31/so_jerk_you_finally_figured_it_out
And this piece, currently in Scientific American, is interesting on the possibility that there is a vast amount of life out there:
http://www.sciam.com/article.cfm?id=habitable-planets-crowded-universe
Fortunately there is hope that if they have been trying to contact us, we have simply not been able to detect them. In 1992 it was discovered that photons have orbital angular momentum (independent of the angular momentum in polarization). This allows an advanced civilization to pack much more information into its EM signal. It's possible that the radio spectrum is full of easily readable hello messages we did not notice. We just need to build telescopes that can read orbital angular momentum.
http://en.wikipedia.org/wiki/Turbo_coding
http://en.wikipedia.org/wiki/Spread_spectrum
http://www.intuitor.com/statistics/CellPhones.html
http://www.physics.gla.ac.uk/Optics/play/photonOAM/
I wasn't familiar with the Turbo encoding, though; that looks quite interesting. It's late, so I'll have to read up on the photon stuff another night. Anyway, thanks for the cross-references. :)
Seriously, still loving this post mostly because of what it tells us about people in this world.
Do love the SETI stuff though. Even if there is no chance of success, it's one of those very few things actually worth the effort. Just in case.
It's not a matter of receiving signals, it's a matter of taking already-received recordings of the sky and having your computer sift them looking to see if maybe there's a signal in there. I'm pretty sure you can still do it, though my household's computers fell off the grid when they switched over to the newer technology some years back so I'm a little out of touch.
Much as I love SETI as a concept, there are also similar webs of computers working on other big problems, many of which need results more urgently. See BOINC where you can sign up to have your computer involved in curing diseases or modeling global climate change. Those efforts are, I suspect, much more likely to be good uses of your computer's resources.
It's not just the intelligences that we see as closest to our own (elephants, whales, squid), but all intelligence (spiders, worms, amoeba). Like this post on hermit crabs: http://open.salon.com/blog/somyr_perry/2009/02/26/my_life_with_crabs
I had a hell of a time convincing my parents that my hermit crab had a personality and actually engaged with me. They thought I was anthropomorphizing. My rats were the same way, but anyone who's had rats knows how loving and bright they are. I knew a man (friend of my dad's) who had a spider that refused to eat after he had to put her in the zoo when he was transferred to a different country. She died from starvation/grief, and they were feeding her the same food, and she was living in the same circumstances, except for his presence. We have life forms that share much of our DNA (and I'm guessing that because of that we will share more features that allow for more cross connections, or parallels, to be made) right here and we are losing species right and left because we are still competing with them for resources. I wonder if there is any other group, like SETI, that looks at connecting/decoding the intelligences we already have met?
Signing up for http://spin.fh-bielefeld.de/ right now, never heard of them before, only of Seti@home. Thanks for connecting us to these resources!
Kent, I wonder how much we'd have to anthropomorphize a higher/alien life form in order to understand it? It probably is inevitable in either direction. I only know that the man that told the story was utterly convinced of his spider's emotions. :) That probably means squat, but I got told this story as a 7-year-old, so it's always been there humming in the background. I feel the same way about trees (laugh), and planets; I think planets are alive, too. Just ignore me :D sometimes my whimsy dictates my belief systems more than my logic, 'cause I do give it fairly free rein. In my mind, more complex does not mean more important, but I've been wrong about more things than I've been right. ;)
no, I absolutely agree with that part- I've seen human bodies shut down, there is nothing magical about consciousness to me