tsujigiri

The editorial comments of Chris and James, covering the news, science, religion, politics and culture.

"I'd take the awe of understanding over the awe of ignorance any day." -Douglas Adams

Thursday, June 12, 2003

Prediction: The Bush Administration will find weapons of mass destruction in Iraq, probably in a bunker or hole or late model Benz. Neo-cons and right-wing pundits will scream victory and, in stern, damning tones, call for everyone who ever doubted Bush to hurl themselves in front of the nearest train. Then, it will be discovered, possibly by a reporter at U.S. News and World Report or the Times or the Washington Post or by Seymour Hersh in an expose in The New Yorker that, in no uncertain terms, and without equivocation, the found weapons had been manufactured and planted by Bush administration officials, with the full knowledge and consent of Bush himself. But this information will receive limited coverage in the major TV outlets. Tom Daschle will say he is disappointed, but won't comment any further. Matt Drudge will focus on a peripheral story, perhaps about how one of the most culpable officials is openly critical of Hillary Clinton's new book. The major progressive news outlets and organizations, like Mother Jones and The Nation and the Green Party will yell as loud as they can, trying to get people to pay attention. They will probably also ask for a formal investigation by the Justice Department, and demand impeachment proceedings. And then the conservative media will kick in. Bill O'Reilly will denounce the "pro-whiners" as "insane" and ask "what more could they possibly want? WE FOUND THE WMDS! They're right there!" Ann Coulter will get her hair done and exclaim, "How dare these maniacs question the president after the weapons have been found? I'll tell you how they dare: they're pinko atheists who still haven't gotten over the thrashing Bush gave Gore Loserman in the 2000 election. They should be committed. I will personally give one hundred dollars to anyone who has their local Bush-bashing commie committed to a mental institution." 
Sean Hannity will wonder how "Trotskyites" can still operate out in the open "like they obviously do in the Green Party," and suggest that all those calling for an investigation "spend one evening with the family of a marine who was killed in Iraq protecting our country and helping to find those WMDs that they just can't seem to admit exist, even when they're staring them in the face." Rush Limbaugh will say, "My friends, isn't, isn't this incredible? Are you as outraged as I am that these people have the unbelievable gall to suggest that these WMDs don't really exist? I'm not the biggest O'Reilly watcher, but I have to admit that he was right when he asked, 'What more could they possibly want?' Do they want rogue terrorist groups to steal these WMDs and use them on an American city? Would that convince these traitors of the existence of the weapons? My friends, you and I both know the answer: when you're dealing with libb-ur-uhls, the answer is always, 'No'." And, gradually, people... will... just... forget... Nothing will happen. Noam Chomsky might write a book about it. Howard Zinn will write another chapter to People's History about it. Ralph Nader will point out that the reason nothing was done is that nothing was done about Iran Contra, and that this was because nothing substantially punitive was done about Watergate. And Bush will smile. Remember, the key to politics is that, basically, no one cares.

Tuesday, June 10, 2003

Months ago, I started this blog on a whim because I was annoyed with intelligent design creationism and wanted to vent. I also wanted to talk about science-related things, especially information theory and evolution. It seems that the most popular theme on Corpse Divine right now is music (as evidence: the lively discussion of Blur vs Radiohead, compared to the tumbleweeds blowing on other posts). In this post, I plan to address all of those topics, which will either make this a really cool post or another obnoxiously over-extended one.

In this month's issue of Scientific American, there's an article about tracing the evolutionary history of chain letters. The authors collected a few dozen chain letters spanning several years, and then analyzed them using what they call a relatedness measure. By measuring the relatedness of each pair of chain letters, they were able to arrange the letters into groups of common ancestry. They were also able to infer which were the oldest versions on the basis of this measure. The same method was also used to identify the common ancestry of different mammals by measuring the relatedness of their genomes. I thought it was pretty cool. The method has been extended to things like detecting plagiarism and detecting spam email.

Their method works like this (you can skip this part if the math makes you sleepy): let X and Y be two files whose relatedness we want to measure. The complexity of X, written K(X), is roughly the size of the file X after it has been compressed with a good compression algorithm (like zip). The joint complexity of X and Y, written K(XY), is the size of X and Y when they have been compressed together as a single file. The relatedness R is
R = {K(X) + K(Y) - K(XY)} / K(XY)
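For anyone who would rather read that as code, here is a minimal Python sketch of the formula. It assumes Python's zlib module as the stand-in compressor (an assumption for illustration; the article's authors may well have used something else), and the names compressed_size and relatedness are made up for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Approximate K(X) by the length of the zlib-compressed data.

    zlib is only a stand-in for "a good compression algorithm".
    """
    return len(zlib.compress(data, 9))


def relatedness(x: bytes, y: bytes) -> float:
    """R = (K(X) + K(Y) - K(XY)) / K(XY), with XY = X followed by Y."""
    kx = compressed_size(x)
    ky = compressed_size(y)
    kxy = compressed_size(x + y)  # "compressed together as a single file"
    return (kx + ky - kxy) / kxy


if __name__ == "__main__":
    a = b"Send this letter to ten friends or bad luck will follow you. " * 20
    b = b"Send this letter to five friends or good luck will leave you. " * 20
    print(relatedness(a, a), relatedness(a, b))
```

One caveat about this sketch: zlib only looks back about 32 KB when hunting for repeats, so on larger inputs it will miss long-range similarities and the scores will come out lower than the formula suggests.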
When X and Y are totally different, K(XY) = K(X) + K(Y), so R=0. When X and Y are identical files, then (approximately) K(XY) = K(X) = K(Y), so R=1. This is because a good compression algorithm works by finding repetitive patterns and reducing them to simpler patterns. For example, if X is a long document and Y=X, then to create the combined document XY I only need to write down X, followed by an instruction to repeat everything. So the compressed size of XX is roughly the same as the compressed size of X.

Anyway, here's what I did with it. I wrote my own program to scan through a folder and measure the relatedness of all the files in it. I used it to trace the history of revisions to a bunch of programs I wrote a few months ago. The results made it easy to spot the places where I had branched from one approach to another. I could also tell when I had eliminated a file by dividing its functions into other files -- the other files all had a 20% relatedness to the original file. I could also instantly spot the version in which I updated the coding style of an important module. It was an afternoon of unadulterated geek excitement.

Then I thought, I wonder if there is any way to apply this to music or photos? My program used gzip to do the compression. This is a very inappropriate method for music files, but I thought I'd give it a shot just to see what happened. Recalling the lawsuit brought by Wire against Elastica, I decided to compare Elastica's "Connection" with Wire's "Three Girl Rhumba". To make the study scientific, I collected a bunch of other songs, including other Elastica and Wire songs, some Ween, Frank Sinatra, Ben Folds Five, etc. As expected, the results were a bit counterintuitive. "Three Girl Rhumba" did not compare well with "Connection." Some songs which did rate highly with "Connection":
  • Various other Wire songs besides "Three Girl Rhumba."
  • The Gourds, "Gin and Juice" (a bluegrass rendition of a hip-hop song).
  • Far and away the highest-rated song when judged against "Connection" was Milli Vanilli's "Blame It on the Rain".
Grounds for another lawsuit? I think so.

So the method, in its current form, doesn't work well on music files (and is pretty much guaranteed not to work on photos either). But it works extremely well on text files, and on information which can be expressed as a string of letters such as DNA.

The intelligent design people (and their variously named predecessors) love to claim that "intelligent design" is something which can be inferred. Their methods are always vague and their arguments bogus. Relatedness measures provide precise, well-defined scientific tools which can be used to infer common ancestry on the basis of information theory alone. This method allows us to actually measure evolution. In so doing, it provides an elegant demolition of intelligent design theories.
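P.S. For the curious, a folder scan like the one described earlier in this post could look roughly like the sketch below. This is not the original program, just an illustration under a few assumptions: Python's zlib module stands in for gzip, every regular file in the folder is read whole into memory, and the names compressed_size, relatedness and scan_folder are invented for the example.

```python
import itertools
import zlib
from pathlib import Path


def compressed_size(data: bytes) -> int:
    """Approximate K(X) by the zlib-compressed length (a stand-in for gzip)."""
    return len(zlib.compress(data, 9))


def relatedness(x: bytes, y: bytes) -> float:
    """R = (K(X) + K(Y) - K(XY)) / K(XY)."""
    kxy = compressed_size(x + y)
    return (compressed_size(x) + compressed_size(y) - kxy) / kxy


def scan_folder(folder):
    """Score every pair of files in `folder`, most related pair first.

    Returns a list of (score, name_a, name_b) tuples.
    """
    files = {
        p.name: p.read_bytes()
        for p in sorted(Path(folder).iterdir())
        if p.is_file()
    }
    scores = [
        (relatedness(data_a, data_b), name_a, name_b)
        for (name_a, data_a), (name_b, data_b)
        in itertools.combinations(files.items(), 2)
    ]
    scores.sort(reverse=True)
    return scores


if __name__ == "__main__":
    # "." is just a placeholder; point it at any folder of text files.
    for score, name_a, name_b in scan_folder("."):
        print(f"{score:6.3f}  {name_a}  {name_b}")
```

Sorting the pairs by score is what makes the branch points easy to spot: successive revisions of the same file cluster near the top, and unrelated files sink to the bottom. The number of pairs grows with the square of the number of files, so this is best kept to folders of modest size.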