
Google and the Law and ... um ... privacy

I see that Google has posted its official response to the Department of Justice's request for a "million URLs". You can read it here. They argue three main points: that the data as requested is useless, that it may expose Google trade secrets, and that it would be too much work for Google to supply a million "random" URLs. Privacy concerns and a "chilling effect" are thrown in for good measure, but it appears to me that these are more in support of Google's business model, which is based on the perception of user privacy. While I remain unconvinced that there is not some bit of evil lurking in the heart of Google, this is generally a good thing.

I'm most interested in writing about it from an analytical point of view. It is nice to see a case where "data" is held out as NOT being the answer to a question. This is not to say that data is useless (no one would argue that), but it is a clear statement that a particular set of data may not be suited to a particular purpose. In this case, crafting a law based on search results just seems to be a bad idea. Here is what they say about it:

"First, the Government's presentation falls woefully short of demonstrating that the requested information will lead to admissible evidence. This burden is unquestionably the Government's. Rather than meet it, the Government concedes that Google's search queries and URLs are not evidence to be used at trial at all. Instead, the Government says, the data will be "useful" to its purported expert in developing some theory to support the Government's notion that a law banning materials that are harmful to minors on the Internet will be more effective than a technology filter in eliminating it.
Google is, of course, concerned about the availability of materials harmful to minors on the Internet, but that shared concern does not render the Government's request acceptable or relevant. In truth, the data demanded tells the Government absolutely nothing about either filters or the effectiveness of laws. Nor will the data tell the Government whether a given search would return any particular URL. Nor will the URL returned, by its name alone, tell the Government whether that URL was a site that contained material harmful to minors."

Earlier you may have caught that I feel the privacy thing is gratuitous and perhaps a bit disingenuous. To see why I believe this, here is a sample entry from my logs today:

xx.xxx.137.74 - - [22/Feb/2006:10:25:50 -0500] "GET /blog/archives/pmi-and-pmp/pmp-exam-cheats.html HTTP/1.1" 200 13096 zo-d.com "http://www.google.co.uk/search?hl=en&q=PMP+cheat+test+answers" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; {0EF9B069-A48C-18A5-1EEF-88AC09646F5E}; .NET CLR 1.0.3705; .NET CLR 1.1.4322)"

I x'ed out the IP address, but this is typical of what shows up in my logs when someone arrives here from a Google search. The first part is the IP address, then the date, the page they requested here, the HTTP response code and size, the host name, and then the referring URL, which contains the search itself. This is followed by browser identification. A simple lookup of the IP address shows that it comes from inside one of the big computer manufacturing companies. The fact that Google passes along the search terms when it refers a user to my site is great for me. I use it to understand what people are looking for when they arrive here, and occasionally I write things that respond to those sorts of requests. But since the user's IP address arrives in the same log entry as the search terms, it is not particularly private.
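For anyone who wants to pull these pieces out of their own logs, here is a rough sketch in Python. It assumes every entry follows the same field order as my sample line above, which may not hold for other server configurations. The names LOG_PATTERN and parse_entry are just illustrative, and the q parameter is simply what the referring URL in the sample uses for the search terms.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Field order assumed from the single sample entry above:
# ip, two unused fields, [date], "request", status, size, host, "referrer", "browser"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'(?P<host>\S+) '
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Split one log line into named fields; return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The search terms ride along in the query string of the referring URL.
    query = parse_qs(urlsplit(entry["referrer"]).query)
    entry["search_terms"] = query.get("q", [None])[0]
    return entry


SAMPLE = (
    'xx.xxx.137.74 - - [22/Feb/2006:10:25:50 -0500] '
    '"GET /blog/archives/pmi-and-pmp/pmp-exam-cheats.html HTTP/1.1" '
    '200 13096 zo-d.com '
    '"http://www.google.co.uk/search?hl=en&q=PMP+cheat+test+answers" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; '
    '{0EF9B069-A48C-18A5-1EEF-88AC09646F5E}; .NET CLR 1.0.3705; '
    '.NET CLR 1.1.4322)"'
)

entry = parse_entry(SAMPLE)
print(entry["ip"], "searched for:", entry["search_terms"])
```

The point of the sketch is that the address and the search terms come out of the same line, so pairing them up takes no effort at all.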

Most people ending up here are looking for things they don't need to keep to themselves, but if my content were a bit more shady, I can imagine that I'd be getting a lot of information from Google about the dark side, including where that person is on the internet. This sort of information is not what the government should be using to fish for new ways to make laws, but it is hardly the hallmark of privacy protection.



This weblog is licensed under a Creative Commons License.