Saturday, July 7, 2007

Google's Data Retention Policy Under Scrutiny and Part of a Contradiction

It is not only Google: the data retention practices of all the big search engine companies (Yahoo, Ask, etc.) are now being reviewed. The scrutiny is coming mainly from Europe, with Spain being the latest to announce an investigation.

Generically, data retention is the storing of information about communication sessions. This amounts to call data records in the telephony world and transaction logs (from routers/switches) in the IP world, with the possibility of also storing things like URLs and email headers. The point of these policies is to be able to determine, after the fact, who was communicating with whom and what sites were being visited. Data retention policies do not include storage of the actual content of the communication or of the information that was viewed/retrieved from a website, but they do store this session information for all subscribers, not just those under investigation. This type of information proved to be very useful in investigating incidents like the Madrid train bombings and the London Underground bombings.
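To make the "who talked to whom, but not what was said" distinction concrete, here is a minimal Python sketch. The class names, field names and helper functions are all hypothetical and invented for illustration; real call data records and router logs carry many more fields and follow operator-specific formats.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Communication:
    """A session as the network sees it (hypothetical model)."""
    subscriber_id: str
    peer: str                # number dialed, or host/URL contacted
    started_at: datetime
    duration_seconds: int
    content: bytes           # the conversation or page itself


@dataclass(frozen=True)
class RetainedRecord:
    """What a retention policy keeps: session data only, no content."""
    subscriber_id: str
    peer: str
    started_at: datetime
    duration_seconds: int


def to_retained_record(event: Communication) -> RetainedRecord:
    # Copy the session-related fields and deliberately drop the content.
    return RetainedRecord(
        subscriber_id=event.subscriber_id,
        peer=event.peer,
        started_at=event.started_at,
        duration_seconds=event.duration_seconds,
    )


def who_contacted(records, peer):
    # After-the-fact question: which subscribers communicated with this peer?
    return sorted({r.subscriber_id for r in records if r.peer == peer})


events = [
    Communication("alice", "+34-555-0100", datetime(2007, 7, 1, 9, 0), 120, b"hello"),
    Communication("bob", "+34-555-0100", datetime(2007, 7, 2, 14, 30), 45, b"hi"),
    Communication("carol", "example.org", datetime(2007, 7, 3, 8, 15), 10, b"<html>"),
]
retained = [to_retained_record(e) for e in events]
```

Calling `who_contacted(retained, "+34-555-0100")` answers the after-the-fact question from the stored records alone, while nothing in `retained` holds what was actually said or viewed.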

So why would a search engine company need this info? Presumably it is used to improve the accuracy and appropriateness of searches: not only the searches of an individual, based on his or her previous searches, but also the searches of people who fall generically into similar groups. Example: I search for "limousines in Connecticut" because I live in Connecticut and need a limousine, but out of the search results that are returned I pick the company that goes to New York City, and so do a majority of other people. This information can then be used to "tune" the results of future requests from people looking for limousines in Connecticut, because they are probably going either to the airports in New York or to the theater district in NYC for a show.
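As a rough illustration of that kind of "tuning", here is a small Python sketch that reorders results by how often earlier searchers picked them for the same query. This is only my guess at the general idea, not how Google or any other search engine actually ranks results; the function name `tune_results`, the click log format and the company names are all made up.

```python
from collections import Counter


def tune_results(query, results, click_log):
    """Reorder results so the ones people picked most for this query come first.

    click_log is a list of (query, chosen_result) pairs taken from stored
    search history (a hypothetical format).
    """
    clicks = Counter(chosen for q, chosen in click_log if q == query)
    # sorted() is stable, so results with equal click counts keep their
    # original relative order. Results never clicked count as zero.
    return sorted(results, key=lambda result: -clicks[result])


query = "limousines in Connecticut"
results = ["Local Limo", "NYC Airport Limo", "Hartford Cars"]
click_log = [
    (query, "NYC Airport Limo"),
    (query, "NYC Airport Limo"),
    (query, "NYC Airport Limo"),
    (query, "Local Limo"),
    ("pizza in Connecticut", "Hartford Cars"),
]
tuned = tune_results(query, results, click_log)
```

With this made-up log, the company that goes to New York moves to the top of `tuned` because most earlier searchers chose it. Note that the tuning only works if the click history has been kept, which is exactly the retention question.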

OK, so they are storing my information and using it to improve their product. What is the problem? Well, there are very strict privacy protection rules in place in Europe that dictate how long information can be stored, who can view it and how it can be used. Advocates in the different countries are trying to balance those requirements (which may have been on the books for many years now) against the commercial needs of today's service providers.

The thing that makes this even more interesting (and here is the contradiction) is that an EU Directive passed in March of 2006 requires all EU member states to pass specific national legislation mandating data retention by telephony service providers and ISPs. It requires, among other things, that providers store call data and ISP event data for a period of between six months and two years, with the exact length set by each member state. The deadline for passing legislation is September 15th of this year, with implementations starting in March of 2008.

So while the EU community is examining practices and working with the search engine companies to reduce the amount of data retained, they are at the same time under the gun to pass legislation that requires service providers to store more information.

This is a fairly broad subject, and I'll continue with it in my next post. Till then ...
