Monday, May 30, 2005

GreaseMonkey is Cool

OK, so I mentioned earlier that Google was rewriting my search result URLs to redirect through itself (and tracking my clicking habits in the process). Here is what the URL looked like when I searched for GreaseMonkey and tried to click the first link:

After searching the web about this, it appears to happen only in certain locations at certain times. I tried the same search on another PC in the house and it didn't happen there, so you may not see it when you search.

Having discovered GreaseMonkey a month or so ago (and by the way, BookBurro is just plain magic, check it out), I decided to write my own script to take these URLs, pull out the address that comes after the q= part and before the & part, and replace Google's link with the clean version.

After a bit of hacking around this is what I produced:

// ==UserScript==

// @name Google Link Fixer
// @author Doug Porter
// @namespace
// @description Cleans up google search result links that use a redirect through google.
// @include http://google.*/*
// @include*/*

// ==/UserScript==

(function() {
  // Collect every link on the page.
  var aref = document.getElementsByTagName('a');
  var href = '';
  var id = '';
  var beg = -1;
  var end = -1;
  var cleanUrl = '';

  for (var i = 0; i < aref.length; i++) {
    href = aref[i].href;
    id = aref[i].id;

    // Skip ad links (id contains 'aw'); only touch redirect links.
    if ((id.indexOf('aw') < 0) && (href.indexOf('url?') > 0)) {
      // The target address sits between 'q=' and the last '&'.
      beg = href.indexOf('q=') + 2;
      end = href.lastIndexOf('&');
      cleanUrl = href.substring(beg, end);
      aref[i].href = unescape(cleanUrl);
    }
  }
})();

Crude, but it works. It gets a list of all <a> tags, then walks the list looking for tags whose id attribute does not contain 'aw' (these turned out to be ads) and whose href attribute contains 'url?' (which indicates a redirection).

Once it finds one, it pulls out the string that falls between 'q=' and the last '&' character, then sets the href property of the current <a> to the cleaned version of the URL. It uses the unescape() function so that characters like ? and = in the target URL are decoded rather than left percent-encoded.
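
The extraction step can also be pulled out into a small function that is easy to try on its own. This is only a sketch: extractTarget is a hypothetical name, the sample URL in the comment is made up (an assumption about the redirect format, not a real Google link), and it swaps in decodeURIComponent() where the script uses unescape(). It reads the q= parameter up to the next '&' instead of the last one, so it also copes with a link that has no parameters after q=.

```javascript
// Hypothetical helper, not part of the original script: return the target
// address carried in the q= parameter of a redirect link, or null if the
// link has no such parameter.
// Assumed link shape (made up for illustration):
//   http://www.google.com/url?sa=U&q=http%3A%2F%2Fexample.com%2F&e=42
function extractTarget(href) {
  var query = href.indexOf('?');
  if (query < 0) {
    return null; // no query string at all
  }
  var params = href.substring(query + 1).split('&');
  for (var i = 0; i < params.length; i++) {
    // Match only a parameter named exactly 'q' (not 'oq', 'aq', ...).
    if (params[i].indexOf('q=') === 0) {
      // decodeURIComponent throws a URIError on malformed escapes.
      return decodeURIComponent(params[i].substring(2));
    }
  }
  return null;
}
```

Inside the loop, the result would be assigned to aref[i].href only when it is not null.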

Improvements to be made: it could probably be done more efficiently using XPath. It is also very susceptible to breaking if Google changes its redirection and link format. One nice addition would be some indication that a replacement has been done (change the underline or coloring or something).
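
Two of those improvements might look roughly like this. It is an untested-in-the-wild sketch, not the script above: findRedirectLinks and markCleaned are hypothetical names, the XPath expression just restates the same 'url?' and 'aw' checks, and the dotted green underline is an arbitrary choice. document.evaluate() and XPathResult are the standard DOM XPath interfaces available in Firefox.

```javascript
// Let the browser pick out the redirect links with XPath instead of
// walking every <a> by hand. Takes the document as a parameter.
function findRedirectLinks(doc) {
  var snapshot = doc.evaluate(
    "//a[contains(@href, 'url?') and not(contains(@id, 'aw'))]",
    doc, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
  var links = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    links.push(snapshot.snapshotItem(i));
  }
  return links;
}

// Hypothetical marker: make a rewritten link visibly different.
function markCleaned(link) {
  link.style.borderBottom = '1px dotted green';
  link.title = 'Google redirect removed';
}
```

The idea would be to loop over findRedirectLinks(document), rewrite each href as before, and call markCleaned() on the link right after the rewrite.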

Well, give it a try if you run into this problem and let me know how it goes, but keep in mind that this was a quick hack done in a couple of minutes.

I've got to find a place to store my script so I can provide it online, but I'll let you know when that happens. Till then, enjoy.

