scanning for links

You might want to look into Hpricot (http://code.whytheluckystiff.net/hpricot/). I’ve found it useful for things like this.

RSL