Patch included in the ticket.
Rich
Patch included in the ticket.
Rich
Something sensible needs to be done to auto_link_urls. It's a hideous regular expression which no one can hope to read, hacked around over the years to add characters here and there but there's always something it doesn't handle. Could we not rewrite this from scratch starting from the grammar given in the RFC on urls (or some other more complete solution than lobbing a character in here or there)?
Fred
Something sensible needs to be done to auto_link_urls. It's a hideous regular expression which no one can hope to read, hacked around over the years to add characters here and there but there's always something it doesn't handle. Could we not rewrite this from scratch starting from the grammar given in the RFC on urls (or some other more complete solution than lobbing a character in here or there)?
Indeed I think it's about time to consider rejigging how all that code functions. Perhaps it's worth looking into adopting addressable or something similar? The URI library from stdlib has some nastiness with it's parsing and matching code.
Care to take a look into it?
Something sensible needs to be done to auto_link_urls. It's a hideous regular expression which no one can hope to read, hacked around over the years to add characters here and there but there's always something it doesn't handle. Could we not rewrite this from scratch starting from the grammar given in the RFC on urls (or some other
more complete solution than lobbing a character in here or there)?Indeed I think it's about time to consider rejigging how all that code functions. Perhaps it's worth looking into adopting addressable or something similar? The URI library from stdlib has some nastiness with it's parsing and matching code.
Care to take a look into it?
Don't see why not. I'll see what approaches I can dig up.
Fred
The problem is that URLs allow much more characters that people actually want to be autolinked with the URL. Common pitfall is the regexp capturing the punctuation after the URL in a sentence.
It’s all been nicely described on Coding Horror recently: http://www.codinghorror.com/blog/archives/001181.html
And more recently, there are URLs – even domain names – that include Unicode characters, not that browser support for it is on the rise.
I’d say: let’s autolink everything that starts with “https?://” or “www.” up to the first whitespace or punctuation character before the whitespace if that character isn’t a closing parenthesis or bracket that has a matching one on the beginning.
Whitelisting characters will always leave someone out, I’m afraid … But my solution is just a quick thought, there may be tons of cases where it wouldn’t be appropriate.
Typo. I meant to say “…now* that browser support for it is on the rise”
Could we not rewrite this from scratch starting from the grammar given in the RFC on urls (or some other more complete solution than lobbing a character in here or there)?
The problem is that URLs allow much more characters that people actually want to be autolinked with the URL. Common pitfall is the regexp capturing the punctuation after the URL in a sentence.
It's all been nicely described on Coding Horror recently: http://www.codinghorror.com/blog/archives/001181.html
That is indeed very timely!
And more recently, there are URLs -- even domain names -- that include Unicode characters, not that browser support for it is on the rise.
I'd say: let's autolink everything that starts with "https?://" or "www." up to the first whitespace or punctuation character before the whitespace *if* that character isn't a closing parenthesis or bracket that has a matching one on the beginning.
Whitelisting characters will always leave someone out, I'm afraid ... But my solution is just a quick thought, there may be tons of cases where it wouldn't be appropriate.
I think it's certainly in the right direction.
Fred
Previously hideous regexp becomes this: %r{( https?:// | www. ) [^\s<]+}x
Added feature: intelligent closing bracket handling. You are now able to write this:
I’ve read an interesting article about
sprites (link: http://en.wikipedia.org/wiki/Sprite_(computer_graphics))
All this in less code than before.
Just as an FYI - this is awesome. People have always bitched at me about the auto-linking stuff in my apps, and this looks like it solves all their problems.
Thank you! - Trevor