Regular expression: How do I allow forward slashes?

In my app, I allow users to submit URLs. They (of course) need to be able to submit URLs containing a forward slash, "/", but what's the regular expression to allow that?

I currently use:

validates_format_of :url, :with => /^[-\w\_.]+$/i

to only allow alphanumerics, dashes, underscores, and dots, to prevent cross-site scripting when I later reconstruct these URLs, but I can't figure out how to allow "/" as well.

Any ideas?

I'm a newbie, but I think you can use the underscore method on the whole URL, which will put '/' in place of '::'.

Have you tried escaping them "\/"?

frogstarr78 wrote:

Have you tried escaping them "\/"?

Another way would be to use %r; that way you can avoid the leaning toothpick syndrome altogether:

/^http:\/\/myhostname\.com\/foo$/i

would become

%r{http://myhostname\.com/foo}i

But before you start piecing your own regexp together, have a look at the regexp patterns in the URI::REGEXP::PATTERN module (in your Ruby lib directory under uri/common.rb). It could save you some work, depending on what and how you want to validate.

Sven

Sven Riedel wrote:

/^http:\/\/myhostname\.com\/foo$/i

would become

%r{http://myhostname\.com/foo}i

And of course I forgot the anchors in the second example. So the correct version is:

%r{^http://myhostname\.com/foo$}i
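To see that the two spellings behave the same, here is a minimal sketch in plain Ruby (no Rails needed). The sample strings are made up for illustration, and it assumes Ruby 2.4 or later for Regexp#match?.

```ruby
# The same pattern written with both delimiters.
slashed = /^http:\/\/myhostname\.com\/foo$/i   # every "/" must be escaped
percent = %r{^http://myhostname\.com/foo$}i    # "/" needs no escape here

# Made-up sample inputs, not taken from the thread.
samples = [
  "http://myhostname.com/foo",
  "HTTP://MyHostname.com/FOO",      # still matches because of the i flag
  "http://myhostname.com/foo/bar",  # rejected by the trailing anchor
  "http://myhostnameXcom/foo"       # rejected because \. is a literal dot
]

samples.each do |s|
  puts "#{s}: slashed=#{slashed.match?(s)} percent=#{percent.match?(s)}"
end
```

Both columns should agree for every sample; only the delimiter changes, not what the pattern matches.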

Use Ruby's other regexp syntax:

  %r{pattern}

To continue your example below:

  validates_format_of :url, :with => %r{^[-\w_./]+$}
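Since the point of the validation is to keep unwanted characters out, it may be worth trying the pattern in plain Ruby first. The sketch below is only an illustration: the variable names and sample inputs are made up, and the \A...\z variant is my assumption about what you probably want, not something from the thread. In Ruby, ^ and $ match at line boundaries, so a multi-line value can get through if just one of its lines is clean. (Newer Rails versions, as far as I know, reject ^ and $ in format validations unless the multiline option is set.)

```ruby
# Pattern from the post above: ^ and $ anchor to lines.
line_anchored = %r{^[-\w_./]+$}
# Assumed stricter variant: \A and \z anchor to the whole string.
string_anchored = %r{\A[-\w_./]+\z}

puts line_anchored.match?("example.com/some/path")  # slashes are allowed
puts line_anchored.match?("<script>")               # angle brackets are not

# Made-up multi-line input: the first line is clean, the second is not.
multi = "example.com/ok\n<script>alert(1)</script>"
puts line_anchored.match?(multi)    # true, because the first line matches
puts string_anchored.match?(multi)  # false, the whole string must match
```

If the stricter behavior is what you want, the same pattern should drop into the validation as :with => %r{\A[-\w_./]+\z}.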

AlwaysCharging wrote:

Yes, that did it. Thank you. No idea how I try everything and overlook the simplest solution, duh.

And thank you to everyone else who weighed in as well; definitely some other options to look into.

Side note: Anybody know why the period doesn't have to be escaped? Like just "." allows the dot to be input, as well as "\." So, [-\w\_\.\/] works just as [-\w\_.\/]. Why is this?

It depends on where the "." appears in the regexp. In your case it is inside a character class ("[...]"), where it has no special meaning and matches only a literal "." character. \w is different: it is shorthand for the character class [a-zA-Z0-9_], so it is parsed as that class rather than as an escaped "w" character. So you could change the character class to %r|[-\w_./]|. There is no need to escape the "_", which is never special (and is already covered by \w), or the "-", which is literal because it sits at the beginning of the class (that's for another reason, though).
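A quick way to check these rules is to try them in irb. This is just a small sketch with made-up inputs, assuming Ruby 2.4 or later for String#match?:

```ruby
# Outside a character class, "." matches any character (except newline).
puts "a".match?(/./)       # true
# Inside a character class, "." is only a literal dot, escaped or not.
puts "a".match?(/[.]/)     # false
puts ".".match?(/[.]/)     # true
puts ".".match?(/[\.]/)    # true

# \w covers letters, digits, and the underscore, but not the hyphen.
puts "7".match?(/\A\w\z/)  # true
puts "_".match?(/\A\w\z/)  # true
puts "-".match?(/\A\w\z/)  # false

# A hyphen at the start of a class is literal; between two characters it is a range.
puts "b".match?(/[-ac]/)   # false: matches only "-", "a", or "c"
puts "b".match?(/[a-c]/)   # true: "a" through "c"
puts "-".match?(/[-ac]/)   # true
```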

But that made me wonder why I couldn't just put the "/" inside the brackets as well. Like, why did that have to be escaped if the period didn't? (I guess it's because in the /.../ syntax, the forward slash closes the regexp.) Oh well, it's working now, and I escaped the . as well (\.). Thank you for your help. Much appreciated, frogstarr.

Yeah, I understand what you mean. No worries.