Is apostrophe (') something special in a regex if at end?

(Ruby 1.9.2) I have a simple validation regex that needs to accept the
following values: "Billy-Bob" and "O'Kelley" (as test cases). Originally I
was not allowing the apostrophe, but it became apparent I had to allow it.

The initial regex was:

/^[a-zA-Z -]*$/

Now, when I added the apostrophe like this:

/^[a-zA-Z' -']*$/

Then for some reason "Billy-Bob" was not getting matched:

"Billy-Bob" =~ /^[a-zA-Z -']*$/
nil
"O'Kelley" =~ /^[a-zA-Z' -']*$/
0

But when I moved the apostrophe further in, things worked as desired and
expected:

"Billy-Bob" =~ /^[a-zA-Z' -]*$/
0
"O'Kelley" =~ /^[a-zA-Z' -]*$/
0

Why is this?

It is the minus that is the special character here (as in a-z). If you
escape the minus, it works:
ruby-1.9.2-p0 > "Billy-Bob" =~ /^[a-zA-Z \-']*$/
=> 0
ruby-1.9.2-p0 > "O'Kelley" =~ /^[a-zA-Z \-']*$/
=> 0
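
For anyone who wants to reproduce this, here is a minimal sketch that runs the three patterns from this thread against both test names. The hash labels and variable names are only illustrative; the patterns and names are the ones quoted above.

```ruby
# Test names from the question.
names = ["Billy-Bob", "O'Kelley"]

# Illustrative labels (not from the thread); the patterns are the ones posted.
patterns = {
  "hyphen between space and apostrophe" => /^[a-zA-Z' -']*$/,
  "hyphen last in the class"            => /^[a-zA-Z' -]*$/,
  "hyphen escaped"                      => /^[a-zA-Z \-']*$/
}

# results[label] holds one entry per name: the match offset, or nil for no match.
results = {}
patterns.each do |label, pattern|
  results[label] = names.map { |name| name =~ pattern }
end

results.each do |label, offsets|
  puts "#{label}: #{offsets.inspect}"
end
```

Only the first pattern should report nil for "Billy-Bob", which matches what was seen in the question.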

Colin

Yes, I see you are right. The weird part is that the minus is not treated as a special character in the working examples ("Billy-Bob" =~ /^[a-zA-Z' -]*$/ returns 0!). Anyhow, I will remember that and start escaping it.

Yes I wondered about that. Either it is a bug or a documented feature
that - does not need to be escaped in some circumstances. Or perhaps
[a-zA-Z -] means something that neither of us understands.

Colin

The minus (hyphen) in a charset is un-special if it is at the beginning or the end. You’re better off escaping it yourself for exactly the reason you encountered – adding another character to the end changed the meaning of the regular expression (charset) in a way you didn’t expect.
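
To make the difference concrete, here is a small sketch that lists which printable ASCII characters each class accepts. The helper name `accepted` and the variable names are made up for illustration. The point is that in [ -'] the hyphen sits between a space and an apostrophe, so it is read as the range from space (ASCII 32) to apostrophe (ASCII 39), and that range does not contain the hyphen itself (ASCII 45).

```ruby
# Return the printable ASCII characters that a character class accepts.
# (Hypothetical helper, written only for this illustration.)
def accepted(char_class)
  (32..126).map(&:chr).select { |c| c =~ char_class }.join
end

as_range   = accepted(/[ -']/)  # hyphen between two characters: a range
as_literal = accepted(/[' -]/)  # hyphen last: a literal hyphen

p as_range    # the eight characters from space (32) through apostrophe (39)
p as_literal  # space, apostrophe and hyphen

p as_range.include?("-")    # => false
p as_literal.include?("-")  # => true
```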

-Rob

Rob Biedenharn

Rob@AgileConsultingLLC.com http://AgileConsultingLLC.com/

rab@GaslightSoftware.com http://GaslightSoftware.com/

Makes sense, thanks