RegEx help to detect First and Last name

Hello,

I need some help on RegEx to detect First and Last names. This is what I currently have: /([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/

This is used to detect a First and Last name where two words are next to each other that begin with a capital letter. So it will detect:

John Smith
Jane Smith

I run into problems when the name is close to the beginning of the sentence:

Having John Smith over for dinner. --- This will look at "Having John"
Getting Jane Smith ready for school. --- This will look at "Getting Jane"

Do you know how to do a RegEx where it will ignore the first word whenever three capitalized words are next to each other? Thanks!

-A

First you have to check whether there are three capitalized words in a row or only two.

if str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Three capitalized words in a row: do something
elsif str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Only two capitalized words: do something
end
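As a rough sketch of that two-step check, the version below returns the name instead of leaving a placeholder. The helper name extract_name and the constants are made up for illustration, and it assumes that when three capitalized words appear in a row, the first one is an ordinary sentence-initial word.

```ruby
# Hypothetical helper, not from the thread.
# [A-Z][a-zA-Z]* matches the same words as the original [A-Z]+[a-zA-Z]*.
WORD = /[A-Z][a-zA-Z]*/

THREE_WORDS = /(#{WORD}) (#{WORD}) (#{WORD})/
TWO_WORDS   = /(#{WORD}) (#{WORD})/

def extract_name(str)
  if (m = str.match(THREE_WORDS))
    # Three capitalized words in a row: assume the first is a
    # sentence-initial word and keep the last two.
    [m[2], m[3]]
  elsif (m = str.match(TWO_WORDS))
    # Exactly two capitalized words: treat them as first and last name.
    [m[1], m[2]]
  end
end

p extract_name("Having John Smith over for dinner.")
p extract_name("John Smith called.")
```

The method returns nil when neither pattern matches, so callers should check the result before using it.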

I hope this helps.

Thanks

Brijesh Shah

That's close. You want something like

/\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/

Which gives you

irb(main):021:0> x
=> "Having Jane Smith"
irb(main):022:0> x =~ /\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/
=> 0
irb(main):023:0> $1
=> "Having"
irb(main):024:0> $2
=> "Jane"
irb(main):025:0> $3
=> "Smith"
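Because of the \A anchor, that pattern only matches when the string starts with three capitalized words. One possible alternative, sketched here under the same assumption that a third capitalized word directly before the name is not part of it, is a single pattern with an optional, non-captured leading word. The constant and method names are hypothetical.

```ruby
# Hypothetical names, for illustration only.
# The optional (?: ... )? group swallows one leading capitalized word
# when three appear in a row; otherwise the engine backtracks and
# matches the two-word name on its own.
NAME_WITH_OPTIONAL_LEAD =
  /(?:\b[A-Z][a-zA-Z]*\s+)?\b([A-Z][a-zA-Z]*)\s+([A-Z][a-zA-Z]*)\b/

def first_and_last(str)
  m = str.match(NAME_WITH_OPTIONAL_LEAD)
  m && [m[1], m[2]]
end

p first_and_last("Getting Jane Smith ready for school.")
p first_and_last("We invited Jane Smith.")
```

This still guesses wrong on runs of four or more capitalized words, so it is a convenience rather than a fix for the general problem.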

You know this is not something you're going to solve with regular expressions, though, right? :)

"San Francisco's Jane Smith, quoted in Broder's Washington Post article, said ..."

You need a lot more heuristics than a simple RegEx to reliably find names in a block of text.

Some other cases to consider

John Philip Sousa (or, if you're a kid at heart, John Jacob Jingelheimer Smith), not to mention Spanish names, which can have MANY parts.

Robert De Niro

Jesus Mary and Joseph

Surnames with origins in some languages don't start with a capital letter:

Michael Henry de Young - Dutch

Wernher von Braun - German
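To illustrate how a couple of these cases could be folded into the pattern, here is a minimal sketch that lets the surname begin with a particle. The particle list (de, von, van) is an assumed and deliberately incomplete sample, and the names PARTICLE, NAME_WITH_PARTICLE, and split_name are invented for this example.

```ruby
# Assumed, incomplete particle list; matched case-insensitively so both
# "von Braun" and "De Niro" are accepted.
PARTICLE = /(?i:de|von|van)/

# First name, then a surname that may start with one particle.
NAME_WITH_PARTICLE =
  /\b([A-Z][a-z]+)\s+((?:#{PARTICLE}\s+)?[A-Z][a-z]+)\b/

def split_name(str)
  m = str.match(NAME_WITH_PARTICLE)
  m && [m[1], m[2]]
end

p split_name("Wernher von Braun")
p split_name("Robert De Niro")
```

It still does not handle middle names: "Michael Henry de Young" comes back as "Michael" and "Henry", which supports the point that a regex alone will not find names reliably.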

Thanks for the suggestions. I'm going to play around with this.

For the most part, I'm doing detection for scenarios with two names, so names like Robert De Niro will not come up.

-A

Rick Denatale wrote:

I'm pretty sure, though, that the actor would say he HAD two names, and that his first name was "Robert" and his last name was "De Niro".