RegEx help to detect First and Last name

Hello,

I need some help on RegEx to detect First and Last names. This is what I
currently have:
/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/

This is used to detect a First and Last name where two words are next to
each other that begin with a capital letter. So it will detect:
John Smith
Jane Smith

I run into problems where the name is close to the beginning of the
sentence:
Having John Smith over for dinner. --- This will look at "Having John"
Getting Jane Smith ready for school. --- This will look at "Getting Jane"

Do you know how to do a RegEx where it will ignore the first word
whenever three capitalized words are next to each other? Thanks!

-A

First you have to check whether there are three capitalized words in a
row or only two.

if str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Three capitalized words in a row: do something
elsif str.match(/([A-Z]+[a-zA-Z]* [A-Z]+[a-zA-Z]*)/)
  # Two capitalized words in a row: do something
end
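
To make the two-step check concrete, here is a minimal sketch that returns the last two words when three capitalized words sit together, and otherwise falls back to the two-word match. The names WORD, THREE, TWO and extract_name are made up for illustration. The word pattern is written as [A-Z][a-zA-Z]*, which accepts the same words as [A-Z]+[a-zA-Z]*.

```ruby
# One capitalized word (same words as [A-Z]+[a-zA-Z]* in the question)
WORD  = /[A-Z][a-zA-Z]*/
# Three capitalized words; only the last two are captured
THREE = /#{WORD} (#{WORD} #{WORD})/
# Two capitalized words, both captured
TWO   = /(#{WORD} #{WORD})/

# Hypothetical helper: returns the candidate name, or nil if nothing matches.
def extract_name(str)
  if (m = str.match(THREE))
    m[1]            # skip the first of the three words
  elsif (m = str.match(TWO))
    m[1]
  end
end

extract_name("Having John Smith over for dinner.")  # => "John Smith"
extract_name("John Smith came over.")               # => "John Smith"
```

Note that the three-word pattern is tried against the whole string first, so a three-word run later in the text wins over a two-word name earlier in it. That may or may not be what you want.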

I hope this helps.

Thanks

Brijesh Shah

That's close. You want something like

/\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/

Which gives you

irb(main):021:0> x
=> "Having Jane Smith"
irb(main):022:0> x =~ /\A([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)\s+([A-Z]+[a-zA-Z]*)/
=> 0
irb(main):023:0> $1
=> "Having"
irb(main):024:0> $2
=> "Jane"
irb(main):025:0> $3
=> "Smith"
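
If the leading word is sometimes there and sometimes not, one possible variation is to make it optional and use named captures instead of $1, $2, $3. This is only a sketch: CAP_WORD, LEADING_NAME and leading_name are invented names, and the pattern is still anchored with \A, so it only looks at the start of the string.

```ruby
# One capitalized word
CAP_WORD = /[A-Z][a-zA-Z]*/

# Optional leading capitalized word, then the two words treated as the name.
LEADING_NAME = /\A(?:(?<lead>#{CAP_WORD})\s+)?(?<first>#{CAP_WORD})\s+(?<last>#{CAP_WORD})/

# Hypothetical helper: returns "First Last", or nil when the string does not
# start with two or three capitalized words.
def leading_name(str)
  m = LEADING_NAME.match(str)
  m && "#{m[:first]} #{m[:last]}"
end

leading_name("Having Jane Smith")  # => "Jane Smith"
leading_name("Jane Smith")         # => "Jane Smith"
```

When only two capitalized words are present, the regex engine backtracks out of the optional group, so m[:lead] is nil and the two words land in first and last.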

You know this is not something you're going to solve with regular
expressions, though, right? :)

"San Francisco's Jane Smith, quoted in Broder's Washington Post
article, said ..."

You need a lot more heuristics than a simple RegEx to reliably find
names in a block of text.
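
To see the problem, here is a quick sketch that runs the two-capitalized-words pattern over that sentence with String#scan. The constant and variable names are just for illustration.

```ruby
# Two adjacent capitalized words, as in the original question
NAME_PAIR = /[A-Z][a-zA-Z]* [A-Z][a-zA-Z]*/

text = "San Francisco's Jane Smith, quoted in Broder's Washington Post article, said ..."

pairs = text.scan(NAME_PAIR)
# => ["San Francisco", "Jane Smith", "Washington Post"]
```

Only one of the three matches is a person. The regex has no way to tell a city or a newspaper from a name.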

Some other cases to consider:

John Philip Sousa (or, if you're a kid at heart, John Jacob Jingleheimer
Schmidt), not to mention Spanish names, which can have MANY parts.

Robert De Niro

Jesus Mary and Joseph

Surnames with origins in some languages don't start with a capital letter:

Michael Henry de Young - Dutch

Wernher von Braun - German
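
One rough way to cope with these, once a name has already been isolated from the text, is to treat a short list of particles as part of the surname. This is a sketch under big assumptions: the PARTICLES list is invented and far from complete, split_name is a made-up helper, and everything between the first word and the surname is simply ignored.

```ruby
# Assumed, incomplete list of surname particles (compared in lowercase)
PARTICLES = %w[de von van der del di la].freeze

# Hypothetical helper: splits an already-isolated full name into first name
# and surname, attaching any particles directly before the final word.
def split_name(full_name)
  words = full_name.split
  return nil if words.size < 2

  start = words.size - 1
  # Walk left while the preceding word is a particle, but never swallow
  # the first name itself.
  start -= 1 while start > 1 && PARTICLES.include?(words[start - 1].downcase)

  { first: words.first, last: words[start..-1].join(" ") }
end

split_name("Robert De Niro")          # => {first: "Robert", last: "De Niro"}
split_name("Wernher von Braun")       # => {first: "Wernher", last: "von Braun"}
split_name("Michael Henry de Young")  # => {first: "Michael", last: "de Young"}
```

This does nothing for "Jesus Mary and Joseph" or for multi-part Spanish surnames, so it is a patch for one class of cases, not a general answer.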

Thanks for the suggestions. I'm going to play around with this.

For the most part, I'm doing detection for scenarios with two names, so
names like Robert De Niro will not come up.

-A

Rick Denatale wrote:

I'm pretty sure, though, that the actor would say he HAD two names: his
first name was "Robert" and his last name was "De Niro".