I have strings that I need to extract keywords from. The strings might
contain HTML tags, URLs, etc. I imagine I'm not the first person to
tackle this problem. Is there a gem I can use, or does anyone have
ideas on how to approach this?
More detail is needed about the keywords. The simplest case is matching
keywords regardless of context, with words separated by whitespace.
irb(main):006:0> KEYWORDS = %w{if else then end case when do def}
=> ["if", "else", "then", "end", "case", "when", "do", "def"]
irb(main):007:0> str = "if true then false else true end"
=> "if true then false else true end"
irb(main):008:0> str.split.find_all{|s| KEYWORDS.include?(s)}
=> ["if", "then", "else", "end"]
If you need to exclude keywords inside strings, URLs, etc., the solution is
more complex.
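One rough way to handle that case is to blank out quoted strings and URLs
before looking for keywords. This is only a sketch: the names KEYWORD_LIST
and keywords_in_context are made up for illustration, and the patterns
assume simple "..." and '...' literals and http/https URLs. An apostrophe
in ordinary text (don't, it's) would confuse the single-quote pattern.

```ruby
# Hypothetical keyword list, stored as an array so include? matches whole words.
KEYWORD_LIST = %w[if else then end case when do def].freeze

# Hypothetical helper: returns keywords that appear outside quoted
# strings and URLs. Assumes simple quoting and http/https URLs only.
def keywords_in_context(str)
  stripped = str
    .gsub(/"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/, " ") # blank out quoted strings
    .gsub(%r{https?://\S+}, " ")                      # blank out URLs
  stripped.scan(/[A-Za-z_]+/).select { |w| KEYWORD_LIST.include?(w) }
end

p keywords_in_context('if x == "do not end" then y else z end')
```

Anything beyond this (nested quoting, comments, heredocs) probably calls
for a real tokenizer rather than regular expressions.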
Thanks for the reply. I can deal with context in a different method.
With your solution, I still grab "<a>" and "test." and "&wow*&&" as
keywords. I want to send this method a string and get back an array of
letter-only words. If you have ideas about context, I'd love to hear
those too, but the first step is just harvesting letter-only words
from strings.
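A minimal sketch of that first step, assuming "letter-only words" means
runs of ASCII letters: strip HTML tags, then URLs, then let String#scan
pull out the letter runs. The name letter_words is made up for
illustration, and the tag pattern is a rough approximation, not an HTML
parser.

```ruby
# Hypothetical helper: returns an array of letter-only words from text.
# Tags are removed before URLs so that a URL inside an attribute
# (href="http://...") does not swallow the text that follows the tag.
def letter_words(text)
  text
    .gsub(/<[^>]*>/, " ")        # drop anything that looks like an HTML tag
    .gsub(%r{https?://\S+}, " ") # drop http/https URLs
    .scan(/[A-Za-z]+/)           # keep runs of ASCII letters only
end

p letter_words('<a href="http://example.com">test.</a> &wow*&& ok')
```

Note that this turns "test." into "test" and "&wow*&&" into "wow". If
such tokens should be discarded entirely instead, something like
text.split.grep(/\A[A-Za-z]+\z/) would keep only tokens made up purely
of letters. HTML entities such as &amp; are not decoded here and would
come through as "amp".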