Searching/Sorting an Array of Hashes

I have an array of hashes that contains several fields, including
first_name and last_name. Unfortunately, since it's the result of an API
call, I have no other ways to work with it. Regardless, I'm trying to
build a basic search function where a user can enter a name and it will
display the results from a newly created array.

I'm guessing that sort_by will be the best route to go, but I've been
unsuccessful in finding out how to use it with multiple fields. Any
guesses?

The second part to the question is how you structure the sort_by, if
that is the best way, to find objects that are similar to the requested
query. It's not so much that a user would misspell a name (although that
would be helpful) but if they put in a firstname + lastname pair, it
wouldn't technically match with either field on its own.

Thanks in advance. :)

Your question needs a bit of clarification--please give examples of
the query, the data set format to be searched, and the expected
result. And then repeat for the case that you are having trouble
with.

But I'll throw out some thoughts even though I am confused about what
you are trying to do...

Here is an example of how to sort_by last_name on an array of hashes
with first_name and last_name keys:
array.sort_by{|hash| hash[:last_name]}

irb

test = [{:first_name=>'tim', :last_name=>'rand'},{:first_name=>'jim', :last_name=>'band'},{:first_name=>'him', :last_name=>'crand'}]

=> [{:first_name=>"tim", :last_name=>"rand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"}]

test

=> [{:first_name=>"tim", :last_name=>"rand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"}]

test.sort_by{|hash| hash[:first_name]}

=> [{:first_name=>"him", :last_name=>"crand"},
{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"tim", :last_name=>"rand"}]

test.sort_by{|hash| hash[:last_name]}

=> [{:first_name=>"jim", :last_name=>"band"},
{:first_name=>"him", :last_name=>"crand"},
{:first_name=>"tim", :last_name=>"rand"}]
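For the multi-field case asked about in the original post, the block given to sort_by can return an array. Ruby compares arrays element by element, so the second entry only matters when the first one ties. A minimal sketch, where the 'aim band' row is made up purely to force a tie on last_name:

```ruby
# Sample data; the 'aim' entry is hypothetical, added to create a tie on :last_name
people = [
  {:first_name => 'tim', :last_name => 'rand'},
  {:first_name => 'jim', :last_name => 'band'},
  {:first_name => 'aim', :last_name => 'band'}
]

# Arrays compare element by element: :last_name first, then :first_name
sorted = people.sort_by { |hash| [hash[:last_name], hash[:first_name]] }

p sorted.map { |hash| "#{hash[:first_name]} #{hash[:last_name]}" }
# prints ["aim band", "jim band", "tim rand"]
```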

To find misspelled names is a bit trickier--I would probably use the
text rubygem as it has the ability to calculate the Levenshtein
distance (basically number of substitutions, deletions and insertions)
required to spell the target using a query. You would have to compare
the query to all names and sort based on the levenshtein distance and
then pull the closest match. I have used that strategy in the past and
it works. Here is a quick demo of the syntax for the levenshtein
distance:

irb

require 'text'

=> true

Text::Levenshtein.distance('this', 'that')

=> 2

Text::Levenshtein.distance('query', 'queen')

=> 2
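The "compare the query to all names and pull the closest match" step might look roughly like the sketch below. So that it runs without the gem installed, it uses a plain-Ruby edit_distance helper (a hypothetical name, not part of the text gem) as a stand-in for Text::Levenshtein.distance. The names list and the misspelled query are made up for illustration.

```ruby
# Plain-Ruby Levenshtein distance, standing in for Text::Levenshtein.distance.
# Keeps only the previous row of the dynamic-programming table.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

names = ['tim', 'jim', 'him', 'rand', 'band', 'crand']
query = 'rannd' # assumed misspelling of 'rand'

# Pick the name with the smallest distance to the query
closest = names.min_by { |name| edit_distance(query, name) }
p closest # prints "rand"
```

With the gem available, swapping edit_distance for Text::Levenshtein.distance should give the same ranking.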

To the extent that I think I understand your question, I bet having
some verification is going to be unavoidable. Something like the
following to catch cases when people type in a space separated first
and last name.

if query.match(" ") # query is something like "first last"
  query_first, query_last = query.split(/ /)
else
  query_first = query_last = query
end

Hope that helps,
Tim

Tim,

I really appreciate the time and thoughtfulness you put in your reply.
To clarify further from my original question, building from your
example.

contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},
{:first_name=>'jim', :last_name=>'band', :id => 2},
{:first_name=>'him', :last_name=>'crand', :id => 3}]

Using a search form, the user will submit a string, looking for a
particular contact in the array. Unfortunately, this might be just "tim"
or "rand" or "tim rand". If a match is found in the array, I need to
return the id number associated with the match.

Now, if I was accessing the information from a database table directly
instead of an array, something like this would probably suffice.

@contacts = Contact.find(:all, :conditions => [
  'LOWER(lastname) LIKE ? OR LOWER(firstname) LIKE ?',
  '%' + value.downcase + '%', '%' + value.downcase + '%'])

Unfortunately, I'm not sure how to build the equivalent query for an
existing array. Sort_by helps, but I haven't found a way to allow it to
search both :first_name and :last_name - only one at a time.
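One way to mirror that LIKE query against an in-memory array is to filter with select rather than sort. The sketch below is only an illustration: matching_ids is a hypothetical helper name, and it assumes every hash has string :first_name and :last_name values.

```ruby
contacts = [
  {:first_name => 'tim', :last_name => 'rand',  :id => 1},
  {:first_name => 'jim', :last_name => 'band',  :id => 2},
  {:first_name => 'him', :last_name => 'crand', :id => 3}
]

# Hypothetical helper: case-insensitive substring match on either name field,
# roughly LOWER(col) LIKE '%value%' on first_name OR last_name
def matching_ids(contacts, value)
  needle = value.downcase
  matches = contacts.select do |contact|
    contact[:first_name].downcase.include?(needle) ||
      contact[:last_name].downcase.include?(needle)
  end
  matches.map { |contact| contact[:id] }
end

p matching_ids(contacts, 'Tim')   # prints [1]
p matching_ids(contacts, 'rand')  # prints [1, 3]
p matching_ids(contacts, 'im')    # prints [1, 2, 3]
```

Like the SQL version, this finds nothing for a combined "tim rand" string, since neither field contains the whole query.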

# item is a hash, contacts is an array - using the Array#find or Array#detect method (they're synonymous)
# assuming search_string contains the string you want to find
# (to_s is needed because the :id values are integers, which have no include? method)
found_item = contacts.detect{|item| item.values.any?{|value| value.to_s.include?(search_string)}}

or

found_item = contacts.detect{|item| item.values.join.include?(search_string)}

Then it's simply a matter of getting the id value from found_item. As this is a hash, found_item[:id] should suffice.
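One thing to watch for: detect returns nil when nothing matches, so calling [:id] on the result directly would raise NoMethodError. A small sketch of guarding against that, with first_matching_id as a hypothetical wrapper name:

```ruby
contacts = [
  {:first_name => 'tim', :last_name => 'rand', :id => 1},
  {:first_name => 'jim', :last_name => 'band', :id => 2}
]

# Hypothetical wrapper: id of the first matching contact, or nil if none matches
def first_matching_id(contacts, search_string)
  found_item = contacts.detect { |item| item.values.join.include?(search_string) }
  found_item && found_item[:id]
end

p first_matching_id(contacts, 'band')  # prints 2
p first_matching_id(contacts, 'zed')   # prints nil
```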

Julian.

Hi again Robert,
There might be methods built into Rails for doing this, but when you
have a very specific case, you might just roll out your own methods to
get exactly what you want:

=begin
given a data structure like @contacts =
[{:first_name=>'tim', :last_name=>'rand', :id => 1},
{:first_name=>'jim', :last_name=>'band', :id => 2},
{:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches
=end

#here is our search array
@contacts = [{:first_name=>'tim', :last_name=>'rand', :id =>
1},{:first_name=>'jim', :last_name=>'band', :id =>
2},{:first_name=>'him', :last_name=>'crand', :id => 3},
{:first_name=>'shim', :last_name=>'crand', :id => 4}]

#method to separate names if more than one is given
def parse_query(query)
  if query.match(" ")
    name1, name2 = query.split(/ /)
  else
    # String#to_a was removed in Ruby 1.9, so wrap the name in an array literal
    return [query]
  end
  return [name1, name2]
end

#find any name in hash field and return the ids
def search_array_with_hashes(array_with_name_or_names)
  @hits = []
  # check each name against every value of every contact hash
  array_with_name_or_names.each do |name|
    @contacts.each do |hash|
      @hits << hash[:id] if hash.values.include?(name)
    end
  end
  @hits.uniq
end

#usage/test case examples
p search_array_with_hashes(parse_query("band"))
p search_array_with_hashes(parse_query("tim rand"))
p search_array_with_hashes(parse_query("crand"))
# >> [2]
# >> [1]
# >> [3, 4]

Will that do the trick?
Tim

It looks like a typo got introduced as you were moving the method into
your rails app. There is no values_at call in my method, perhaps you
accidentally tab completed and inadvertently introduced the _at.
Good luck.
Tim

Sorry, Tim. I should have clarified. This is the line that is having
issues:

@hits << hash[:id] if hash.values.include?(name)

Specifically, the .values part.

timr wrote:

Sorry, but the code quoted below is really unidiomatic Ruby.

Given the following:

given a data structure like @contacts =
[{:first_name=>'tim', :last_name=>'rand', :id => 1},
{:first_name=>'jim', :last_name=>'band', :id => 2},
{:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches

@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}]

keywords = "ran"

@contacts.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}.map{|hash| hash[:id]}

or more prettily:

@contacts.select do |hash|
  keywords.split.any? do |keyword|
    hash.values.join.include?(keyword)
  end
end. # note the period at the end of this line… indicating we still want to send the result of this select method another message yet… (ie the map message below).
map do |hash|
  hash[:id]
end

=> [1, 3]

if you really need to make a method of it (tho I don’t know why you would), you can do so thusly:

class ArrayOfHashes < Array
  def search_array_with_hashes(keywords)
    found_hashes = self.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}
    found_hashes.map{|hash| hash[:id]}
  end
end

@contacts = ArrayOfHashes.new(@contacts)

@contacts.search_array_with_hashes("ran")

=> [1, 3]

@contacts.search_array_with_hashes("band")

=> [2]

@contacts.search_array_with_hashes("tim rand")

=> [1, 3]

@contacts.search_array_with_hashes("crand")

=> [3]

@contacts.search_array_with_hashes("jam")

=> []

Julian's solution is more elegant. I like it. It has a functional
difference in that it catches partial names--for instance tim would
match timothy (probably good in this case)--but rand would also match
crand.

That being the case, searching tim matches [1], but tim rand matches
[1,3]. More information leads to less specificity. When a perfect
match is available, that should be the only item returned--I would
think.
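One possible way to get that behaviour is to try exact matches first and fall back to substring matching only when nothing matches exactly. This is a sketch under assumptions, not anyone's posted solution: search_ids is a hypothetical helper, and "exact" here means every keyword equals the contact's first or last name, ignoring case.

```ruby
contacts = [
  {:first_name => 'tim', :last_name => 'rand',  :id => 1},
  {:first_name => 'jim', :last_name => 'band',  :id => 2},
  {:first_name => 'him', :last_name => 'crand', :id => 3}
]

# Hypothetical helper: prefer contacts whose names match every keyword exactly;
# only if there are none, fall back to substring matches on any keyword
def search_ids(contacts, query)
  keywords = query.downcase.split
  return [] if keywords.empty?

  names = lambda { |c| [c[:first_name], c[:last_name]].map(&:downcase) }

  exact = contacts.select { |c| keywords.all? { |k| names.call(c).include?(k) } }
  return exact.map { |c| c[:id] } unless exact.empty?

  partial = contacts.select do |c|
    keywords.any? { |k| names.call(c).any? { |n| n.include?(k) } }
  end
  partial.map { |c| c[:id] }
end

p search_ids(contacts, 'tim rand')  # prints [1]
p search_ids(contacts, 'rand')      # prints [1]
p search_ids(contacts, 'ran')       # prints [1, 3]
p search_ids(contacts, 'jam')       # prints []
```

With this ordering, adding the last name narrows the result instead of widening it, while partial input such as "ran" still finds both rand and crand.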