Searching/Sorting an Array of Hashes

I have an array of hashes that contains several fields, including first_name and last_name. Unfortunately, since it's the result of an API call, I have no other way to work with it. Regardless, I'm trying to build a basic search function where a user can enter a name and it will display the results from a newly created array.

I'm guessing that sort_by will be the best route to go, but I've been unsuccessful in finding out how to use it with multiple fields. Any guesses?

The second part of the question is how you would structure the sort_by, if that is the best way, to find objects that are similar to the requested query. It's not so much that a user would misspell a name (although handling that would be helpful), but if they put in a firstname + lastname pair, it wouldn't technically match either field on its own.

Thanks in advance.

Your question needs a bit of clarification--please give examples of the query, the format of the data set to be searched, and the expected result. Then repeat for the case that you are having trouble with.

But I'll throw out some thoughts even though I am confused about what you are trying to do...

Here is an example of how to sort_by last_name on an array of hashes with first_name and last_name keys: array.sort_by{|hash| hash[:last_name]}

irb

test = [{:first_name=>'tim', :last_name=>'rand'},{:first_name=>'jim', :last_name=>'band'},{:first_name=>'him', :last_name=>'crand'}]

=> [{:first_name=>"tim", :last_name=>"rand"}, {:first_name=>"jim", :last_name=>"band"}, {:first_name=>"him", :last_name=>"crand"}]

test

=> [{:first_name=>"tim", :last_name=>"rand"}, {:first_name=>"jim", :last_name=>"band"}, {:first_name=>"him", :last_name=>"crand"}]

test.sort_by{|hash| hash[:first_name]}

=> [{:first_name=>"him", :last_name=>"crand"}, {:first_name=>"jim", :last_name=>"band"}, {:first_name=>"tim", :last_name=>"rand"}]

test.sort_by{|hash| hash[:last_name]}

=> [{:first_name=>"jim", :last_name=>"band"}, {:first_name=>"him", :last_name=>"crand"}, {:first_name=>"tim", :last_name=>"rand"}]
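On the multiple-fields part of the question: sort_by compares whatever the block returns, and arrays compare element by element, so returning an array of keys sorts by the first key and breaks ties with the second. A minimal sketch, using a made-up people array (not from the thread) that has a repeated last name:

```ruby
# Hypothetical sample data with a repeated last name.
people = [
  { :first_name => 'tim',  :last_name => 'rand' },
  { :first_name => 'jim',  :last_name => 'band' },
  { :first_name => 'adam', :last_name => 'rand' }
]

# Arrays compare element by element, so this sorts by last name first
# and uses the first name only to break ties.
sorted = people.sort_by { |hash| [hash[:last_name], hash[:first_name]] }

p sorted.map { |hash| "#{hash[:first_name]} #{hash[:last_name]}" }
# prints ["jim band", "adam rand", "tim rand"]
```

Note that this only orders the array; it does not filter it, so a search still needs select or detect.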

Finding misspelled names is a bit trickier--I would probably use the text rubygem, as it can calculate the Levenshtein distance (basically the number of substitutions, deletions and insertions required to turn the query into the target). You would have to compare the query to all names, sort by Levenshtein distance, and then pull the closest match. I have used that strategy in the past and it works. Here is a quick demo of the syntax for the Levenshtein distance:

irb

require 'text'

=> true

Text::Levenshtein.distance('this', 'that')

=> 2

Text::Levenshtein.distance('query', 'queen')

=> 2
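The compare-everything-and-sort strategy could look roughly like the sketch below. So that it runs without the gem, it uses a small pure-Ruby levenshtein method as a stand-in for Text::Levenshtein.distance; that method, closest_contacts, and the sample query are all assumptions made for illustration.

```ruby
# Pure-Ruby Levenshtein distance, standing in for Text::Levenshtein.distance
# so that this sketch runs without the text gem installed.
def levenshtein(a, b)
  # previous[j] is the distance between the prefix of a handled so far
  # and the first j characters of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,            # deletion
                  current[j - 1] + 1,         # insertion
                  previous[j - 1] + cost].min # substitution
    end
    previous = current
  end
  previous.last
end

# Hypothetical helper: order contacts by how close the query is to either name.
def closest_contacts(contacts, query)
  contacts.sort_by do |hash|
    [levenshtein(query, hash[:first_name]),
     levenshtein(query, hash[:last_name])].min
  end
end

contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

# 'rund' is one substitution away from 'rand', so contact 1 ranks first.
p closest_contacts(contacts, 'rund').first[:id]
```

In practice you would probably also want a cutoff distance, so that a query resembling nothing in the list returns no match instead of the least bad one.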

To the extent that I think I understand your question, I bet having some verification is going to be unavoidable. Something like the following to catch cases when people type in a space separated first and last name.

if query.match(" ") #query is something like "first last"
  query_first, query_last = query.split(/ /)
else
  query_first = query_last = query
end

Hope that helps, Tim

Tim,

I really appreciate the time and thoughtfulness you put into your reply. To clarify my original question, building on your example:

contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}]

Using a search form, the user will submit a string, looking for a particular contact in the array. Unfortunately, this might be just "tim" or "rand" or "tim rand". If a match is found in the array, I need to return the id number associated with the match.

Now, if I was accessing the information from a database table directly instead of an array, something like this would probably suffice.

@contacts = Contact.find(:all, :conditions => [ 'LOWER(lastname) LIKE ? OR LOWER(firstname) LIKE ?', '%' + value.downcase + '%','%' + value.downcase + '%'])

Unfortunately, I'm not sure how to build the equivalent query for an existing array. sort_by helps, but I haven't found a way to make it search both :first_name and :last_name - only one at a time.

# item is a hash, contacts is an array - using the Array#find or Array#detect method (they're synonymous)
# assuming search_string contains the string you want to find
# to_s is needed because the :id value is an Integer, which has no include? method
found_item = contacts.detect{|item| item.values.any?{|value| value.to_s.include?(search_string)}}

or

found_item = contacts.detect{|item| item.values.join.include?(search_string)}

Then it's simply a matter of getting the id value from found_item. As this is a hash, found_item[:id] should suffice.
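Two things to watch with detect: it returns only the first match, and it returns nil when nothing matches, in which case found_item[:id] raises a NoMethodError. A sketch closer to the LIKE query in the question might use select, ignore case, and look only at the two name fields. The matching_ids helper below is a made-up name for illustration:

```ruby
contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

# Hypothetical helper: ids of every contact whose first or last name
# contains the search string, ignoring case (roughly LIKE '%value%').
def matching_ids(contacts, search_string)
  needle = search_string.downcase
  matches = contacts.select do |item|
    [:first_name, :last_name].any? do |key|
      item[key].to_s.downcase.include?(needle)
    end
  end
  matches.map { |item| item[:id] }
end

p matching_ids(contacts, 'Rand') # prints [1, 3]
p matching_ids(contacts, 'jim')  # prints [2]
p matching_ids(contacts, 'zed')  # prints []
```

Like the SQL version, this treats the whole input as one string, so "tim rand" matches nothing until the query is split into separate words.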

Julian.

Hi again Robert, there might be methods built into Rails for doing this, but when you have a very specific case, you might just roll your own methods to get exactly what you want:

=begin
given a data structure like
@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches
=end

#here is our search array
@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}, {:first_name=>'shim', :last_name=>'crand', :id => 4}]

#method to separate names if more than one is given
def parse_query(query)
  if query.match(" ")
    name1, name2 = query.split(/ /)
  else
    return [query] #String#to_a no longer exists in Ruby 1.9 and later
  end
  return [name1, name2]
end

#find any name in hash field and return the ids
def search_array_with_hashes(array_with_name_or_names)
  @hits = []
  #check every field of every contact against each name
  array_with_name_or_names.each do |name|
    @contacts.each do |hash|
      @hits << hash[:id] if hash.values.include?(name)
    end
  end
  @hits.uniq
end

#usage/test case examples
p search_array_with_hashes(parse_query("band"))
p search_array_with_hashes(parse_query("tim rand"))
p search_array_with_hashes(parse_query("crand"))
# >> [2]
# >> [1]
# >> [3, 4]

Will that do the trick? Tim

It looks like a typo got introduced as you were moving the method into your rails app. There is no values_at call in my method, perhaps you accidentally tab completed and inadvertently introduced the _at. Good luck. Tim

Sorry, Tim. I should have clarified. This is the line that is having issues:

@hits << hash[:id] if hash.values.include?(name)

Specifically, the .values part.

timr wrote:

Sorry, but the code below is really unidiomatic Ruby.

Given the following:

given a data structure like
@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}]
and given a query may be first, last, or both names
return id number for matches

@contacts = [{:first_name=>'tim', :last_name=>'rand', :id => 1},{:first_name=>'jim', :last_name=>'band', :id => 2},{:first_name=>'him', :last_name=>'crand', :id => 3}]

keywords = "ran"

@contacts.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}.map{|hash| hash[:id]}

or more prettily:

@contacts.select do |hash|
  keywords.split.any? do |keyword|
    hash.values.join.include?(keyword)
  end
end. # note the period at the end of this line… indicating we still want to send the result of this select method another message yet… (ie the map message below).
map do |hash|
  hash[:id]
end

=> [1, 3]

if you really need to make a method of it (tho I don’t know why you would), you can do so thusly:

class ArrayOfHashes < Array
  def search_array_with_hashes(keywords)
    found_hashes = self.select{|hash| keywords.split.any?{|keyword| hash.values.join.include?(keyword)}}
    found_hashes.map{|hash| hash[:id]}
  end
end

@contacts = ArrayOfHashes.new(@contacts)

@contacts.search_array_with_hashes("ran")

=> [1, 3]

@contacts.search_array_with_hashes("band")

=> [2]

@contacts.search_array_with_hashes("tim rand")

=> [1, 3]

@contacts.search_array_with_hashes("crand")

=> [3]

@contacts.search_array_with_hashes("jam")

=> []

Julian's solution is more elegant. I like it. It has a functional difference in that it catches partial names--for instance, tim would match timothy (probably good in this case)--but rand would also match crand.

That being the case, searching tim matches [1], but tim rand matches [1, 3]. More information leads to less specificity. When a perfect match is available, that should be the only item returned, I would think.
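One way to get that behaviour is to try an exact match first and fall back to partial matching only when nothing matches exactly. This is a rough sketch and not anyone's posted solution; search_ids and name_fields are made-up names, and it assumes only the two name fields should be searched.

```ruby
contacts = [
  { :first_name => 'tim', :last_name => 'rand',  :id => 1 },
  { :first_name => 'jim', :last_name => 'band',  :id => 2 },
  { :first_name => 'him', :last_name => 'crand', :id => 3 }
]

# Hypothetical helper: the lowercased name fields of one contact.
def name_fields(hash)
  [hash[:first_name], hash[:last_name]].map { |name| name.to_s.downcase }
end

# Hypothetical helper: prefer contacts where every keyword equals a name
# field exactly; fall back to substring matching only if there are none.
def search_ids(contacts, keywords)
  words = keywords.downcase.split
  return [] if words.empty?

  exact = contacts.select do |hash|
    words.all? { |word| name_fields(hash).include?(word) }
  end
  partial = contacts.select do |hash|
    words.any? { |word| name_fields(hash).any? { |name| name.include?(word) } }
  end

  (exact.empty? ? partial : exact).map { |hash| hash[:id] }
end

p search_ids(contacts, 'tim rand') # prints [1]
p search_ids(contacts, 'rand')     # prints [1]
p search_ids(contacts, 'ran')      # prints [1, 3]
```

With this, adding a second name narrows the result instead of widening it, while a fragment such as "ran" still finds both rand and crand.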