Which is faster: searching in a file or in a database?


I have ten CSV files with about 20,000 rows in each. If I want to search these files, for example to get every row that has the word "apple" in a column, which is faster:

If I read all ten files into a database and then search in the database, or


If I search directly in the files for this information?

Thank you! Please motivate your answer and include code to make the search.
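For what it's worth, the second option (scanning the files directly) can be sketched with Ruby's standard csv library. The file list, column index, and search term here are illustrative assumptions, not from the thread:

```ruby
require "csv"

# Scan a list of CSV files and collect every row whose given column
# contains the search term. Paths and column index are examples.
def search_csv_files(paths, column_index, term)
  matches = []
  paths.each do |path|
    CSV.foreach(path) do |row|
      matches << row if row[column_index].to_s.include?(term)
    end
  end
  matches
end

# e.g. search_csv_files(Dir.glob("data/*.csv"), 1, "apple")
```

This re-reads and re-parses every file on every search, which is the cost the database approach avoids.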

Assuming you aren't processing a fresh set of CSV files each time, I'd be very
surprised if parsing the CSV files in Ruby on every search wasn't slower
than loading them into the database once. Searching is what databases are
built to do. That's just my hunch, though, and I'm afraid I've got enough of my
own work to do before doing yours :slight_smile:


You have to validate the data and have a valid CSV file before you can parse it correctly. Correct the file and try again.

In bash, using a temp file, it'd be something like:

cp foo.csv foo.csv.tmp
sed -e 's/"//g' foo.csv.tmp >foo.csv
rm foo.csv.tmp

There's always a text editor. :slight_smile:

Seriously, using the IO class to 1) read each line of the input file, 2) use gsub to remove the quotes, and 3) write each line to an output file sounds to me like a useful exercise.
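The exercise described above might be sketched like this; the file paths are placeholders:

```ruby
# Read each line of the input file, strip double quotes with gsub,
# and write the result to a new file (paths are examples).
def strip_quotes(in_path, out_path)
  File.open(out_path, "w") do |out|
    File.foreach(in_path) do |line|
      out.write(line.gsub('"', ""))
    end
  end
end

# e.g. strip_quotes("foo.csv", "foo_clean.csv")
```

Note this blindly removes every quote, just like the sed one-liner; fields that legitimately contain quoted commas would need a real CSV parser instead.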