Which is faster: searching in a file or in a database?

Frederick_Cheung · December 26, 2007, 4:34pm

Hello,

I have ten CSV files with about 20,000 rows in each. If I want to search these files, for example to get every row that has the word "apple" in a column, which is faster:

If I read all ten files into the database and then search in the database

or

If I search directly in the files for this information?

Thank you! Please explain your answer and include code to do the search.

Assuming you aren't processing a fresh set of CSV files, I'd be very
surprised if parsing the CSV files in Ruby each time wasn't slower
than sticking it in the database. It's what databases are supposed to
do. That's just my hunch though, and I'm afraid I've got enough of my
own work to do before doing yours.

Fred
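
For comparison, here is a minimal sketch of the file-scanning side. It assumes a recent Ruby with the standard csv library and that each file starts with a header row. The method name rows_matching and the column name passed to it are made up for illustration and are not from the original post.

```ruby
require "csv"

# Return every row (as a Hash) from the given CSV files whose value in
# `column` contains `word`. Assumes each file has a header row; `column`
# is whatever header name the files actually use (hypothetical here).
def rows_matching(paths, column, word)
  matches = []
  paths.each do |path|
    # CSV.foreach streams the file one row at a time instead of loading it all.
    CSV.foreach(path, headers: true) do |row|
      value = row[column]
      matches << row.to_h if value && value.include?(word)
    end
  end
  matches
end
```

Called as something like rows_matching(Dir.glob("*.csv"), "fruit", "apple"), this re-parses every file on every search, which is the repeated cost that loading the rows into a database once would avoid.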

Bala_Paranj · December 26, 2007, 8:48pm

You have to do the validation and have a valid CSV file to parse it correctly. Correct the file and try again.

Greg_Donald1 · December 27, 2007, 3:50pm

In bash, using a temp file, it'd be something like:

cp foo.csv foo.csv.tmp
sed -e 's/"//g' foo.csv.tmp >foo.csv
rm foo.csv.tmp

Mark_Wilden · December 28, 2007, 3:04am

There's always a text editor.

Seriously, using the IO class to 1) read each line of the input file, 2) use gsub to remove the quotes, and 3) write each line of the file, sounds to me like a useful exercise.

///ark
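
The three steps described above (read each line, gsub the quotes away, write each line) could be sketched roughly like this. The method name strip_quotes and the file names in the usage note are placeholders, not anything from the thread.

```ruby
# Copy `src` to `dst`, removing every double-quote character on the way.
# Writes to a separate output file so the input is never half-overwritten.
def strip_quotes(src, dst)
  File.open(dst, "w") do |out|
    # File.foreach reads one line at a time, so large files are fine.
    File.foreach(src) do |line|
      out.write(line.gsub('"', ""))
    end
  end
end
```

For example, strip_quotes("foo.csv", "foo_clean.csv"). Note that stripping every quote is only safe if no quoted field contains a comma, since the quotes are what keep such a field in one column.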