way to divide long article and store in database

I wonder if a Ruby on Rails developer has encountered this before: suppose there is a long article (say 100,000 words), and I need to write a Ruby file to display page 1, page 2, or page 38 of the article, via

display.html.erb?page=38

but the number of words per page can change over time (for example, it is 500 words per page right now, but next month we might easily change it to 300 words per page). What is a good way to divide the long article and store it in the database?

P.S. The design may be complicated if we want to display 500 words but include only whole paragraphs. That is, if we are already at word 480 and the current paragraph has 100 more words remaining, show those 100 words anyway even though that exceeds the 500-word limit.
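The paragraph-rule in the P.S. could be sketched in plain Ruby like this. The method name and the 500-word default are illustrative, not from any existing API: each page keeps absorbing whole paragraphs until it reaches or passes the word budget.

```ruby
# Paragraph-aware pagination sketch: never split a paragraph across
# pages; a page may run over the word budget to finish its last paragraph.
def paginate_paragraphs(text, words_per_page = 500)
  pages = []
  current = []       # paragraphs collected for the page being built
  count = 0          # words on the page being built

  text.split(/\n{2,}/).each do |para|
    current << para
    count += para.split.size
    if count >= words_per_page     # budget met: close out this page
      pages << current.join("\n\n")
      current = []
      count = 0
    end
  end
  pages << current.join("\n\n") unless current.empty?
  pages
end
```

With four 300-word paragraphs and a 500-word budget, this yields two pages of two paragraphs (600 words) each, matching the "finish the paragraph anyway" rule.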

Make each page a text file, put them all in a directory (document/1.txt, document/2.txt, etc), and then you won't even have to use the database.
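A minimal sketch of this file-per-page idea: the action just reads document/<page>.txt from disk. The helper name, the relative path, and the Integer() guard against non-numeric params are assumptions, not from the post.

```ruby
# Read one pre-split page from disk; returns nil for a missing page.
def page_text(page)
  n = Integer(page)                         # reject non-numeric page params
  path = File.join("document", "#{n}.txt")
  File.exist?(path) ? File.read(path) : nil
end
```

Changing from 500 to 300 words per page then means regenerating the files once; the reading code stays the same.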

Jian Lin wrote:

I wonder if a Ruby on Rails developer has encountered this before: suppose there is a long article (say 100,000 words), and I need to write a Ruby file to display page 1, page 2, or page 38 of the article, via

display.html.erb?page=38

but the number of words per page can change over time (for example, it is 500 words per page right now, but next month we might easily change it to 300 words per page

Why divide it in the database? Store it in one field in the database, and when you fetch it from the database just perform the logic to find page 38 and then display that.
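This store-it-whole approach might look like the sketch below in plain Ruby (the constant and method name are illustrative). Changing from 500 to 300 words per page is then a one-line change, since nothing in the database encodes the page size.

```ruby
WORDS_PER_PAGE = 500  # change to 300 later without touching stored data

# Slice the requested page out of a text stored whole in one field.
def page_of(text, page)
  words = text.split
  start = (page - 1) * WORDS_PER_PAGE
  return "" if start >= words.size       # past the end of the article
  words[start, WORDS_PER_PAGE].join(" ")
end
```

For a request like display.html.erb?page=38, the action would fetch the article row and call page_of(article_text, 38) before rendering.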

If actual testing indicates that's too slow with the quantity of data you expect, then you'd have to perform a word-boundary calculation when inserting the value into the db, and store the results as an 'index' into the text somehow.
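One hedged sketch of such an 'index': at insert time, record the character offset at which each page starts, so a later read can fetch just that slice (for example with a server-side substring function) instead of the whole column. The method name is made up; note the final offset falls at the end of the text if the word count is an exact multiple of the page size.

```ruby
# Precompute character offsets of page starts for a fixed page size.
# These offsets would be stored alongside the article at insert time.
def page_offsets(text, words_per_page = 500)
  offsets = [0]
  count = 0
  text.scan(/\S+/) do
    count += 1
    # Each page boundary starts right after the last word of a full page.
    offsets << Regexp.last_match.end(0) if count % words_per_page == 0
  end
  offsets
end
```

Recomputing the offsets once is also all it takes when the page size changes from 500 to 300.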

Either way, I don't see any reason to actually split up the text in the db. Unless you want to let the user _search_ for, say, word X on page N of the text. But then you're getting into complicated enough text-searching land that I'd investigate using something like lucene/solr to index your text instead of an rdbms, and see what support something like lucene/solr has for page-boundary-based searching.

Jonathan Rochkind wrote:

Jian Lin wrote:

I wonder if a Ruby on Rails developer has encountered this before: suppose there is a long article (say 100,000 words), and I need to write a Ruby file to display page 1, page 2, or page 38 of the article, via

display.html.erb?page=38

but the number of words per page can change over time (for example, it is 500 words per page right now, but next month we might easily change it to 300 words per page

Why divide it in the database? Store it in one field in the database, and when you fetch it from the database just perform the logic to find page 38 and then display that.

Is it true that if all 100,000 words are in one record (one row), then the whole field needs to be retrieved every time? If we assume one word is about 6 characters long (including the space), then that is 600 kbytes per read. I hope to make it "read as needed": 500 words and about a 3 kbyte read per page each time.

If you *must* split it up in the database, changing your mind from 500 to 300 is going to suck; otherwise you might use a "pages" association or something of the like, which would be very simple...

for instance:

class Article < ActiveRecord::Base
  has_many :pages

  validates_presence_of :text

  after_create do
    i = 0
    b = text.scan(/\b\S+\b/)
    b.each_slice(500) do |x|
      self.pages.create(:page => i += 1, :text => x.join(" "))
    end
  end
end

class Page < ActiveRecord::Base
  belongs_to :article
end

Someone probably has a MUCH prettier way of doing this; it was just written on the fly...

Cheers!