Rails 3 application capable of generating an offline version of itself for download as zip archive

I'm kinda newbie in RoR yet and I'm having a hard time trying to figure
out how should I implement this. I'm writing an application to store and
display information about insects and their distribution. Currently I
have almost all functionality implemented, except for a **very**
important one: The application must be capable of "crawling" itself and
generate a zip archive for download. Actually, crawling itself isn't
accurate enough, since the views must be sightly different (e.g. don't
provide functionality not available without Internet connection,
indicate in the title that the page is an offline copy, etc).

The question is: Do you have any suggestions as to how I should
implement this?

One approach I had in mind (although I don't know how to program it),
would be calling from the controller that triggers the archive
generation, all the publicly accessible controllers and each of them
providing a non-routable method which uses the offline templates. This
method would be an index-like one except that it will repeatedly use the
offline view for #show, rendering to string and storing in a stream.

Another approach, let the zip generation controller access the views
from all the resources and iterate over all model data, making a really
centralized and big controller.

And lastly, make all public controllers check for "/static/" to tell
them to use the offline templates and then use some self crawling by
iterating over all model data. (For some reason, trying to self-crawl
didn't work for me in development even when using Thin which explicitly
advertises ">> Maximum connections set to 1024" when it boots up. The
problem would probably solve itself by using delayed job, though,
haven't tested yet.)

All the controllers dealing with resources were created with "rails g
scaffold ...".

In all the "solutions" above (except maybe in the third method) are
missing a procedure to include all assets in the zip archive properly
(i.e. enumerate them all and store them with the correct file name).

I'll be extremely grateful for any suggestions about how should I tackle
this problem!

I would check out something like Jekyll.

https://github.com/mojombo/jekyll

A tip when searching for a RoR version of XYZ, Google something like “XYZ ruby rails github gem”. There is a gem to accomplish most complex/mundane tasks.

I would let something like Jekyll handle the static site creation. You could then have a ruby script create a custom config file for each type site or actually create another gem that handles that.

Have everything run in the background like so:

  1. Request comes in for new static build.
  2. Ruby script initializes and custom variables
  3. Ruby script creates unique temp folder. Keep track of this folder so it can be logged and deleted.
  4. Write log to database that process has started.
  5. Ruby script runs jekyll command to create static site.
  6. If no errors write to db log that process completed. Log error otherwise and notify admin via email.
  7. Zip temp directory and notify db log process is complete and zip ready.
  8. Notify user via email or prompt that zip is ready.
  9. Push zip to browser with new pretty file name.
  10. Delete temp zip and directory. Log in db that file was successfully delivered.

This process under a heavy server load can take a while. Use a background task gem that logs to a DB or roll your own.

Utilize your Linux server via bash as much as possible. You can control your server via Ruby. The system in many instances will run tasks like zipping way faster than Ruby. Use Cron for scheduled tasks and the “God” gem for crash detection.

Use good error handling as much as possible and log all “states” so you can debug where issues arise.

Good Luck. I hope I addressed your issue.

My suggestion is I hope simple. Use wget to crawl/mirror the site, using
a query string parameter to indicate you want the "offline" views -- you
still need to implement them if they are different enough -- by checking
that the special parameter is set; you should be able to set it just
once for the session, and have wget use cookies to maintain the session
info.

Another alternative instead of the query string parm could be using the
user agent string wget sends, and always deliver the "offline" version
to that UA string.

The mirroring will pull all the urls that are included under the main
one. If your assets are not under that main url, this won't work. You
can tell wget to pull from elsewhere, but it can easily get out of
hand.

Hope this helps.

Just throwing in an idea, assuming you would want to download these for
documentation/printing purposes :

Why not generate pdf's of all the entries on your website using a gem,
saving them to a specific folder, zip them up with another gem and then
force a download when entering a specific url/triggering a specific
action?

Here's some information to get your adventure started:
http://railscasts.com/episodes/78-generating-pdf-documents
http://railscasts.com/episodes/153-pdfs-with-prawn
http://rubyzip.sourceforge.net/
http://stackoverflow.com/questions/11547689/zipping-files-in-a-directory-with-ruby-on-rails
http://stackoverflow.com/questions/1544446/force-download-of-file-in-the-public-directory-from-a-controller-action

Good luck! If you have more questions, let me know.

Cheers,
Nick

It sounds like you might want to implement an offline app instead. There is browser support in most browsers for appcache to make all your assets available offline, and there is the javascript call navigator.onLine that will tell you if you are offline or not. You’ll have to create and maintain an appcache file. There is a gem that handles that, but I ended up just creating an appcache controller and serving it myself.

This is quite a fundamental architecture change, though, so you might be too far along in development to do it. You could create a small sample app to prove the concept, then drag all your existing code into it.

See http://railscasts.com/episodes/247-offline-apps-part-1 for an introduction.