Architecture question

Hi guys,

I need to make an "architecture" choice for my app. I believe it corresponds to a fairly common case, yet I can't find much input online. I was hoping you could give me some recommendations or point me in the right direction.

The app is a kind of mashup, so it does two things:
- handle user requests. I call it the "live" part.
- at regular intervals, build cache information, then merge it with the main info. This is my "background" part.

The background part will run once a day, and it takes a lot of time and resources (it connects to many web services).

So here are my initial thoughts:
- create a background process for the "background" part
- have the main process handle the "live" part.

But if I do this, a few things are not obvious to me, especially since it involves threads:

1- in the background process, I will have to run threads (indeed I need to run the requests to the web services in parallel, otherwise it will take too long). Is that doable or is it "dangerous"?
2- can the cache tables be in the same database as the main one, or shall I create a totally different DB?
3- when the merging happens, will it "work" (since both the background process and the main one will want to deal with the same database)?

Please do let me know if I am getting the whole thing wrong (or if I see problems where they don't exist). I keep reading that I need to be very careful with threads/processes and connections to the DB, but I am not quite sure where the limit is or which design choices are really bad.

Thanks a lot!
Pierre

Quoting PierreW <wamrewam@googlemail.com>:

> Hi guys,
>
> I need to make an "architecture" choice for my app. I believe it corresponds to a fairly common case, yet I can't find much input online. I was hoping you could give me some recommendations or point me in the right direction.
>
> The app is a kind of mashup, so it does two things:
> - handle user requests. I call it the "live" part.
> - at regular intervals, build cache information, then merge it with the main info. This is my "background" part.

There are a number of Ruby/Rails background processors (Workling/Starling, BackgrounDRb, etc.). Most are multi-threaded or multi-process. Have a cron job (or use any built-in cron-style scheduling) dump a batch of web service requests into the queue and let the background processor handle the multi-tasking.
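If you just want to see the shape of the "queue plus workers" idea before picking one of those libraries, here is a minimal sketch in plain Ruby. It is not how any of the processors above work internally. The names `fetch_service` and `run_in_parallel` are made up for illustration, and the web service call is simulated with a short sleep, so swap in your real HTTP code.

```ruby
# Hypothetical stand-in for one web service call (assumption: the real
# version would do an HTTP request and return the parsed response).
def fetch_service(name)
  sleep 0.01 # simulate network latency
  "payload for #{name}"
end

# Run the given block once per job on a fixed number of worker threads.
# Returns a hash mapping each job to the block's result.
def run_in_parallel(jobs, worker_count: 4, &work)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and drained, pop returns nil and workers stop

  results = {}
  lock = Mutex.new # protects the shared results hash

  workers = Array.new(worker_count) do
    Thread.new do
      while (job = queue.pop)
        value = work.call(job)
        lock.synchronize { results[job] = value }
      end
    end
  end

  workers.each(&:join) # wait for every worker; re-raises a worker's error
  results
end

services = %w[weather news stocks maps photos]
results = run_in_parallel(services, worker_count: 3) { |name| fetch_service(name) }
```

Threads are fine for this kind of work because the workers spend most of their time waiting on the network. The only shared state here is the `results` hash, and every write to it goes through the mutex.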

Any decent database server can handle requests from multiple processes. You may need to use some kind of locking (table locking, row locking, transactions) to handle overlapping read/update/write requests. I know MySQL and Postgres can handle this. MSSQL probably can too (I don't work with Windows). I don't know enough about SQLite to say.

Use any of the usual Rails deployment options (Mongrel, Passenger, Apache, Nginx, ...) for the live part. Start with one of the debug/development-friendly, single-threaded servers (WEBrick and others). Deploy on a heavier-duty server that can handle the expected load, or at least the load until your idea proves itself, or disproves itself, with real users.

There are a number of solutions of varying scale that have relatively light switching costs. Start small and friendly and then swap pieces of the solution as the load increases.

And don't chase the latest and greatest, "best" technology at the expense of developing your idea.

HTH,
Jeffrey