Question on application/database design for an application port to rails

Hi all

I am hoping that the experience and knowledge of this 'list' will be able to help me out with some design decisions I have to make while porting a desktop app to rails.

Here is the scoop: I am in the design process of porting a fairly large client/server app to rails. Average data set is about 200 MB per database/server. Altogether there are about 100 tables and about 500 stored procedures. Obviously this is a complete rewrite of the app, but while I am at it I might as well solve some outstanding issues. Anyhow here are the questions:

1. Has anybody had any experience with very large databases using rails? Did you use one database per client or multiple "client data sets" in one database?

2. If I decide to go with multiple client datasets stored in one database is there an easy way for active record to limit result sets by login id or client id?

Obviously it would make a large difference in development time if you could use something like Customer.find_by_name vs Customer.find (search by name and loginid or some other restriction to the dataset).

I know this example is trivial but there will be a lot of queries to the database that will have to restrict the data based on who is logged in and this will all add up.

3. And finally a question about REST. Does anybody know if the original philosophy of developing apps in rails (controller/action/id) will be deprecated and eliminated in future versions of Rails in favor of REST? I have dabbled with some restful controllers and to be truthful I prefer the original way. There are some scenarios where I will use REST but I would hate to be forced to do the complete app in a restful way.

Thanks, I would really appreciate any input.


Rafael Szuminski wrote:

I am in the design process of porting a fairly large client/server app to rails. Average data set is about 200 MB per database/server. Altogether there are about 100 tables and about 500 stored procedures. Obviously this is a complete rewrite of the app, but while I am at it I might as well solve some outstanding issues. Anyhow here are the questions:

Before you go farther, think about how you will write unit tests for all that.

For example, in some circumstances, I would write tests on the original system, and then port the tests as I port the code. I'm aware that myriad cultural and technical issues might conflict with that goal. But your job now is to extract as many latent business rules from the existing system as you can, so don't go overlooking any.

Next, why are you rewriting? A noted process guru, either Ward Cunningham or Martin Fowler, recommends a "strangler fig" strategy. If you look up the lifecycle of a strangler fig you'll understand immediately. You should ask the customer what's the most important feature to add to the existing system, and you add it using Rails. Then you also add a few easy features that replace the existing system - "for free". Repeat like this, ostensibly adding requested features with replacement features (and releasing them) until your fig completely covers and strangles this obsolete tree.
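
To make the strangler idea concrete, here is a minimal sketch in plain Ruby of a front door that sends already-ported features to the new app and everything else to the old system. The class name, the feature names, and the :rails/:legacy labels are all invented for illustration; nothing here comes from the thread or from Rails itself.

```ruby
# Hypothetical sketch of strangler-style dispatch: features that have been
# ported go to the new app, everything else still goes to the legacy system.
class StranglerDispatcher
  def initialize
    @ported = [] # names of features already rewritten (hypothetical)
  end

  # Record that a feature now lives in the new application.
  def port!(feature)
    @ported << feature unless @ported.include?(feature)
    self
  end

  # Decide which system should serve a given feature.
  def backend_for(feature)
    @ported.include?(feature) ? :rails : :legacy
  end

  # Fraction of the known features that the new app has taken over.
  def coverage(all_features)
    return 0.0 if all_features.empty?
    (all_features & @ported).size.to_f / all_features.size
  end
end
```

Each release would add one more port! call, and the old system can be switched off once coverage reaches 1.0.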

Your current plan expects to work a very long time without any releases - that is a super-bad strategy because the longer you go, the higher the risk of a mismatch between what you write and what users need.

1. Has anybody had any experience with very large databases using rails?

Yes. Databases are screwy specifically because they enable huge data sets, so I know if I use ActiveRecord correctly, and add a few foreign keys to speed up queries, I should be safe.

If this is important, some of your early unit tests should deal with ten billion records, to see what happens. (I would give such a test batch a "fast mode" that only deals with ten records for most runs, and run the "slow mode" only after upgrading the database.)
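
A rough sketch of the fast/slow switch, in plain Ruby. SLOW_TESTS is an invented environment variable (not a Rails convention), the helper names are hypothetical, and the slow count is just a placeholder you would scale to your own data volume.

```ruby
# Hypothetical helper: choose how many records a volume test should create.
# SLOW_TESTS is an invented environment variable, not a Rails convention.
FAST_RECORD_COUNT = 10
SLOW_RECORD_COUNT = 1_000_000 # placeholder; scale to what "large" means for you

def volume_test_record_count(env = ENV)
  env["SLOW_TESTS"] == "1" ? SLOW_RECORD_COUNT : FAST_RECORD_COUNT
end

# Build the rows a test would insert; a real test would save model objects.
def build_customer_rows(count)
  (1..count).map { |i| { :id => i, :name => "customer #{i}" } }
end
```

Most runs would leave SLOW_TESTS unset and exercise ten rows; the slow run is something you kick off by hand when the database changes.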

You need to learn "migrations", because in your case you might find yourself actually migrating data out of the old database and into the new one. Alternately, you could simply point your database.yml at the old database, and set its table names in ActiveRecord explicitly.

Next, always test that your models have a working destroy. (Note that some databases should never destroy records, and should use acts_as_versioned instead.) Either way, destroy will fail if a foreign key would break, so add lots of unit tests that destroy things, as you add tests and features that construct things.

Did you use one database per client or multiple "client data sets" in one database?

I would do the simplest thing that could possibly work here. Databases are designed to index and cross-ref arbitrarily huge data blocks, so if the engine doesn't care about multiple customer databases, then I don't.

2. If I decide to go with multiple client datasets stored in one database is there an easy way for active record to limit result sets by login id or client id?

Read /Rails Recipes/ (it covers some of your other questions), and then use with_scope(:find) to run a set of find(:all) commands across one subset of the database, not the whole table.
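
To show the idea behind with_scope without needing a database, here is a plain-Ruby stand-in. It is NOT ActiveRecord: FakeModel, find_all, and the hash-based conditions are all invented for illustration. It only mimics the behaviour that matters here, namely that conditions set for a block are applied to every find made inside that block.

```ruby
# Plain-Ruby stand-in for a model class, NOT ActiveRecord. It only mimics
# the idea behind with_scope: conditions set for a block are merged into
# every find made inside that block.
class FakeModel
  @rows = []
  @scope = {}

  class << self
    attr_accessor :rows

    # Run the block with extra conditions applied to every find.
    def with_scope(conditions)
      previous = @scope
      @scope = previous.merge(conditions)
      yield
    ensure
      @scope = previous
    end

    # Return rows matching the caller's conditions AND the current scope.
    # The scope is merged last so a caller cannot override it.
    def find_all(conditions = {})
      wanted = conditions.merge(@scope)
      rows.select { |row| wanted.all? { |key, value| row[key] == value } }
    end
  end
end

FakeModel.rows = [
  { :account_id => 1, :name => "Smith" },
  { :account_id => 2, :name => "Smith" },
  { :account_id => 1, :name => "Jones" }
]

# Inside the block, every find is limited to account 1.
scoped = FakeModel.with_scope(:account_id => 1) do
  FakeModel.find_all(:name => "Smith")
end
```

Outside the block the scope is gone again, so the same find_all call sees both Smith rows.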

Obviously it would make a large difference in development time if you could use something like Customer.find_by_name vs Customer.find (search by name and loginid or some other restriction to the dataset).

Maybe. I have found that unit testing has a much greater influence on development time. For example, if I want to write a cheap simple command, but don't know if it will work as well as a long crufty command, I can run the tests and see.

I know this example is trivial but there will be a lot of queries to the database that will have to restrict the data based on who is logged in and this will all add up.

That is /Rails Recipes's/ exact scenario for with_scope.

3. And finally a question about REST. Does anybody know if the original philosophy of developing apps in rails (controller/action/id) will be deprecated and eliminated in future versions of Rails in favor of REST?

Balderdash. controller/action/id IS the simplest form of REST. It's here to stay.

I have dabbled with some restful controllers and to be truthful I prefer the original way. There are some scenarios where I will use REST but I would hate to be forced to do the complete app in a restful way.

What URLs would your customers like?

1. Has anybody had any experience with very large databases using rails? Did you use one database per client or multiple "client data sets" in one database?

This one isn’t for me.

2. If I decide to go with multiple client datasets stored in one database is there an easy way for active record to limit result sets by login id or client id?

Obviously it would make a large difference in development time if you could use something like Customer.find_by_name vs Customer.find (search by name and loginid or some other restriction to the dataset).

I know this example is trivial but there will be a lot of queries to the database that will have to restrict the data based on who is logged in and this will all add up.

You said it:

    Customer.find_by_name

Also:

    Customer.find_by_name_and_id
    Customer.find(:all, :conditions => ['name = ? and id = ?', name, id])

Also, the associations that you make in Rails allow you to restrict stuff by default. So, firstly, if you want to find all the (just making this up) locations belonging to a single user, you can do:

    class User < ActiveRecord::Base
      has_many :locations
    end

    class Location < ActiveRecord::Base
      belongs_to :user
    end

Then, when you type 'user.locations', it will automatically perform the search as:

    Location.find(:all, :conditions => ['user_id = ?', user.id])
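
If it helps to see what the association buys you, here is a plain-Ruby stand-in with no ActiveRecord involved. The Struct and the User class below are invented for illustration; the locations method only imitates the filtering that a has_many association does for you.

```ruby
# Plain-Ruby imitation of a has_many lookup, NOT ActiveRecord.
Location = Struct.new(:id, :user_id, :name)

class User
  attr_reader :id

  def initialize(id, all_locations)
    @id = id
    @all_locations = all_locations # stands in for the locations table
  end

  # Stand-in for what has_many :locations provides: only rows whose
  # user_id matches this user are returned.
  def locations
    @all_locations.select { |location| location.user_id == id }
  end
end
```

The point is that code going through user.locations never has to remember the user_id condition, which is the same property you want for a per-client data set.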

3. And finally a question about REST. Does anybody know if the original philosophy of developing apps in rails (controller/action/id) will be deprecated and eliminated in future versions of Rails in favor of REST? I have dabbled with some restful controllers and to be truthful I prefer the original way. There are some scenarios where I will use REST but I would hate to be forced to do the complete app in a restful way.

The chance of this being completely deprecated in favor of REST is very, very small. REST seems, at least to me, to be a good design principle, and it is extremely good for machine-to-machine communication, but there are times when violating REST, or just programming an app with a different design, can be desired or even required. That being said, as Rails is a “convention over configuration” type of framework, expect that they will be adding more and more shortcuts for developing REST in, if that’s the most common usage… just don’t expect them to remove the ability to do anything other than REST.


see inline

Rafael Szuminski wrote:

Hi all

I am hoping that the experience and knowledge of this 'list' will be able to help me out with some design decisions I have to make while porting a desktop app to rails.

Here is the scoop: I am in the design process of porting a fairly large client/server app to rails. Average data set is about 200 MB per database/server. Altogether there are about 100 tables and about 500 stored procedures. Obviously this is a complete rewrite of the app, but while I am at it I might as well solve some outstanding issues. Anyhow here are the questions:

1. Has anybody had any experience with very large databases using rails? Did you use one database per client or multiple "client data sets" in one database?

2. If I decide to go with multiple client datasets stored in one database is there an easy way for active record to limit result sets by login id or client id?

Obviously it would make a large difference in development time if you could use something like Customer.find_by_name vs Customer.find (search by name and loginid or some other restriction to the dataset).

I haven't seen anything in RoR that will automatically restrict the select set based on some condition like that, so you will probably need to do it yourself.

I know this example is trivial but there will be a lot of queries to the database that will have to restrict the data based on who is logged in and this will all add up.

3. And finally a question about REST. Does anybody know if the original philosophy of developing apps in rails (controller/action/id) will be deprecated and eliminated in future versions of Rails in favor of REST? I have dabbled with some restful controllers and to be truthful I prefer the original way. There are some scenarios where I will use REST but I would hate to be forced to do the complete app in a restful way.

You are able to configure how your URLs are routed to your controllers and actions by modifying the application's config/routes.rb file. I don't believe that RoR will dictate that you should use RESTful URIs in your application.
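
For anyone unsure what the controller/action/id convention amounts to, here is a small plain-Ruby sketch that splits a path the same way. It is not Rails' router; parse_classic_route is a made-up name, and defaulting the action to "index" is an assumption made for the sketch.

```ruby
# Hypothetical sketch, not Rails' router: split a path according to the
# classic ":controller/:action/:id" pattern, with "index" as default action.
def parse_classic_route(path)
  controller, action, id = path.sub(%r{\A/}, "").split("/", 3)
  return nil if controller.nil? || controller.empty?

  action = "index" if action.nil? || action.empty?
  { :controller => controller, :action => action, :id => id }
end
```

So "/customers/show/42" becomes the customers controller, the show action, and an id of "42", while "/customers" falls back to the index action with no id.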

cheers </jima>


Thanks for your reply. To clarify a few things:

1. The reason for a port is that my customers are demanding a web/online version of my application.

2. I have always written everything in chunks and released in chunks, so no plans to change there. In addition, KISS is my mantra ;)

3. Of course I will use migrations for the db, although I might stick with ar_fixtures for data migration.

Before you go farther, think about how you will write unit tests for all that.

Well, going with one database per client would simplify writing tests, but I don't think it will be the deciding factor. But I will have to take a closer look at it anyhow since in the past I have neglected some of the testing.....

Yes. Databases are screwy specifically because they enable huge data sets, so I know if I use ActiveRecord correctly, and add a few foreign keys to speed up queries, I should be safe.

Yes they are, aren't they? The current back-end for my desktop app is Firebird, which is a very fine database but also a very unforgiving one. If you write sloppy SQL it will kill your performance. I thought about using it with AR, but the Firebird adapter is still fairly new and I don't know if it will do the job. Also, some of the joins auto-created by AR will never perform well with Firebird, so it would require quite extensive use of stored procs. Well, that leaves PostgreSQL or MSSQL. (MySQL is not an option for me.)

Read /Rails Recipes/ (it covers some of your other questions), and then use with_scope(:find) to run a set of find(:all) command across one database subset, not :all.

I looked up the nested_with_scope plugin and it looks very promising. I will have to do some performance tests and see the SQL output.

Balderdash. controller/action/id IS the simplest form of REST. It's here to stay.

Let's hope.

This is just an fyi so that anybody searching the list archives gets to see a solution to this problem. Anyhow, the key was with_scope (thanks Phlip).

I have decided to store all data sets in one database and just add an account_id column to each table. This will allow for searching with account.customers.find or account.orders.find etc.

In addition, each model (except account) will have the following two methods added:

    def self.find(*args)
      with_scope :find => {:conditions => ['account_id=?', '123456789012345678901234567890XX']} do
        super
      end
    end

    def self.method_missing(method, *args, &block)
      with_scope :find => {:conditions => ['account_id=?', '123456789012345678901234567890XX']} do
        super(method, *args, &block)
      end
    end

where the account id will be dynamic based on the user currently logged in. These two methods will allow for searching by Customer.find(:all) or Customer.find_by_last_name and still have the result set filtered by account id.
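
For archive readers wondering how the "dynamic" part might hang together, here is a plain-Ruby sketch. It is not the code from my app and not ActiveRecord: CurrentAccount, ScopedCustomer, the thread-local storage, and the in-memory ROWS are all assumptions made for illustration. It shows an account id set once per request and then applied both to a plain find and to find_by_* style calls that arrive through method_missing.

```ruby
# Hypothetical holder for the logged-in account. A controller filter would
# set this once per request; thread-local storage is an assumption here.
module CurrentAccount
  def self.id
    Thread.current[:account_id]
  end

  def self.id=(value)
    Thread.current[:account_id] = value
  end
end

# Plain-Ruby stand-in for a model, NOT ActiveRecord.
class ScopedCustomer
  ROWS = [
    { :account_id => 1, :last_name => "Smith" },
    { :account_id => 2, :last_name => "Smith" },
    { :account_id => 1, :last_name => "Jones" }
  ].freeze

  # Every lookup is restricted to the current account.
  def self.find_all(conditions = {})
    account_id = CurrentAccount.id
    raise ArgumentError, "no account selected" if account_id.nil?

    wanted = conditions.merge(:account_id => account_id)
    ROWS.select { |row| wanted.all? { |key, value| row[key] == value } }
  end

  # Dynamic finders such as find_by_last_name go through find_all,
  # so they are filtered by account as well.
  def self.method_missing(name, *args, &block)
    attribute = name.to_s[/\Afind_by_(\w+)\z/, 1]
    return super unless attribute

    find_all(attribute.to_sym => args.first).first
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end
```

With CurrentAccount.id set to 1, ScopedCustomer.find_by_last_name("Smith") only ever sees account 1's Smith, and switching the id to 2 switches the result without touching the calling code.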

I have not implemented the :create option for with_scope because each of my models already has a before_create method, implemented in a helper, that sets the id to a GUID, so it was a no-brainer to set the account id there as well.
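
Roughly what that helper does, as a plain-Ruby sketch. prepare_for_create is a made-up name, the attributes are a plain hash rather than a model, and SecureRandom.uuid is used as the GUID source purely for illustration (it produces a 36-character hyphenated value, which may not match the id format in your own schema).

```ruby
require "securerandom"

# Hypothetical stand-in for a before_create hook: give the new record a
# GUID primary key and stamp it with the current account's id.
def prepare_for_create(attributes, account_id)
  attributes.merge(:id => SecureRandom.uuid, :account_id => account_id)
end
```

Because the account id is stamped at creation time, new rows land in the right data set without any extra scoping on create.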

Anyhow, thanks for the help
