BackgrounDRb 1.0 pre-release available now

Hi Folks,

We are glad to announce a shiny new release of BackgrounDRb, which will soon become 1.0.

A quick summary of changes:

- BackgrounDRb is no longer DRb; it's based on the event-driven network programming library packet ( http://www.packet.googlecode.com ).

- Since we moved to packet, many nasty thread issues and result hash corruption issues are totally gone. Lots of work has gone into making the scheduler rock-solid.

- Each worker still runs in its own process, but each worker has an event loop of its own and all the events are triggered by the internal reactor loop. In a nutshell, you are not encouraged to use threads in your workers now. All the workers are already concurrent, but you are encouraged to use co-operative multitasking rather than pre-emptive. A simple example:

  To implement something like a progress bar in the old version of bdrb, you would:
    - start your processing in a thread (so that your worker can receive further requests from Rails) and keep an instance
      variable (protected by a mutex) that is updated as work progresses and can be sent to Rails.

  With the new BackgrounDRb, a progress bar would be:
    - process your damn request and just use register_status() to register the status of your worker. Just because
      you are doing some processing doesn't mean that your worker will block. It can still receive requests from Rails.

- Now you can schedule multiple methods with their own triggers.

- Inside each worker, you can start a TCP server or connect to an external server. Two important methods available in all workers are:

   start_server("localhost",port,ModuleName)
   connect("localhost",port,ModuleName)

  A connected client or outgoing connection is integrated with the event loop, and you can process requests from these guys asynchronously. This mousetrap lets you build truly distributed workers across your network.
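To make the progress bar contrast in the list above a bit more concrete, here is a minimal sketch of the "just register your status as you go" style. It is plain Ruby and does not load BackgrounDRb: ProgressWorker is a hypothetical class, and its register_status is only a stand-in for the real call named above. It shows the status-publishing half only; how the reactor interleaves incoming requests is not modelled here.

```ruby
# Hypothetical stand-in for a worker; this is NOT the real BackgrounDRb base class.
class ProgressWorker
  attr_reader :status

  # Stand-in for register_status(): just remembers the latest value.
  def register_status(value)
    @status = value
  end

  # Handle each item in turn and publish progress after every item,
  # with no extra thread and no mutex around the status.
  def process(items)
    register_status({ done: 0, total: items.size })
    items.each_with_index do |item, index|
      yield item
      register_status({ done: index + 1, total: items.size })
    end
    status
  end
end

worker = ProgressWorker.new
final = worker.process(%w[a b c]) { |item| item.upcase }
puts "#{final[:done]}/#{final[:total]} done"
```

Since only one piece of code touches the status, there is nothing to lock, which is the point of the co-operative style.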

The detailed list of changes can be found here:

http://backgroundrb.rubyforge.org/

Please give it a try and let me know if you find any bugs.

Hi

Does this mean that slave/daemons are not dependencies anymore?

Yes, they're gone. bdrb no longer depends on slave and daemons.

By 'not encouraged' do you mean that 1.0 does not support multiple threads in the worker, or is it just general guidance?

Could you please comment on how you would approach the following scenario with 1.0? Currently, we have a worker that creates threads that process financial payment transactions. An HTTP request sends tens or hundreds of payment transaction records. They are handled by a single worker instance. Within the worker, a pool of threads is created whose size is calculated from the number of transactions. For example, for 200 transactions there will be 20 threads, where each thread handles 10 requests in sequence. Each transaction takes about 3-5 seconds, so our throughput is significantly improved by internal worker parallelization with a thread pool. The worker periodically updates a custom backgroundjob database record, so that a following Ajax request from the client can read the status of the worker process. The job is identified by the worker key, which is stored in the session.

It's not encouraged, that's all. You can still have threads in your workers. However, I am planning to add a thread pool feature to bdrb itself, which should simplify things a bit.
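For reference, the thread-per-batch scheme described in the question (200 transactions, 20 threads, 10 transactions per thread in sequence) could look roughly like the sketch below. This is plain Ruby with no bdrb involved; BatchRunner and its methods are hypothetical names, and the doubling block merely stands in for a real payment call.

```ruby
# Hypothetical helper: one thread per batch of transactions.
class BatchRunner
  attr_reader :completed

  def initialize(per_thread: 10)
    @per_thread = per_thread
    @completed = 0
    @lock = Mutex.new
  end

  # Split transactions into batches and handle each batch sequentially
  # in its own thread. Results come back in the original order.
  def run(transactions, &handler)
    threads = transactions.each_slice(@per_thread).map do |batch|
      Thread.new do
        batch.map do |txn|
          result = handler.call(txn)
          # The shared counter is touched by many threads, so guard it.
          @lock.synchronize { @completed += 1 }
          result
        end
      end
    end
    # Thread#value joins the thread and returns its block's result.
    threads.flat_map(&:value)
  end
end

runner = BatchRunner.new(per_thread: 10)
results = runner.run((1..200).to_a) { |txn| txn * 2 }
puts "#{runner.completed} of #{results.size} transactions processed"
```

The completed counter plays the role of the periodically updated status record; in a real worker it would be written wherever the Ajax request can read it.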

Also, ideally, when using event-driven network programming, you want all your sockets within the select loop for efficiency. So you wouldn't need any damn threads if you can use an HTTP handler that works in an evented manner. What I mean to say is, you don't do this:

a = Net::HTTP.get(URI("http://www.google.com"))

but you do,

BackgrounDRb::HTTP.open("http://www.google.com") do |data|
  process_data(data)
end

What I am trying to illustrate is that when you ask to open the google.com page, the evented model allows you to attach a callback (the block in this case), which will be called when data arrives from google.com, rather than waiting for it in a thread. So BackgrounDRb::HTTP.open() returns immediately. And you are concurrent as hell.

But this is not possible in your case, because if you are charging cards, then you are probably using ActiveMerchant, which uses Net::HTTP and blocks when you make a request. But trust me, writing a simple HTTP client is not that difficult; there is already connect() available in all workers.
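The "attach a callback and return immediately" idea described above can be sketched with nothing but IO.select from the Ruby standard library. TinyReactor and watch are hypothetical names invented for this illustration; this is not packet's or BackgrounDRb's actual API, and pipes stand in for network sockets.

```ruby
# Hypothetical, minimal reactor: one select loop, one callback per IO.
class TinyReactor
  def initialize
    @callbacks = {}
  end

  # Register a callback for an IO object and return immediately.
  def watch(io, &callback)
    @callbacks[io] = callback
    nil
  end

  # Dispatch data to callbacks as it arrives; stop once every IO hits EOF.
  def run
    until @callbacks.empty?
      ready, = IO.select(@callbacks.keys)
      ready.each do |io|
        begin
          @callbacks[io].call(io.read_nonblock(4096))
        rescue IO::WaitReadable
          next # spurious wakeup, try again on the next pass
        rescue EOFError
          @callbacks.delete(io)
          io.close
        end
      end
    end
  end
end

reactor = TinyReactor.new
received = []
reader, writer = IO.pipe
reactor.watch(reader) { |data| received << data } # returns right away
writer.write("hello")
writer.close
reactor.run
puts received.join
```

Every watched IO shares the same loop, so adding a second or a hundredth connection costs another hash entry rather than another thread.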

How does this work with fastcgi or multiple mongrel-based engines, where the next request is not guaranteed to hit the same process? We are using custom database tables and code for sharing the status information now, but I was wondering whether the plumbing includes something to address this.

That's no problem at all. BackgrounDRb is a TCP server, so if you have followed the README file, then no matter which machine you make the request from, if you specify worker X it's guaranteed to hit the same worker (with an optional job_key if you are starting your worker dynamically).

At one point with the old version it was fairly straightforward to
test workers, but that later broke. Could you give any
pointers on writing tests for workers in the new version?

Hi Brandon,

Update your bdrb copy from svn and run rake backgroundrb:setup, and you should have a RAILS_ROOT/test/bdrb_test_helper.rb file.

Now all your worker test cases can go in the RAILS_ROOT/test/unit directory. Just make sure that you require the bdrb_test_helper file, and you can write test cases.

For example:

require File.join(File.dirname(__FILE__) + "/../bdrb_test_helper")
require "god_worker"

context "When god worker starts" do
  setup do
    god_worker = GodWorker.new
  end
end

I hope this helps.

Hemant, this looks great. Could one use BackgrounDRb to have workers interact programmatically with a remote telnet service? Or would I simply start a worker that does this interaction via a shell/spawn/telnet/expect...

Great doco too, thanks.

George

Sure as hell... with any TCP service actually, in an evented manner. However, that area is not polished (no one ever asked. :) )

@Hemant :

Has this been tested on Windows? If so, are there known issues? Previous versions did not work on Windows, although the original version did.

Hi

We are actually one of the ActiveMerchant providers (E-Xact), so strictly we are talking about what is actually behind ActiveMerchant. There are many protocols involved in financial networks, depending on where the transaction is routed. We are very familiar with the Reactor engines and patterns you are advocating, and they work great, especially in uniform scenarios without throttling, sequencing, etc. In our case, I don't see a clear gain, I'm afraid, while a thread pool was done in no time and is dead simple to maintain, test, etc.

Cool. You can use your existing approach, provided you handle your threads with enough care. I will get back to this in some time. There are other ways too that I am looking at, for example co-routines (on top of fibers) from Ruby 1.9. Just watch the bdrb mailing list, or submit a patch, as I guess you guys are already running a somewhat customized version of bdrb.

Our Rails cluster runs bdrb on each Rails server and uses domain sockets. This is to avoid a single point of failure and to have a uniform architecture. Would that work too? That is, does bdrb now work sort of like memcache, where each server knows of every other instance? But even with that in place, in fastcgi for example, the fastcgi processor may recycle the Rails process where the callback has been registered.

Hmm, this is cool. So, how did you handle this situation earlier? Probably what you can do is have bdrb instances running on each cluster, with a cluster-specific backgroundrb configuration file, so that requests from mongrels running on cluster1 are served only by the bdrb running on cluster1, and update some db/memcache key to indicate it, so that even if the next request goes to another worker on another machine, you know the state. Again, I would love any patches or ideas from you, and I am myself working on something like this, which would avoid logging to the db and stuff.

No, it won't work on Windows. Even though I removed "slave", we still need Unix domain sockets for internal communication, which are not available on Windows.

@Hemant:

Thank you. That’s what I suspected.

Hi,

Looking forward to a chance to use this library. Thanks for the work!

Hi

How does this affect the licensing of BackgrounDRb (not to mention the
name of the project :-)? The packet library is GPLv2 (the url doesn't
have the leading www by the way), while BackgrounDRb is dual licensed
with the Ruby License or an MIT license.

Damn, I realized it after posting the message. But then I thought "packet" may be irrelevant anyway (to Rails guys, I mean).

Regarding license issues: since packet is dual licensed under GPL2 and Ruby, you can take shit from packet, embed it in your app, and forget that it's under GPL2, since the Ruby license allows you to do that. There is a clause in the Ruby license that says:

"place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or by allowing the author to include your modifications in the software."

So, I guess it's OK to have that.

Sorry about the wrong link; the correct one is: http://packet.googlecode.com

Well, I think one of the strengths of packet is that it lets you write workers that are tightly integrated with the master process. This way, you can offload blocking tasks to these workers, which run in parallel, and keep processing further requests in the master.

And it's pure Ruby.