new queueing API

Hi guys, I wanted to discuss the new Queueing API for those of us who
are implementing an out-of-process version. In my case, I write
Sidekiq [1] and would like to support the new API once Rails 4 is
released. My issue is that because the API is object-oriented rather
than message-oriented, implementation of out-of-process workers is
difficult.

The API is Queue#push(job) where job has a run method. Ruby doesn't
have a great solution for serializing a Ruby object across the wire.
Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et
al), JSON can't fully serialize Ruby objects (e.g. symbols) and YAML
has a number of issues in practice that make it painful to use (e.g.
see the monkeypatches DelayedJob has to use [2]).
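To make the trade-off concrete, here is a minimal sketch of the round-trip behaviour. WelcomeEmailJob is a hypothetical job class invented for this illustration; only Marshal and the json standard library are real.

```ruby
require "json"

# Hypothetical job class, used only for illustration.
class WelcomeEmailJob
  attr_reader :user_id, :options

  def initialize(user_id, options)
    @user_id = user_id
    @options = options
  end

  def run
    "welcome email for user #{user_id} (#{options.inspect})"
  end
end

job = WelcomeEmailJob.new(42, { priority: :high })

# Marshal round-trips the whole object, symbols included, but the
# resulting bytes only mean something to another Ruby process.
marshal_copy = Marshal.load(Marshal.dump(job))

# JSON is portable across languages, but symbol keys and symbol values
# come back as plain strings, so the round trip is lossy.
json_copy = JSON.parse(JSON.generate(job.options))
```

Here marshal_copy.options still holds symbols, while json_copy holds only strings.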

So I love the simplicity of the API but think it will lead to painful
implementation issues. What do you think about defining a simpler
message format that can be fully serialized and deserialized via JSON
/ YAML / etc instead of using a Ruby object?

mike

[1] http://mperham.github.com/sidekiq/
[2] https://github.com/collectiveidea/delayed_job/blob/master/lib/delayed/psych_ext.rb

I would agree. I would prefer a message-oriented API to a Ruby-object-oriented API.

A Ruby object can of course be serialized to a message and deserialized from a message. The key point is that messages, rather than objects, travel over the wire and sit in the database, and that we layer a convenient object-message mapper on top, much like an object-relational mapper or an object-document mapper. It is also much like the way we layer a Hash or an ActionDispatch::Request on top of the HTTP request, and a 3-item Array or an ActionDispatch::Response on top of the HTTP response.

The nature of applications with message queues is that they’re more likely to need to interoperate with applications in other languages. Any serialized message should be interoperable across languages, so that rules out Marshal and YAML with embedded Ruby class names, but YAML without embedded Ruby class names would be fine, as would JSON, XML, MessagePack, ProtocolBuffers, etc. The Rails API should not dictate the serialization format, but should permit any set of formats chosen by the application, as well as opaque binary strings.

Note that this has been a problem with Rails session cookies as well - they are not interoperable across applications written in different languages, because they are serialized from Ruby objects into bytes via Marshal.
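The object-message mapper idea described above could look roughly like the following sketch. JobMessage, ResizeImageJob and to_args are hypothetical names chosen for this example, not part of Rails or any queueing library. JSON is used as the wire format only because it ships with Ruby; any format the application picks would slot in the same way.

```ruby
require "json"

# Hypothetical mapper: turns a job into a plain message (class name plus
# primitive arguments) and back again.
module JobMessage
  def self.dump(job)
    { "class" => job.class.name, "args" => job.to_args }
  end

  def self.load(message)
    # A real system would check the class name against an allow-list.
    Object.const_get(message.fetch("class")).new(*message.fetch("args"))
  end
end

# Hypothetical job that exposes its state as primitive arguments.
class ResizeImageJob
  attr_reader :image_id, :width

  def initialize(image_id, width)
    @image_id = image_id
    @width = width
  end

  def to_args
    [image_id, width]
  end

  def run
    "resizing image #{image_id} to #{width}px"
  end
end

# The message is what crosses the wire; the object is rebuilt on the far side.
wire = JSON.generate(JobMessage.dump(ResizeImageJob.new(7, 640)))
job = JobMessage.load(JSON.parse(wire))
result = job.run
```

Because the message contains only a class name and primitive values, a consumer written in another language could read it and dispatch on the "class" field.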

The API is Queue#push(job) where job has a run method. Ruby doesn't
have a great solution for serializing a Ruby object across the wire.
Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et
al), JSON can't fully serialize Ruby objects (e.g. symbols) and YAML
has a number of issues in practice that make it painful to use (e.g.
see the monkeypatches DelayedJob has to use [2]).

I don't understand what you mean by this. Marshal returns a string. If
your producers and consumers are both Ruby, why would this rule out
anything? Are you saying you want to mix languages between producers
and consumers?

So I love the simplicity of the API but think it will lead to painful
implementation issues. What do you think about defining a simpler
message format that can be fully serialized and deserialized via JSON
/ YAML / etc instead of using a Ruby object?

I don't think the serialization format is something that Rails should
define. It's an implementation detail of the queuing system, and is
something a user must understand when using a queue.

For example:

  * an in-memory queue has the advantage that it doesn't require
    serialization, but it is volatile. Maybe that's all the programmer needs.

  * a DRb based queue can be distributed and uses marshal, so the user
    doesn't need to understand serialization as much.

  * An object that has references to an IO object must take precautions
    when being serialized.
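A small sketch of the first and third points, assuming a hypothetical LogLineJob that holds a reference to an IO object. The in-memory queue is Ruby's built-in Queue; the Marshal.dump call stands in for what a DRb or other out-of-process queue would need to do.

```ruby
# Hypothetical job holding a reference to an IO object.
class LogLineJob
  def initialize(io, line)
    @io = io
    @line = line
  end

  def run
    @io.puts(@line)
  end
end

job = LogLineJob.new($stdout, "hello")

# In-memory: no serialization happens, so the IO reference is harmless,
# but the job is lost if the process dies.
in_memory = Queue.new
in_memory.push(job)

# Out-of-process: the job has to be serialized, and IO objects cannot be.
begin
  Marshal.dump(job)
  marshal_error = nil
rescue TypeError => e
  marshal_error = e
end
```

The same job object is acceptable to one queue and rejected by the other, which is why the choice of queue is something the user has to understand.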

The other problem is that the JSON format that Sidekiq requires may
differ from the JSON format that some other queuing system requires.

tl;dr every queue has unique aspects regarding transactions, wire
protocol, and storage facility. Users should take this into account
when selecting a queue for their application's requirements.

Rails should enforce the best practices required to keep a user from making a mistake that would be hard to recover from as their application scales. An In-Memory queue is good, but it’s trivial to build an implementation that can’t be offloaded to another process. So I would like to see this API enforce that all queued objects be serializable/marshal-able at a minimum. This is a best practice and users who don’t like it can trivially build their own simple queueing solution.

Regarding serializing to non-ruby protocols, I agree this is best left as an implementation detail.

Chris

I agree that Rails shouldn’t force a specific serialization scheme, and I definitely think that it should be message-based. So many of the problems people already encounter with queues stem from poor design in this regard, so it’d be nice if Rails pushed the design standard forward.

Rails should enforce the best practices required to keep a user from making
a mistake that would be hard to recover from as their application scales.
An In-Memory queue is good, but it's trivial to build an implementation
that can't be offloaded to another process. So I would like to see this API
*enforce* that all queued objects be serializable/marshal-able at a
minimum. This is a best practice and users who don't like it can trivially
build their own simple queueing solution.

I'm OK with making the default Queue enforce that an object is
marshalable. It seems like a good constraint. If they really want an
in-memory queue, then they can just switch to ::Queue.
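As a rough sketch of what such a constraint might look like: the wrapper below dumps each job on push purely to verify that it can be marshaled, then stores it in a plain ::Queue. MarshalCheckedQueue, UnserializableJob and PrintJob are hypothetical names for this illustration and are not the actual Rails implementation.

```ruby
# Hypothetical wrapper; the class name and error type are illustrative only.
class MarshalCheckedQueue
  class UnserializableJob < ArgumentError; end

  def initialize
    @queue = ::Queue.new
  end

  def push(job)
    Marshal.dump(job) # bytes are discarded; this only verifies dumpability
    @queue.push(job)
  rescue TypeError => e
    raise UnserializableJob, "#{job.class} cannot be marshaled: #{e.message}"
  end

  def pop
    @queue.pop
  end

  def size
    @queue.size
  end
end

# Hypothetical job; a Struct keeps the example short.
PrintJob = Struct.new(:text) do
  def run
    text.upcase
  end
end

queue = MarshalCheckedQueue.new
queue.push(PrintJob.new("hello")) # accepted

begin
  queue.push(PrintJob.new(-> { "hi" })) # a lambda cannot be marshaled
rescue MarshalCheckedQueue::UnserializableJob => e
  rejected = e
end
```

Code that works against this queue in development should keep working when the queue is swapped for an out-of-process one with a Ruby consumer, which is the point of the constraint.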

Regarding serializing to non-ruby protocols, I agree this is best left as
an implementation detail.

:)

Yehuda Katz

The API is Queue#push(job) where job has a run method. Ruby doesn’t
have a great solution for serializing a Ruby object across the wire.
Marshal limits the API to Ruby solutions (which rules out RabbitMQ, et
al),

Not particularly? It just requires a Ruby consumer on the other end, which seems like an acceptable constraint. Queues with special serialization requirements (because they cannot use a Ruby consumer) can add additional constraints on serialization, and communicate those constraints to the users of their API.
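A minimal sketch of that arrangement, with an Array standing in for the broker. ReportJob is a hypothetical job; the only real API used is Marshal. The broker sees nothing but an opaque byte string, and the constraint falls entirely on the consumer.

```ruby
# Hypothetical job; a Struct keeps the example short.
ReportJob = Struct.new(:account_id) do
  def run
    "report for account #{account_id}"
  end
end

# Stand-in for a broker such as RabbitMQ: it only ever sees byte strings.
broker = []

# Producer side: the job becomes an opaque binary String.
payload = Marshal.dump(ReportJob.new(12))
broker.push(payload)

# Consumer side: must be Ruby and must have ReportJob loaded.
# Marshal.load should only be used on data from a trusted producer.
job = Marshal.load(broker.shift)
result = job.run
```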