named finders and possible security issues

Hi,

while at a security review we noticed the following design/ security flaw in rails/ activerecord's named finders.

Named finders (ex. find_by_confirmation_token) are used when no or exactly one result is desired. Consider the following code:

account = Account.find_by_confirmation_token(params[:secret_token])

Confirmed accounts have their confirmation_token set to nil. Unconfirmed accounts have some digest there. Now when the controller action is called without secret_token set in the params hash a random confirmed account is returned. This is because there's no check if the underlying query returns more than one row. Of course this is not desired.

To help finding those subtile bugs/ design flaws, I'd suggest to remove the "LIMIT 1" from the generated sql by the named finders and instead raise an exception if the query returns more than one row. Suggestion: "named finders expect the sql to return at most one result, got xx".

This change should not have any backwards compatibility issues because a named finder with a query which selects more than one row is never desired and a sign for bad design.

What do you think?

Corin

I agree. The optimistic LIMIT 1 approach to this sort of thing has always been a concern for me (and is the reason I rarely use those dynamic finders).

Model.find_by_something really means "find 1 random entry that matches"

What I do is to introduce a "single" method that raises an error if more than 1 entry is present in an array or relation.

=> Model.find_all_by_something.single (will return nil, 1 instance of Model, or raise an exception)

A nicer implementation, analog to find_all_by could be:

Model.find_single_by_something #=> or returns nil #=> or returns 1 Model instance #=> or raises an Exception ("FoundMultiple")

Regarding implementation and performance, LIMIT 2 is enough to know that the result is not "single" and still limiting the performance hit.

HTH,

Peter

This change should not have any backwards compatibility issues because a named finder with a query which selects more than one row is never desired and a sign for bad design.

What do you think?

It doesn't have backwards compatibility in terms of the values returned to a user, however it would have significantly difference performance characteristics. Removing the limit will make the database examine multiple rows, lots of them in your case, which won't be ideal.

The dynamic finders are currently a small piece of syntactic sugar for 'where x = ? limit 1', I'm not sure that changing them to do something more complete really gains us much.

An easier fix for your case would be a unique index, and a model method which raises ArgmentError when passed null.

Replacing the LIMIT 1 with LIMIT 2 is enough to know that the result is not "single" and I assume that change (from LIMIT 1 to LIMIT 2) would not cause a significant performance hit on the database performance.

Regarding backwards compatibility, I see use cases in our project where Model.find_by_something is actually used to find "1 random entry" from the matching rows (similar to Model.first) and that is the expected behavior there. So, adding the "single" condition on the current find_by_something method would break that functionality. Therefore I propose "find_single_by_something" as a new method to resolve this issue.

HTH,

Peter

It doesn't have backwards compatibility in terms of the values returned to a user, however it would have significantly difference

Currently the values returned are already non-deterministic except if you have a default scope with ordering. This is because most (all?) databases don't guarantee a specific result ordering without an order clause. If backwards compatibilty is an issue here, we could just emit a warning instead of raising. Start raising in rails 3.1.

performance characteristics. Removing the limit will make the database examine multiple rows, lots of them in your case, which won't be ideal.

You are right - but this would only happen once, because we'd then get an exception and fix the bug. Otherwise the bug might never be noticed.

Adding a limit 1 clause does not guarantee good performance. Just take a "select * from users order by random() limit 1". This can be *very slow* if users has many rows because the whole set gets ordered before only a single result is returned.

If you are really concerned about performance (it should never be a problem once there are no bugs/ bad designs left), just replace the LIMIT 1 with a LIMIT 2. As soon as we get > 1 rows we can still raise.

The dynamic finders are currently a small piece of syntactic sugar for 'where x = ? limit 1', I'm not sure that changing them to do something more complete really gains us much.

As explained in my first post I think there are huge gains in terms making users aware of design flaws in their code and so preventing them from potential security issues in their apps.

An easier fix for your case would be a unique index, and a model method which raises ArgmentError when passed null.

Sure, but this will only help in this particular case and not reveal other cases which might still present somewhere.

I'd be happy to provide a patch. I expect it to be only 2-3 lines of code.

Corin

performance characteristics. Removing the limit will make the database examine multiple rows, lots of them in your case, which won't be ideal.

You are right - but this would only happen once, because we'd then get an exception and fix the bug. Otherwise the bug might never be noticed.

Adding a limit 1 clause does not guarantee good performance. Just take a "select * from users order by random() limit 1". This can be *very slow* if users has many rows because the whole set gets ordered before only a single result is returned.

If you are really concerned about performance (it should never be a problem once there are no bugs/ bad designs left), just replace the LIMIT 1 with a LIMIT 2. As soon as we get > 1 rows we can still raise.

You seem to misunderstand my objection. Running the query as you suggest is definitely what your application should do, and anyone else's application which cares about which particular matching value they get back. However to change *already running* applications to suddenly issue a pretty drastically different query isn't backwards compatible by any reasonable definition.

The dynamic finders are currently a small piece of syntactic sugar for 'where x = ? limit 1', I'm not sure that changing them to do something more complete really gains us much.

As explained in my first post I think there are huge gains in terms making users aware of design flaws in their code and so preventing them from potential security issues in their apps.

You're projecting your own use case onto all users of all dynamic finders.

@blog.posts.random.find_by_category("cool") # where random is order('rand()') or similar

That's a perfectly valid use and would no longer be supported by what you're proposing.

Ok, I didn't think of such a use case. I suggest to make people use one of the following code snippets for this use case in rails 3.1:

@blog.posts.random.find_all_by_category("cool").first

or

@blog.posts.random.find_first_by_category("cool")

This would make much clearer what happens behind the scenes. The "normal" dynamic finders should then work as described before.

Corin

account = Account.find_by_confirmation_token(params[:secret_token])

Confirmed accounts have their confirmation_token set to nil. Unconfirmed accounts have some digest there. Now when the controller action is called without secret_token set in the params hash a random confirmed account is returned. This is because there's no check if the underlying query returns more than one row. Of course this is not desired.

The problem here is not so much with #find_by_whatever, but with how you're using it. It is completely legimitate that a finder returns only the first of possibly many matching objects. If you want a specific object, you have to impose a narrower scope or an ordering.

The moral: Don't pass unchecked params to methods.

To help finding those subtile bugs/ design flaws, I'd suggest to remove the "LIMIT 1" from the generated sql by the named finders and instead raise an exception if the query returns more than one row. Suggestion: "named finders expect the sql to return at most one result, got xx".

This would break code like

  Klass.order_by_foo.find_by_bar(...)

I'd suggest that dynamic finders raise an exception when they are passed nil, but I'm pretty sure that this would break existing code, too.

Michael