Patch submitted (#3446)

I've submitted a patch against both 2-3-stable and master that better deals with determining the tables referenced in a SQL string.

Before, periods in literals would confuse this parsing, leading the extraneous and/or non-existent tables being returned by tables_in_string.

This patch takes the approach of replacing all periods inside literals with underscores before scanning the SQL string.

Test cases are also present.

Please take a look.

Thanks,

Andrew

It doesn't appear to handle literals with embedded, escaped quotes (e.g. 'a\'.b') correctly.

Jeremy

Jeremy,

Nice catch. I'll update the regex for this case.

Thanks,

Andrwe

Here's the updated regex:

    def literals_without_dots(string)       string.gsub(/(['"])(([^\\]|(\\.))*?)(\1)/) {|s| s.gsub('.', '_')}     end

Basically, inside a literal, swallow any non-backslash character, or a backslash followed by anything, until running into the quote that closes the literal.

Does that look better Jeremy?

Thanks,

Andrew

It does. However, on PostgreSQL with the standard_conforming_strings = on setting, or other databases that implement strings in accordance with the SQL standard, it generates wrong results, since the standard doesn't allow for escaping with backslashes. For example, your new code will wrongly escape the following:

'a\' || a.b || ''

Here the \ in the first string doesn't escape the closing quote. So your code will incorrectly change a.b to a_b.

In short, there isn't a simple way to parse SQL correctly with regular expressions, as correct parsing depends not only on the database type used, but also the specific database's configuration. This is the reason Sequel never attempts to parse SQL.

Jeremy

Hmm..

I still think this patch is worth committing.

The current method is clearly wrong in many cases. The patch proposed will fix MySQL and SQLLite 3, as well as the large majority of PostgreSQL cases.

There will still be cases where this method incorrect, but they will be fewer.

Given that ActiveRecord is trying to parse the SQL, let's try and get it as close to correct as possible. (ActiveRecord may want to consider moving this method to the database adapters to provide database specific handling)

It does. However, on PostgreSQL with the standard_conforming_strings = on setting, or other databases that implement strings in accordance with the SQL standard, it generates wrong results, since the standard doesn't allow for escaping with backslashes. For example, your new code will wrongly escape the following:

'a\' || a.b || ''

Here the \ in the first string doesn't escape the closing quote. So your code will incorrectly change a.b to a_b.

In short, there isn't a simple way to parse SQL correctly with regular expressions, as correct parsing depends not only on the database type used, but also the specific database's configuration. This is the reason Sequel never attempts to parse SQL.

Yeah, frankly this was only ever added to ease the migration to the preloading code. Parsing sql or even guessing is a fools errand and it's a never-ending tar-pit of special cases. Ideally by 3.1 we remove this code and just error out if someone does Model.include(:children).where("LOWER(children.name) = ?", something). For those cases you can use joins() or somehow explicitly opt in. If we had a do-over for 3.0, I'd go back and remove it then :slight_smile:

Andrew, instead of extending our mad regexp, maybe the best way is just to use .preload(:bars) not .include(:bars) for your case?

Koz,

Great call on using the .preload(:bars) method.

This reason this came up is via a Rails 2.3 app which does not have the .preload method via ARel. In our case, the wrong result from tables_in_string was causing a complicated query to fail.

In the Rails 2.3 branch, the only way to fix this is in the tables_in_strings method. So I was really patching that and then moving the patch forward to the master.

I'll just monkey patch our app to redefine the tables_in_string method

I implemented a :preload option to find for 2.3: https://rails.lighthouseapp.com/projects/8994/tickets/3683-add-preload-option-to-find

The patch has never been merged, but it still cleanly applies to the 2.3.9 and the forthcoming 2.3.10, so if you freeze Rails into your project it's easy to apply it yourself and get preloading. We use this extensively in one of our apps that has similar problems to yours with the table-guessing, together with the inability of the join method to cope with STI.