I've submitted a patch against both 2-3-stable and master that better
deals with determining the tables referenced in a SQL string.
Before, periods in literals would confuse this parsing, leading the
extraneous and/or non-existent tables being returned by
tables_in_string.
This patch takes the approach of replacing all periods inside literals
with underscores before scanning the SQL string.
Test cases are also present.
Please take a look.
Thanks,
Andrew
It doesn't appear to handle literals with embedded, escaped quotes
(e.g. 'a\'.b') correctly.
Jeremy
Jeremy,
Nice catch. I'll update the regex for this case.
Thanks,
Andrwe
Here's the updated regex:
def literals_without_dots(string)
string.gsub(/(['"])(([^\\]|(\\.))*?)(\1)/) {|s| s.gsub('.',
'_')}
end
Basically, inside a literal, swallow any non-backslash character, or a
backslash followed by anything, until running into the quote that
closes the literal.
Does that look better Jeremy?
Thanks,
Andrew
It does. However, on PostgreSQL with the standard_conforming_strings
= on setting, or other databases that implement strings in accordance
with the SQL standard, it generates wrong results, since the standard
doesn't allow for escaping with backslashes. For example, your new
code will wrongly escape the following:
'a\' || a.b || ''
Here the \ in the first string doesn't escape the closing quote. So
your code will incorrectly change a.b to a_b.
In short, there isn't a simple way to parse SQL correctly with regular
expressions, as correct parsing depends not only on the database type
used, but also the specific database's configuration. This is the
reason Sequel never attempts to parse SQL.
Jeremy
Hmm..
I still think this patch is worth committing.
The current method is clearly wrong in many cases. The patch proposed
will fix MySQL and SQLLite 3, as well as the large majority of
PostgreSQL cases.
There will still be cases where this method incorrect, but they will
be fewer.
Given that ActiveRecord is trying to parse the SQL, let's try and get
it as close to correct as possible. (ActiveRecord may want to consider
moving this method to the database adapters to provide database
specific handling)
It does. However, on PostgreSQL with the standard_conforming_strings
= on setting, or other databases that implement strings in accordance
with the SQL standard, it generates wrong results, since the standard
doesn't allow for escaping with backslashes. For example, your new
code will wrongly escape the following:
'a\' || a.b || ''
Here the \ in the first string doesn't escape the closing quote. So
your code will incorrectly change a.b to a_b.
In short, there isn't a simple way to parse SQL correctly with regular
expressions, as correct parsing depends not only on the database type
used, but also the specific database's configuration. This is the
reason Sequel never attempts to parse SQL.
Yeah, frankly this was only ever added to ease the migration to the
preloading code. Parsing sql or even guessing is a fools errand and
it's a never-ending tar-pit of special cases. Ideally by 3.1 we
remove this code and just error out if someone does
Model.include(:children).where("LOWER(children.name) = ?", something).
For those cases you can use joins() or somehow explicitly opt in. If
we had a do-over for 3.0, I'd go back and remove it then
Andrew, instead of extending our mad regexp, maybe the best way is
just to use .preload(:bars) not .include(:bars) for your case?
Koz,
Great call on using the .preload(:bars) method.
This reason this came up is via a Rails 2.3 app which does not have
the .preload method via ARel. In our case, the wrong result from
tables_in_string was causing a complicated query to fail.
In the Rails 2.3 branch, the only way to fix this is in the
tables_in_strings method. So I was really patching that and then
moving the patch forward to the master.
I'll just monkey patch our app to redefine the tables_in_string method
I implemented a :preload option to find for 2.3:
https://rails.lighthouseapp.com/projects/8994/tickets/3683-add-preload-option-to-find
The patch has never been merged, but it still cleanly applies to the
2.3.9 and the forthcoming 2.3.10, so if you freeze Rails into your
project it's easy to apply it yourself and get preloading. We use
this extensively in one of our apps that has similar problems to yours
with the table-guessing, together with the inability of the join
method to cope with STI.