My question MIGHT be "ActiveRecord play nice with multiple views into a
table?
I ask because my application has a single DateDimension table (heavily
ornamented date objects). Each SalesFact has TWO foreign keys into the
DateDimension table: one for the date a house went on the market, one
for when it was sold [*].
My first model+schema is shown below, but I'm stymied how to construct a
query that joins the SalesFact table with the Dimension tables. More
specifically, how do you craft a :select that includes on_market_date
and a sale_date, since they're both foreign keys into the same
date_dimension table? Extra points if you can do it all with
associations and not drop down into SQL.
If that's not possible, I'm pretty sure I can create multiple views of
DateDimension to eliminate the ambiguity. But if there's a more RoR'ish
way to do this, I'm all ears.
# ===== the models...
class SalesFact < ActiveRecord::Base
belongs_to :address_dimension
belongs_to :on_market_date_dimension
belongs_to :sale_date_dimension
end
class AddressDimension < ActiveRecord::Base
has_many :sales_facts
end
class DateDimension < ActiveRecord::Base
end
class OnMarketDateDimension < DateDimension
has_many :sales_facts
end
class SaleDateDimension < DateDimension
has_many :sales_facts
end
# ===== and the schema...
create_table "sales_facts", :id => false, :force => true do |t|
t.integer "address_dimension_id"
t.integer "on_market_date_dimension_id"
t.integer "sale_date_dimension_id"
t.float "asking_price"
t.float "sale_price"
end
create_table "address_dimensions", :force => true do |t|
t.string "house_number"
t.string "street_name"
... lots and lots of other fields
t.float "latitude"
t.float "longitude"
t.string "postal_code"
end
create_table "date_dimension", :force => true do |t|
t.datetime "datetime"
t.boolean "is_weekend"
t.boolean "is_holiday"
... lots and lots of other fields
t.string "day_name"
t.string "month_name"
t.integer "quarter"
end
# =====
[*] NOTE: Many people say "why don't you just put the two dates directly
into the SalesFact table?" Rather than tilt at that windmill, I'll
simply offer two pointers on the topic:
Data Warehousing (excellent
intro)
The Data Warehouse Toolkit, 3rd Edition - Kimball Group (from the man
himself)
Well first of all are you 100% sure you need OnMarketDateDimension and
SaleDateDimension classes? If you don't have any custom logic in them
you should probably remove them and change associations a bit:
class SalesFact < ActiveRecord::Base
...
belongs_to :on_market_date_dimension, :class_name =>
'DateDimension', :foreign_key => 'on_market_date_dimension_id'
belongs_to :sale_date_dimension, :class_name =>
'DateDimension', :foreign_key => 'sale_date_dimension_id'
end
And I see no reason why you can't just do: SalesFact.first :include =>
[:on_market_date_dimension, :sale_date_dimension]. Or did I
misinterpret what you need?
Well first of all are you 100% sure you need OnMarketDateDimension and
SaleDateDimension classes? If you don't have any custom logic in them
you should probably remove them and change associations a bit:
class SalesFact < ActiveRecord::Base
...
belongs_to :on_market_date_dimension, :class_name =>
'DateDimension', :foreign_key => 'on_market_date_dimension_id'
belongs_to :sale_date_dimension, :class_name =>
'DateDimension', :foreign_key => 'sale_date_dimension_id'
end
And I see no reason why you can't just do: SalesFact.first :include =>
[:on_market_date_dimension, :sale_date_dimension]. Or did I
misinterpret what you need?
Andrius: I your solution is very close. I agree that it does not make
sense to create OnMarketDateDimension and SalesDateDimension classes (or
sql views onto a table, as I was suggesting). I've knocked together a
project implementing your approach, but I haven't grok'ed how to make a
query -- maybe you can help.
I get an un-embellished SalesFact, which doesn't do me much good -- I'd
like join with the two dates to which it refers. If I try the same
thing with a :select:
I get an SQL error because Rails is generating a query based on
"sales_fact.id", which we intentionally suppressed in the schema (:id =>
false).
So my question: Given a SalesFact (or a group of SalesFacts), what's the
AR.find() syntax for a query that does a join against
:on_market_date_dimension and :sale_date_dimension?
(i.e. a three-way join on the three foreign keys of a SalesFact record,
and only selecting specific columns in the result).
Here's the SQL it generates, which fails because there is no
on_market_date_dimensions table:
SELECT sales_facts.asking_price, on_market_date_dimensions.datetime,
sale_date_dimensions.datetime, address_dimensions.street_name
FROM `sales_facts`
INNER JOIN `date_dimensions`
ON `date_dimensions`.id = `sales_facts`.on_market_date_dimension_id
INNER JOIN `date_dimensions` sale_date_dimensions_sales_facts
ON `sale_date_dimensions_sales_facts`.id =
`sales_facts`.sale_date_dimension_id
INNER JOIN `address_dimensions`
ON `address_dimensions`.id = `sales_facts`.address_dimension_id
LIMIT 1
I thought the point of @Andrius's approach was to give the model enough
info so that it would refer to the date_dimension table and not the
(non-existent) on_market_date_dimension table.
Just do SalesFact.first(:include =>
[:on_market_date_dimension, :sale_date_dimension]). Then you can
access dates with sales_fact.on_market_date_dimension and
sales_fact.sale_date_dimension.
Just do SalesFact.first(:include =>
[:on_market_date_dimension, :sale_date_dimension]). Then you can
access dates with sales_fact.on_market_date_dimension and
sales_fact.sale_date_dimension.
Ah - yes, that would work! But am I wrong in thinking that pushes all
the logic out of the DB and into Rails? For example, consider the
following query that shows only sales that closed in Q1:
SELECT sale.asking_price, sale.sale_price, listed.datetime AS
listed_date, sold.datetime AS sold_date
FROM sales_facts sale
JOIN date_dimensions listed ON listed.id =
sale.on_market_date_dimension_id
JOIN date_dimensions sold ON sold.id = sale.sale_date_dimension_id AND
sold.quarter = "Q1";
If I do an :include, I'm not sure how to keep the sold.quarter = "Q1"
within SQL. I'm probably missing something...