Paul A. Jungwirth
PDX.rb
May 2015
Interesting things you can do in SQL.
Follow along:
restaurants --< inspections --< violations ----------- ----------- ---------- id id id name restaurant_id inspection_id score name inspected_at
r.name i.score i.inspected_at v.name ------ ------- -------------- ------ Bob's Diner 77 2015-01-07 Rats Bob's Diner 71 2015-01-15 Rats Bob's Diner 71 2015-01-15 Zombies Bob's Diner 75 2015-03-15 Bats Joe's Place Crystal Palace 100 2015-01-07
Courtesy of Robb Schecter at eaternet.io.
SELECT id, name FROM inspections WHERE score > 80 ORDER BY inspected_at DESC
SELECT date_trunc('month', inspected_at) AS m, MAX(score) AS winner FROM inspections GROUP BY m HAVING MAX(score) > 90 ORDER BY m
@inspections = Inspection.where("score > 80"). order(inspected_at: :desc) @violations = @inspection.violations.where(name: "Rats") puts @inspection.violations.where(name: "Rats").to_sql
Inspection.where(score: 90) Inspection.where("score > ?", 90) Inspection.where("inspections.score > ?", 90) Restaurant.where(name: "This is safe") Restaurant.where("name = ?", "Still safe") Restaurant.where("name = '#{oops}'")
SELECT * FROM restaurants r INNER JOIN inspections i ON i.restaurant_id = r.id
SELECT * FROM restaurants r LEFT OUTER JOIN inspections i ON i.restaurant_id = r.id
JOIN table ON condition
INNER JOIN throws away rows with no match.
LEFT OUTER JOIN constructs all-NULL rows if no match.
Where to start??
FROM
.Equivalent with inner joins:
SELECT * FROM restaurants r INNER JOIN inspections i ON i.restaurant_id = r.id AND i.score > 80
equals:
SELECT * FROM restaurants r INNER JOIN inspections i ON i.restaurant_id = r.id WHERE i.score > 80
Different with outer joins!
SELECT * FROM restaurants r LEFT OUTER JOIN inspections i ON i.restaurant_id = r.id AND i.score > 80
returns more rows than:
SELECT * FROM restaurants r LEFT OUTER JOIN inspections i ON i.restaurant_id = r.id WHERE i.score > 80
ON
fitlers.NULL
rows where necessary.WHERE
filters.SELECT r.name, AVG(i.score) FROM restaurants r LEFT OUTER JOIN inspections i ON i.restaurant_id = r.id LEFT OUTER JOIN violations v ON v.inspection_id = i.id GROUP BY r.id
AVG
will be wrong.joins
gives an inner join:
@good_restaurants = Restaurant.joins(:inspections). where("inspections.score > ?", 80)
- @good_restaurants.each do |r| %li= r.name + ": " + r.inspections.first.score
includes
gives a LEFT OUTER JOIN (in effect):
@restaurants = Restaurant.includes(:inspections) # .where("inspections.score > ?", 80)
might produce:
SELECT * FROM restaurants LEFT OUTER JOIN inspections ON inspections.restaurant_id = restaurants.id WHERE
Sometimes runs a second query.
Either way avoids the "n+1" problem.
class Restaurant def inspections_with_violation_counts inspections. select("inspections.*"). select("COUNT(violations.id) AS violations_count"). joins(<<-EOQ). LEFT OUTER JOIN violations ON violations.inspection_id = inspections.id EOQ group("inspections.id") end end
Inspection.find_by_sql([<<-EOQ, 1.2]) SELECT *, score * ? AS curved_score FROM inspections EOQ
?
params okay but requires []
.:foo
params okay too.connection.select_rows("SELECT a, b FROM bar") # [["1","2"], ["4","5"]] connection.select_all("SELECT a, b FROM bar") # #<ActiveRecord::Result:0x007ff34710a5f8> connection.select_all("SELECT a, b, c FROM bar").to_a # [{"a"=>"1", "b"=>"2"}, {"a"=>"4","b"=>"5"}]
ActiveRecord::Base.connection
outside models and migrations.connection.select_values("SELECT id FROM inspections") # ["1", "4"] connection.select_value("SELECT MIN(id) FROM inspections") # ["1"] connection.select_one("SELECT MIN(id) AS id FROM inspections") # {"id"=>"1"}
connection.select_rows <<-EOQ SELECT id, name FROM inspections WHERE name = '#{connection.quote_string(foo)}' OR id = #{bar.to_i} EOQ
execute
, select_*
, joins
: no ?
paramsquote_string
or to_i
to avoid SQLi.scope
class Inspection scope :passing, where("score > 80") scope :between, -> {|start_day, end_day| where("inspected_at BETWEEN ? AND ?", start_day, end_day) } end
All restaurants with no inspections yet:
SELECT * FROM restaurants r WHERE NOT EXISTS (SELECT 1 FROM inspections i WHERE i.restaurant_id = r.id)
NOT IN
.scope
scope :without_inspection, -> { where(<<-EOQ) NOT EXISTS (SELECT 1 FROM inspections i WHERE i.restaurant_id = restaurants.id) EOQ }
class Inspection scope :perfect, -> { where(score: 100) } end class Restaurant scope :with_a_perfect_score, -> { joins(:inspections).merge(Inspection.perfect) } end
EXISTS
to avoid the join.Count the inspections in each month of 2015:
SELECT EXTRACT('MONTH' FROM inspected_at) m, COUNT(*) FROM inspections WHERE EXTRACT('YEAR' FROM inspected_at) = 2015 GROUP BY m ORDER BY m
SELECT s.m + 1 AS month, COUNT(id) FROM generate_series(0, 11) s(m) LEFT OUTER JOIN inspections ON inspected_at BETWEEN '2015-01-01'::date + (s.m || ' MONTHS')::interval AND '2015-01-01'::date + ((s.m + 1) || ' MONTHS')::interval - INTERVAL '1 DAY' GROUP BY s.m ORDER BY s.m
FROM
: each output row is a month.from
class Restaurant def inspections_per_month(year) Inspection. select("s.m + 1 AS inspection_month"). select("COUNT(inspections.id) AS inspections_count"). from("generate_series(0, 11) s(m)"). joins(<<-EOQ). LEFT OUTER JOIN inspections ON inspections.inspected_at BETWEEN '#{year.to_i}-01-01'::date + (s.m || ' MONTHS')::interval AND '#{year.to_i}-01-01'::date + ((s.m + 1) || ' MONTHS')::interval - INTERVAL '1 DAY' AND inspections.restaurant_id = #{id.to_i} EOQ group("inspection_month"). order("inspection_month") end end
from
to override the normal table.Or RIGHT OUTER JOIN
Possible to say inspections
rather than Inspection
?
CTE: Common Table Expression
WITH t(month) AS ( SELECT '2015-01-01'::date + (s.m || ' MONTHS')::interval FROM generate_series(0, 11) s(m) ) SELECT t.month, COUNT(id) FROM t LEFT OUTER JOIN inspections ON inspected_at BETWEEN month AND month + INTERVAL '1 MONTH' - INTERVAL '1 DAY' GROUP BY month ORDER BY month
INSERT
, UPDATE
, one-time functions.RECURSIVE
: tree-like stuff.with
WITH
in ActiveRecord.postgres_ext
gem.Inspection. with("t(month)" => <<-EOQ). SELECT '2015-01-01'::date + (s.m || ' MONTHS')::interval FROM generate_series(0, 11) s(m) EOQ select("t.month"). select("COUNT(inspections.id) AS inspections_count"). from("t"). joins(<<-EOQ). LEFT OUTER JOIN inspections ON inspections.inspected_at BETWEEN t.month AND t.month + INTERVAL '1 MONTH' - INTERVAL '1 DAY' AND inspections.restaurant_id = #{id.to_i} EOQ group("t.month"). order("t.month")
Define a FIRST
aggregate function:
def up # https://wiki.postgresql.org/wiki/First/last_%28aggregate%29 connection.execute <<-EOQ CREATE OR REPLACE FUNCTION public.first_agg ( anyelement, anyelement ) RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$ SELECT $1; $$; EOQ connection.execute <<-EOQ CREATE AGGREGATE public.first ( sfunc = public.first_agg, basetype = anyelement, stype = anyelement ); EOQ end
scope :with_latest_score, -> { select("restaurants.*"). select(<<-EOQ). FIRST( inspections.score ORDER BY inspections.inspected_at DESC ) AS latest_score EOQ joins(<<-EOQ). LEFT OUTER JOIN inspections ON inspections.restaurant_id = restaurants.id EOQ group("restaurants.id") }
ORDER BY
.SELECT r.name, latest.inspected_at, latest.score FROM restaurants r LEFT OUTER JOIN LATERAL ( SELECT * FROM inspections i WHERE i.restaurant_id = r.id ORDER BY i.inspected_at DESC LIMIT 1 ) latest ON true ORDER BY latest.score DESC NULLS LAST
ON
is a formality.Restaurant. select(:name). select("latest.inspected_at AS latest_inspected_at"). select("latest.score AS latest_score"). joins(<<-EOQ). LEFT OUTER JOIN LATERAL ( SELECT * FROM inspections i WHERE i.restaurant_id = restaurants.id ORDER BY i.inspected_at DESC LIMIT 1 ) latest ON true EOQ order("latest.score DESC NULLS LAST")
joins
into a scope.Compare each inspection score to that restaurant's average score:
SELECT i.inspected_at, r.name, i.score, AVG(i.score) OVER (PARTITION BY i.restaurant_id) FROM inspections i INNER JOIN restaurants r ON r.id = i.restaurant_id ORDER BY i.inspected_at, r.name
rank
, ntile
, etc.select
.WITH latest AS ( SELECT restaurant_id, rank() OVER ( PARTITION BY restaurant_id ORDER BY inspected_at DESC ) AS rank, first_value(score) OVER ( PARTITION BY restaurant_id ORDER BY inspected_at DESC ) AS score FROM inspections ) SELECT r.name, latest.score FROM restaurants r LEFT OUTER JOIN latest ON latest.restaurant_id = r.id AND latest.rank = 1 ORDER BY latest.score
latest.rank
has to be outside the CTE/subquery).recursive CTEs
more on window functions
json
serialize
: Rails 3 vs 4hstore
arrays
Rails migrations
INSERT INTO . . . RETURNING
Read More:
This talk:
/
#