Postgres string_agg Function

2012-06-28

I discovered yesterday some amazing functions available in PostgreSQL when you’re using a GROUP BY query. I’ve wanted these functions since at least 2003, when I was working mostly with Oracle, and now it turns out that Postgres has them! Imagine you want a report on duplicate item names in your database, so you write a query like this:

SELECT  name, COUNT(*)
FROM    items
GROUP BY name
HAVING  COUNT(*) > 1;

That will show you the duplicate names, but you’d really like a list of their IDs also. You could do this with a sub-query, but what if there were a function that could take all the grouped items and concatenate them? This function would be sort of like SUM, but for strings (or things you could coerce to a string). Even better would be if it works like the join method you find in Perl/Python/Ruby. Well, Postgres has this function:

SELECT  string_agg(id::text, ', '), name, COUNT(*)
FROM    items
GROUP BY name
HAVING  COUNT(*) > 1;

The string_agg function takes two strings: first the strings to join, and second the delimiter. (Here we have to cast id to text because it is an integer type.) So now you can see the individual IDs of your duplicates, which makes it easy to operate on them if that’s what you want. Of course these functions are useful in more cases than just finding duplicates, but finding duplicates is a simple example.

There are similar methods called array_agg and xmlagg to get things in non-string form.

blog comments powered by Disqus Prev: Skipping rescue blocks in Ruby Next: Ruby Recursive/Deep Merge