I discovered yesterday some amazing functions available in PostgreSQL when you’re using a
GROUP BY query. I’ve wanted these functions since at least 2003, when I was working mostly with Oracle, and now it turns out that Postgres has them! Imagine you want a report on duplicate item names in your database, so you write a query like this:
SELECT name, COUNT(*) FROM items GROUP BY name HAVING COUNT(*) > 1;
That will show you the duplicate names, but you’d really like a list of their IDs also. You could do this with a sub-query, but what if there were a function that could take all the grouped items and concatenate them? This function would be sort of like SUM, but for strings (or things you could coerce to a string). Even better would be if it works like the
join method you find in Perl/Python/Ruby. Well, Postgres has this function:
SELECT string_agg(id::text, ', '), name, COUNT(*) FROM items GROUP BY name HAVING COUNT(*) > 1;
string_agg function takes two strings: first the strings to join, and second the delimiter. (Here we have to cast
id to text because it is an integer type.) So now you can see the individual IDs of your duplicates, which makes it easy to operate on them if that’s what you want. Of course these functions are useful in more cases than just finding duplicates, but finding duplicates is a simple example.
There are similar methods called
xmlagg to get things in non-string form.