Custom Postgres Ubuntu Style

2023-09-29

Ubuntu has a very nice way of organizing multiple versions of Postgres. They all get their own directories, and the commands dispatch to the latest version or something else if you set the PGCLUSTER envvar or give a --cluster option. For instance if you have installed Postgres 14, you will see files in /usr/lib/postgresql/14 and /usr/share/postgresql/14.

In Postgres a single installation is called a “cluster”. It has nothing to do with using multiple machines; it’s just the traditional term for the collection of configuration, data files, a postmaster process listening on a given port and its helper processes, etc.

Elsewhere in the postgres world you say initdb to create a cluster. In Ubuntu you say pg_createcluster. By default Ubuntu creates a cluster named main for each version you install. This gives you directories like /etc/postgresql/14/main (for configuration) and /var/lib/postgresql/14/main (for the data). The log file is /var/log/postgresql/postgresql-14-main.log.

If you want to run an old version of pg_dump, you can say PGCLUSTER=10/main pg_dump --version or pg_dump --cluster=10/main --version. Likewise for pg_restore, etc. (but—sidequest spolier alert—not psql or a couple other things: see the footnote for more).

One command that sadly doesn’t support this is pg_config, which is used to build custom extensions. Personally I just patch my local copy (or actually add a patched version earlier in the path, in my ~/bin), like this:

#!/bin/sh

# If postgresql-server-dev-* is installed, call pg_config from the latest
# available one. Otherwise fall back to libpq-dev's version.
#
# (C) 2011 Martin Pitt <mpitt@debian.org>
# (C) 2014-2016 Christoph Berg <myon@debian.org>
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.

set -e
PGBINROOT="/usr/lib/postgresql/"
#redhat# PGBINROOT="/usr/pgsql-"

# MY CHANGES START HERE
if [ -n "$PGCLUSTER" ]; then
  exec "$PGBINROOT/$PGCLUSTER/bin/pg_config" "$@"
fi
# MY CHANGES END HERE

LATEST_SERVER_DEV=`ls -v $PGBINROOT*/bin/pg_config 2>/dev/null|tail -n1`

if [ -n "$LATEST_SERVER_DEV" ]; then
    exec "$LATEST_SERVER_DEV" "$@"
else
    if [ -x /usr/bin/pg_config.libpq-dev ]; then
        exec /usr/bin/pg_config.libpq-dev "$@"
    else
        echo "You need to install postgresql-server-dev-X.Y for building a server-side extension or libpq-dev for building a client-side application." >&2
        exit 1
    fi
fi

Without those changes you can’t build custom C extensions against old versions of Postgres. I’ve mentioned this in the past in this Stackoverflow answer.

But that’s not what this post is about!

This post is about compiling your own Postgres that you can manage like other Postgres versions on Ubuntu. I want an install that includes my temporal patches, so I can convert my timetracking app to use real temporal features. I want the files to live in the normal places, and I want it to start/stop the normal way.

I’ve been hacking on Postgres for many years (Last May someone at PGCon told me I should stop calling myself a newbie. . . .), and I’ve always used ./configure --prefix=~/local ... to keep a dev installation. But I’ve never used it for anything durable. It’s just handy for make installcheck and psql’ing and attaching a debugger. I blow it away all the time with rm -rf ~/local/pgsql/data && ~/local/bin/initdb -D ~/local/pgsql/data. I crash it all the time because that’s how it goes when I’m writing C. ;-) That’s not where my timetracking data should live.

My first attempt was to build Postgres like this:

version=17devel
./configure \
  'CFLAGS=-ggdb -Og -g3 -fno-omit-frame-pointer' \
  --enable-tap-tests --enable-cassert --enable-debug \
  --prefix=/usr/lib/postgresql/${version} \
  --datarootdir=/usr/share/postgresql/${version}
make clean && make world && sudo make install-world

(I might as well keep some dev stuff in there in case I need it.)

Then as the postgres user I tried this:

postgres@tal:~$ pg_createcluster 17devel main
Error: invalid version '17devel'

Alas!

Ubuntu’s multi-version system is controlled by the postgresql-common package, so I got the source for it by running apt-get source postgresql-common. (You might need to uncomment a deb-src line in /etc/apt/sources.list and run sudo apt-get update.) Grepping for “invalid version” I found the message in pg_createcluster from these lines:

my ($version) = $ARGV[0] =~ /^(\d+\.?\d+)$/;
error "invalid version '$ARGV[0]'" unless defined $version;

Instead of fighting with the system I decided to call it version 30. It worked!

Except I had one last problem:

postgres@tal:~$ psql -p 5443
psql: error: connection to server on socket "/tmp/.s.PGSQL.5443" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?

The issue is that the postgresql-common infrastructure dispatches to the latest tools by default, and our “version 30” psql is looking in the wrong place for a socket file. In postgresql.conf you can see this line:

unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories

And taking a peek we have:

paul@tal:~$ ls -A /var/run/postgresql/
10-main.pg_stat_tmp  13-main.pid           9.4-main.pg_stat_tmp  .s.PGSQL.5433.lock  .s.PGSQL.5437       .s.PGSQL.5440.lock
10-main.pid          14-main.pg_stat_tmp   9.4-main.pid          .s.PGSQL.5434       .s.PGSQL.5437.lock  .s.PGSQL.5441
11-main.pg_stat_tmp  14-main.pid           9.5-main.pg_stat_tmp  .s.PGSQL.5434.lock  .s.PGSQL.5438       .s.PGSQL.5441.lock
11-main.pid          15-main.pid           9.5-main.pid          .s.PGSQL.5435       .s.PGSQL.5438.lock  .s.PGSQL.5442
12-main.pg_stat_tmp  30-main.pid           9.6-main.pg_stat_tmp  .s.PGSQL.5435.lock  .s.PGSQL.5439       .s.PGSQL.5442.lock
12-main.pid          9.3-main.pg_stat_tmp  9.6-main.pid          .s.PGSQL.5436       .s.PGSQL.5439.lock  .s.PGSQL.5443
13-main.pg_stat_tmp  9.3-main.pid          .s.PGSQL.5433         .s.PGSQL.5436.lock  .s.PGSQL.5440       .s.PGSQL.5443.lock

(Yeah I run a lot of versions. :-)

This is one way to fix the problem:

postgres@tal:~$ PGCLUSTER=14/main psql -p 5443
psql (17devel)
Type "help" for help.

postgres=#

But that’s too annoying, and the \d commands are going to be broken because they won’t know how to query the latest pg_* tables. (And by the way, why does psql still say it’s 17devel? I haven’t looked into that yet but it’s suspicious.1) And in fact even using PGCLUSTER=30/main psql still works!

I think it’s a bug in this Perl code from /usr/bin/psql:

# if only a port is specified, look for local cluster on specified port
if ($explicit_port and not $version and not $cluster and not $explicit_host and not $explicit_service) {
    LOOP: foreach my $v (reverse get_versions()) {
        foreach my $c (get_version_clusters $v) {
            my $p = get_cluster_port $v, $c;
            if ($p eq $explicit_port) {
                $version = $v;
                # set PGCLUSTER variable for information
                $ENV{PGCLUSTER} = "$version/$c";
                last LOOP;
            }
        }
    }
}

You can see that it sets $version but not $cluster (just $ENV{PGCLUSTER}). Later if $cluster is set then it will look up the correct socket dir, but it’s only set if we’re explicit. Personally I’m fixing this by adding $cluster = $c; right before the $version = $v line. Then we’ll call get_cluster_socketdir below. It might not be 100% correct but it is good enough for my purposes.

So now I have a custom-patched Postgres running on Ubuntu! I see its /etc files, its data files, and its log file. After systemctl daemon-reload I can start it etc. So I think I’m all set. I’d just better re-run ./configure --prefix=~/local before I forget and re-install something broken on top of it. :-)

If I run into more problems, I’ll update this post.


1 Oh, the answer is simple. From /usr/bin/psql:

# if we have no version yet, use the latest version. If we were called as psql,
# pg_archivecleanup, or pg_isready, always use latest version
if (not $version or $cmdname =~ /^(psql|pg_archivecleanup|pg_isready)$/) {
    my $max_version;
    if ($version and $version < 9.2) { # psql 15 only supports PG 9.2+
        $max_version = 14;
    }
    $version = get_newest_version($cmdname, $max_version);
}

But that means most of the last paragraph was wrong. Since the non-self-compiled tools find the socket file just fine, there must be a better solution than patching psql (which is technically pg_wrapper btw). So we are not done. Stay tuned for the, ahem, sequel!

blog comments powered by Disqus Prev: Rails ActionMailer Internals Next: Rails dirty methods