In Postgres development it’s normal for patch attempts to require many revisions and last a long time. I just sent in v17 of my SQL:2011 application time patch. The commitfest entry dates back to summer of 2021, but it’s really a continuation of this thread from 2018. And it’s not yet done.
My work on multiranges is a similar story: 1.5 years from first patch to committed.
Today I saw this post by Julia Evans about problems people have with git rebase (also see the hn discussion), and it reminded me of my struggles handling long-lived branches.
In my early days with git I avoided rebasing, because I wanted the history to be authentic. Nowadays I rebase pretty freely, both to move my commits on top of the latest master branch work and to interactively clean things up so the commits show logical progress (with generous commit messages explaining the motivation and broad design decisions: the “why”).
But in my paid client work, PRs get merged pretty fast. There is nothing like the multi-year wait of Postgres hacking. Often I’ve wished for more history there. It’s not my day job, so it’s hard to remember fine details about something from months or years ago. And I’ve changed direction a couple times, and sometimes I want a way to consult that old history.
But with Postgres you don’t have any choice but to rebase. You send your patch files to a mailing list, and if they don’t apply cleanly no one will look at them. I’ve spent hours and hours rebasing patches because the underlying systems changed before they could get committed.
With multiranges this was tough, but at least it was just one patch file. Application time is a series of five patches, which over time have changed order and evolved from four. When it’s time to send a new version, I run git format-patch, which turns each commit into a .patch file. So I need to wind up with five well-groomed commits rebased on the latest master.
My personal copy of the postgres repo on github has a bunch of silly-named branches for stashing work when I want to change direction, so the history isn’t totally lost. But for a long time I had no system. It feels like when you see a spreadsheet named Annual Report - Copy of Jan 7.bak - final - FINAL.xls. After all these years it’s unmanageable. (Okay, at least I know not to name any Postgres submission “final”! ;-)
I think I finally found a way to keep history that works for me. On my main valid-time branch I keep a series of commits for each small change. I rebase to move them up and down, so that they will squash cleanly into the five commits I need at the end. You can see that I have one main commit for each of the five patches, but each is followed by many commits named fixup pks: fixed this or fixup fks: feedback from so-and-so. I rebase on master every so often. I force-push all the time, since no one else uses the repo. (I do work on both a laptop and a desktop though, so I have to remember to git fetch && git reset --hard origin/valid-time.)
When I’m ready to submit new patches, I take a snapshot with git checkout -b valid-time-v17-pre-squash and “make a backup” with git push -u. Then I make a branch to squash things (git checkout -b valid-time-v17). I do a git rebase -i HEAD~60, press * on pick, type cw fixup, then n.n.n.n.n.n., etc. ’til I have just the five commits. Then I have a script to do a clean build + test on each commit, since I want things to work at every point. While that’s running I write the email about the new patch, and hopefully send it in.
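Pulled together, a submission round looks roughly like this (just a sketch; the branch names are this round’s, and the build-and-test script is my own and not shown here):

git checkout valid-time                       # fine-grained history lives here
git checkout -b valid-time-v17-pre-squash     # snapshot of this round's history
git push -u origin valid-time-v17-pre-squash  # "make a backup"
git checkout -b valid-time-v17                # branch that actually gets squashed
git rebase -i HEAD~60                         # mark the fixup commits, keep the five picks
git format-patch master                       # one .patch file per remaining commit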
So now I’m capturing the fine-grained history that went into each submission, and that won’t change no matter how aggressively I rebase the current work. I’m pretty happy with this flow. I wish I had started years ago.
One git feature I could almost use is git rebase -i --autosquash. (Here are some articles about it.) If your commit messages are named fixup! foo, then git will automatically set those commits to fixup, not pick, and it will move them to just below whatever commit matches foo. I follow this pattern but with fixup not fixup!, to keep it all manual. At first I just didn’t trust it (or myself).
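For reference, here is the workflow --autosquash expects, sketched with one of my main commit titles:

git commit -m "Add temporal PRIMARY KEY and UNIQUE constraints"
# ...later, a follow-up whose title repeats the target commit's title:
git commit -am "fixup! Add temporal PRIMARY KEY and UNIQUE constraints"
# at rebase time git marks it as a fixup and moves it under the matching commit:
git rebase -i --autosquash master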
Now I’m ready to move to this workflow, but I’m not sure how to “match” one of my five main commits. I want a meaningful title (i.e. the first line of the commit message) for each little commit, so I use short abbreviations for the patch they target, e.g. fixup pks: Add documentation for pg_constraint.contemporal column. Git doesn’t know that it should match pks to Add temporal PRIMARY KEY and UNIQUE constraints and ignore everything after the colon. If there were a way to preserve tags after a rebase I think I could tag the main commit as pks and it might work (but maybe not with the extra stuff after the colon).
You can have git generate the new commit message for you with git commit --fixup $sha, but it just copies the whole title verbatim, which is not what I want. Also who wants to remember $sha for those five parent commits? And finally, I want to move these commits into place immediately, so I can build & test against each patch as I work. Git can’t move them for me without squashing them.
The Thoughtbot article linked above says you can use a regex, e.g. git commit --fixup :/pks, but: (1) The regex is used immediately to find the parent, but it gets replaced with that parent’s title. It doesn’t stay in your commit message. (2) If you give an additional commit message, it goes two lines below the fixup! line, so it’s not in the commit title. This only solves having to remember $sha.
What I really want is fixup! ^: blah blah blah where ^ means “the closest non-squashed parent”, and the ^ is resolved at rebase time, not commit time, and everything after the colon is not used for matching. (If it needs to be a regex then :/. is sufficient too.)
Anyway I’m using my manual process for now, since with vim I can change 60 picks to fixup in a few seconds. I’m not willing to lose meaningful titles to save a few seconds with fixup!.
Nonetheless it would be nice to have one less step to remember. I can’t help thinking about how I could make this feature work for me. If someone has a suggestion, please do let me know.
Another approach is “stacked commits”. I went as far as installing git branchless and reading the docs and some articles, but to be honest I never went beyond a few tests, and I haven’t thought about it for a few months. It’s in the back of my head to give it a more honest effort.
When it comes to sending email in Rails, I’ve wondered for years about the gap between this:
class UserMailer < ApplicationMailer
def welcome(user)
@user = user
mail(to: user.email)
end
end
and this:
class User
def send_welcome_notification
UserMailer.welcome(self).deliver_later
end
end
We are defining an instance method, but we are calling a class method. What’s going on there? I finally decided to take a closer look.
Well naturally this is implemented by method_missing. When you call UserMailer.welcome, the class will call your instance method—sort of! Actually method_missing just returns a MessageDelivery object, which provides lazy evaluation. It’s like a promise (but not asynchronous). Your method doesn’t get called until you resolve the “promise,” which normally would happen when you say deliver_now. You can also call #message which must resolve the promise (and returns whatever your method returned—sort of!).
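A quick console sketch of that laziness (assume user is an existing record):

delivery = UserMailer.welcome(user)  # method_missing fires; welcome has NOT run yet
delivery.class                       # => ActionMailer::MessageDelivery
delivery.message                     # resolves the "promise": welcome runs now
delivery.deliver_now                 # delivers the Mail::Message it built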
What if you say deliver_later? That still doesn’t call your method. Instead it queues up a job, and later that will say deliver_now to finally call your method.
But if you’re using Sidekiq (with config.active_job.queue_adapter = :sidekiq), you might wonder how that #welcome method works, since we’re passing a User object and Sidekiq can only serialize primitive types. But it does work! The trick is that Rails’ queue adapter for Sidekiq does its own serialization before handing off the job to Sidekiq, and it tells Sidekiq to run its own Worker subclass that will deserialize things correctly.
All this assumes that your mailer method returns a Mail::Message instance. That’s what #mail is giving you. But what if you don’t? What if you call mail but not as the last line of your method? What if you call it more than once?
Well actually #mail (linking to the source code this time) remembers the message it generated, so even if you don’t return that from your own method, Rails will still send it properly. In fact it doesn’t matter what your own method returns!
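For example, this variation still sends the right email, even though the method’s return value is an unrelated string:

class UserMailer < ApplicationMailer
  def welcome(user)
    @user = user
    mail(to: user.email)    # builds and remembers the Mail::Message
    "not a Mail::Message"   # return value is ignored; Rails sends the remembered message
  end
end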
And if you call #mail multiple times, then Rails will return early and do nothing for the second and third calls—sort of! If you pass any arguments or a block, then Rails will evaluate it again. But it still only knows how to store one Message. So when you finally call deliver_now, only one email will go out (ask me how I know).
Btw it turns out this is pretty much all documented on the ActionMailer::Base class, but it’s not really covered in the Rails Guide, so I never came across it. I only found those docs when I decided to read the code. I don’t know if other Rails devs spend much time reading Rails’ own code, but I’ve found it helpful again and again. It’s not hard and totally worth it!
Another trick I’ve used for years is bundle show actionmailer (or in the old days cd $(bundle show actionmailer), before they broke that with a deprecation notice), and then you can add pp or binding.pry wherever you like. It’s a great way to test your understanding of what’s happening or discover the internals of something.
Ubuntu has a very nice way of organizing multiple versions of Postgres. They all get their own directories, and the commands dispatch to the latest version or something else if you set the PGCLUSTER envvar or give a --cluster option. For instance if you have installed Postgres 14, you will see files in /usr/lib/postgresql/14 and /usr/share/postgresql/14.
In Postgres a single installation is called a “cluster”. It has nothing to do with using multiple machines; it’s just the traditional term for the collection of configuration, data files, a postmaster process listening on a given port and its helper processes, etc.
Elsewhere in the postgres world you say initdb to create a cluster. In Ubuntu you say pg_createcluster. By default Ubuntu creates a cluster named main for each version you install. This gives you directories like /etc/postgresql/14/main (for configuration) and /var/lib/postgresql/14/main (for the data). The log file is /var/log/postgresql/postgresql-14-main.log.
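The postgresql-common package also gives you commands to list and control those clusters, for example:

pg_lsclusters                       # version, cluster name, port, status, owner, data dir, log file
sudo pg_ctlcluster 14 main restart  # start/stop/restart one specific version+cluster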
If you want to run an old version of pg_dump, you can say PGCLUSTER=10/main pg_dump --version or pg_dump --cluster=10/main --version. Likewise for pg_restore, etc. (but—sidequest spoiler alert—not psql or a couple other things: see the footnote for more).
One command that sadly doesn’t support this is pg_config, which is used to build custom extensions. Personally I just patch my local copy (or actually add a patched version earlier in the path, in my ~/bin), like this:
#!/bin/sh
# If postgresql-server-dev-* is installed, call pg_config from the latest
# available one. Otherwise fall back to libpq-dev's version.
#
# (C) 2011 Martin Pitt <mpitt@debian.org>
# (C) 2014-2016 Christoph Berg <myon@debian.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
set -e
PGBINROOT="/usr/lib/postgresql/"
#redhat# PGBINROOT="/usr/pgsql-"
# MY CHANGES START HERE
if [ -n "$PGCLUSTER" ]; then
exec "$PGBINROOT/$PGCLUSTER/bin/pg_config" "$@"
fi
# MY CHANGES END HERE
LATEST_SERVER_DEV=`ls -v $PGBINROOT*/bin/pg_config 2>/dev/null|tail -n1`
if [ -n "$LATEST_SERVER_DEV" ]; then
exec "$LATEST_SERVER_DEV" "$@"
else
if [ -x /usr/bin/pg_config.libpq-dev ]; then
exec /usr/bin/pg_config.libpq-dev "$@"
else
echo "You need to install postgresql-server-dev-X.Y for building a server-side extension or libpq-dev for building a client-side application." >&2
exit 1
fi
fi
Without those changes you can’t build custom C extensions against old versions of Postgres. I’ve mentioned this in the past in this Stackoverflow answer.
But that’s not what this post is about!
This post is about compiling your own Postgres that you can manage like other Postgres versions on Ubuntu. I want an install that includes my temporal patches, so I can convert my timetracking app to use real temporal features. I want the files to live in the normal places, and I want it to start/stop the normal way.
I’ve been hacking on Postgres for many years (last May someone at PGCon told me I should stop calling myself a newbie…), and I’ve always used ./configure --prefix=~/local ... to keep a dev installation. But I’ve never used it for anything durable. It’s just handy for make installcheck and psql’ing and attaching a debugger. I blow it away all the time with rm -rf ~/local/pgsql/data && ~/local/bin/initdb -D ~/local/pgsql/data. I crash it all the time because that’s how it goes when I’m writing C. ;-) That’s not where my timetracking data should live.
My first attempt was to build Postgres like this:
version=17devel
./configure \
'CFLAGS=-ggdb -Og -g3 -fno-omit-frame-pointer' \
--enable-tap-tests --enable-cassert --enable-debug \
--prefix=/usr/lib/postgresql/${version} \
--datarootdir=/usr/share/postgresql/${version}
make clean && make world && sudo make install-world
(I might as well keep some dev stuff in there in case I need it.)
Then as the postgres user I tried this:
postgres@tal:~$ pg_createcluster 17devel main
Error: invalid version '17devel'
Alas!
Ubuntu’s multi-version system is controlled by the postgresql-common package, so I got the source for it by running apt-get source postgresql-common. (You might need to uncomment a deb-src line in /etc/apt/sources.list and run sudo apt-get update.) Grepping for “invalid version” I found the message in pg_createcluster from these lines:
my ($version) = $ARGV[0] =~ /^(\d+\.?\d+)$/;
error "invalid version '$ARGV[0]'" unless defined $version;
Instead of fighting with the system I decided to call it version 30. It worked!
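So, roughly (a sketch; pg_createcluster picks the next free port if you don’t pass one, and it expects the binaries under /usr/lib/postgresql/30):

pg_createcluster 30 main     # creates /etc/postgresql/30/main and /var/lib/postgresql/30/main
pg_ctlcluster 30 main start  # start it the normal Ubuntu way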
Except I had one last problem:
postgres@tal:~$ psql -p 5443
psql: error: connection to server on socket "/tmp/.s.PGSQL.5443" failed: No such file or directory
Is the server running locally and accepting connections on that socket?
The issue is that the postgresql-common infrastructure dispatches to the latest tools by default, and our “version 30” psql is looking in the wrong place for a socket file. In postgresql.conf you can see this line:
unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
And taking a peek we have:
paul@tal:~$ ls -A /var/run/postgresql/
10-main.pg_stat_tmp 13-main.pid 9.4-main.pg_stat_tmp .s.PGSQL.5433.lock .s.PGSQL.5437 .s.PGSQL.5440.lock
10-main.pid 14-main.pg_stat_tmp 9.4-main.pid .s.PGSQL.5434 .s.PGSQL.5437.lock .s.PGSQL.5441
11-main.pg_stat_tmp 14-main.pid 9.5-main.pg_stat_tmp .s.PGSQL.5434.lock .s.PGSQL.5438 .s.PGSQL.5441.lock
11-main.pid 15-main.pid 9.5-main.pid .s.PGSQL.5435 .s.PGSQL.5438.lock .s.PGSQL.5442
12-main.pg_stat_tmp 30-main.pid 9.6-main.pg_stat_tmp .s.PGSQL.5435.lock .s.PGSQL.5439 .s.PGSQL.5442.lock
12-main.pid 9.3-main.pg_stat_tmp 9.6-main.pid .s.PGSQL.5436 .s.PGSQL.5439.lock .s.PGSQL.5443
13-main.pg_stat_tmp 9.3-main.pid .s.PGSQL.5433 .s.PGSQL.5436.lock .s.PGSQL.5440 .s.PGSQL.5443.lock
(Yeah I run a lot of versions. :-)
This is one way to fix the problem:
postgres@tal:~$ PGCLUSTER=14/main psql -p 5443
psql (17devel)
Type "help" for help.
postgres=#
But that’s too annoying, and the \d commands are going to be broken because they won’t know how to query the latest pg_* tables. (And by the way, why does psql still say it’s 17devel? I haven’t looked into that yet but it’s suspicious.1) And in fact even using PGCLUSTER=30/main psql still works!
I think it’s a bug in this Perl code from /usr/bin/psql:
# if only a port is specified, look for local cluster on specified port
if ($explicit_port and not $version and not $cluster and not $explicit_host and not $explicit_service) {
LOOP: foreach my $v (reverse get_versions()) {
foreach my $c (get_version_clusters $v) {
my $p = get_cluster_port $v, $c;
if ($p eq $explicit_port) {
$version = $v;
# set PGCLUSTER variable for information
$ENV{PGCLUSTER} = "$version/$c";
last LOOP;
}
}
}
}
You can see that it sets $version but not $cluster (just $ENV{PGCLUSTER}). Later if $cluster is set then it will look up the correct socket dir, but it’s only set if we’re explicit. Personally I’m fixing this by adding $cluster = $c; right before the $version = $v line. Then we’ll call get_cluster_socketdir below. It might not be 100% correct but it is good enough for my purposes.
So now I have a custom-patched Postgres running on Ubuntu! I see its /etc files, its data files, and its log file. After systemctl daemon-reload I can start it etc. So I think I’m all set. I’d just better re-run ./configure --prefix=~/local before I forget and re-install something broken on top of it. :-)
If I run into more problems, I’ll update this post.
1 Oh, the answer is simple. From /usr/bin/psql:
# if we have no version yet, use the latest version. If we were called as psql,
# pg_archivecleanup, or pg_isready, always use latest version
if (not $version or $cmdname =~ /^(psql|pg_archivecleanup|pg_isready)$/) {
my $max_version;
if ($version and $version < 9.2) { # psql 15 only supports PG 9.2+
$max_version = 14;
}
$version = get_newest_version($cmdname, $max_version);
}
But that means most of the last paragraph was wrong. Since the non-self-compiled tools find the socket file just fine, there must be a better solution than patching psql (which is technically pg_wrapper btw). So we are not done. Stay tuned for the, ahem, sequel!
Rails has lots of methods to see what attributes have changed on your model. Some tell you the changes you haven’t yet saved; some, the changes you just saved. But the behavior and names of these attributes have changed over time.
I thought I had a handle on this until I saw saved_change_to_attribute? and wondered how it differs from attribute_previously_changed?. Turns out they are identical!
Well, sort of. The spelling I’m used to, attribute_previously_changed?, comes from ActiveModel::Dirty (and is a bit older), whereas saved_change_to_attribute? is defined in ActiveRecord::AttributeMethods::Dirty. Not all ActiveModels are ActiveRecords. But in your ActiveRecord classes, they do the same thing.
I’ve linked to Rails 6.1 here. They were nearly identical before that, but for a while one took extra options and the other didn’t. You have to go back to Rails 5.0 to get a more substantial difference, when we had attribute_previously_changed? but not saved_change_to_attribute?. They are still identical today in Rails 7. I’m surprised they don’t deprecate the ActiveRecord methods and just use ActiveModel.
Just to give a quick catalog, here is the full set of methods. Anywhere you see attribute you can replace it with the name of the attribute you care about (which just calls the generic method with its name as parameter).
before you save:
changes
changed_attributes # can't replace "attribute"
attribute_change
attribute_changed?
attribute_was
changes_to_save
has_changes_to_save?
attributes_in_database # can't replace "attribute"
attribute_in_database
changed_attribute_names_to_save # can't replace "attribute"
attribute_change_to_be_saved
will_save_change_to_attribute?
after you save:
previous_changes
attribute_previous_change
attribute_previously_changed?
attribute_previously_was
saved_changes
saved_changes?
saved_change_to_attribute
saved_change_to_attribute?
attribute_before_last_save
I’ve grouped the methods from each file, and you can see there are many synonyms.
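Here is a quick sketch of how the two halves pair up, assuming a User model with a status column whose saved value starts out as "pending":

user = User.find(1)
user.status = "active"

# before saving:
user.status_changed?              # => true
user.status_was                   # => "pending"
user.will_save_change_to_status?  # => true

user.save!

# after saving:
user.status_changed?              # => false, nothing is pending anymore
user.status_previously_changed?   # => true  (ActiveModel spelling)
user.saved_change_to_status?      # => true  (ActiveRecord spelling, same answer)
user.saved_change_to_status       # => ["pending", "active"]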
By the way, if you are making heavy use of ActiveRecord callbacks and using these methods to trigger them (e.g. after_commit :send_shipped_notification, if: :shipped_at_previously_changed?), watch out! The conditions on these get evaluated one-by-one, so if some earlier callback saves further changes to the model, your old previous_changes are lost! The callback you expect to get called just doesn’t. I’ve had to debug that failure way too many times.
This blog post is a survey of SQL:2011 Temporal features from MariaDB, IBM DB2, Oracle, and MS SQL Server. I’m working on adding temporal features to Postgres, so I wanted to see how other systems interpret the standard.
If you’re new to temporal databases, you also might enjoy this talk I gave at PGCon 2019.
In this post I cover both application-time (aka valid-time) and system-time, but I focus more on valid-time. Valid-time tracks the history of the thing “out there”, e.g. when a house was remodeled, when an employee got a raise, etc. System-time tracks the history of when you changed the database. In general system-time is more widely available, both as native SQL:2011 features and as extensions/plugins/etc., but is less interesting. It is great for compliance/auditing, but you’re unlikely to build application-level features on it. Also since it’s generated automatically you don’t need special DML commands for it, and it is less important to protect yourself with temporal primary and foreign keys.
At this point all the major systems I survey have some temporal support, although none of them support it completely. On top of that the standard itself is quite modest, although in some ways it can be interpreted more or less expansively.
I’ll start by giving a quick overview of the standard. Here I’m working from the draft documents (downloaded from here), and my interpretation may not be correct. If you have any corrections please let me know! Also you can find a more complete description of the standard at this article by Kulkarni and Michels (pdf).
In SQL:2011 the gateway to temporal features is a PERIOD, which is something you declare on your table. It is a range-like structure derived from two existing date columns. (Actually the standard also supports timestamp and timestamp with time zone, but I’ll use date as a synecdoche throughout this post.)
You can declare a valid-time PERIOD when you create the table or afterwards:
CREATE TABLE t (
id INTEGER,
valid_from DATE,
valid_til DATE,
PERIOD FOR valid_at (valid_from, valid_til)
);
You can call the PERIOD whatever you like except SYSTEM_TIME, which is magical and enables system-time features. Both of the PERIOD’s source columns must be NOT NULL, and if not they are automatically converted to it. (Most databases do the same thing with a PRIMARY KEY.) Note that the NOT NULL requirement means to represent “forever” or “until further notice” you must use a sentinel value like 3000-01-01.
Naturally a PERIOD adds an implicit constraint that valid_from must be less than valid_til.
You can also define a SYSTEM_TIME period and ask the database to track changes for you:
CREATE TABLE t (
id INTEGER,
sys_from TIMESTAMP GENERATED ALWAYS AS ROW START,
sys_til TIMESTAMP GENERATED ALWAYS AS ROW END,
PERIOD FOR system_time (sys_from, sys_til)
) WITH SYSTEM VERSIONING;
Technically the standard lets you use DATE columns for system-time periods, but it’s hard to imagine how that would work in practice. Really anything short of the RDBMS’s finest granularity could “squeeze out” some history.
If you have a valid-time PERIOD then you can declare a temporal primary key when you create the table:
CREATE TABLE t (
id INTEGER,
valid_from DATE,
valid_til DATE,
PERIOD FOR valid_at (valid_from, valid_til),
CONSTRAINT tpk_t PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
A temporal primary key is a lot like a normal primary key, except the scalar part (here just id) does not have to be unique, as long as rows with the same key don’t overlap in time. In other words you can give product 5 one price today and another tomorrow, and there’s no contradiction. But if you have two rows with the same scalar key covering the same date, that’s a violation of temporal entity integrity.
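For example, with the temporal primary key above (illustrative standard syntax; the dates are made up):

-- Product 5 can have two rows because their periods don't overlap:
INSERT INTO t (id, valid_from, valid_til) VALUES (5, DATE '2020-01-01', DATE '2021-01-01');
INSERT INTO t (id, valid_from, valid_til) VALUES (5, DATE '2021-01-01', DATE '2022-01-01');
-- But this row overlaps the first one, so it violates temporal entity integrity:
INSERT INTO t (id, valid_from, valid_til) VALUES (5, DATE '2020-06-01', DATE '2020-09-01');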
Temporal referential integrity is like ordinary referential integrity, except the non-unique nature of temporal primary keys makes it trickier. In a temporal foreign key, the child row’s lifespan must be completely “covered” by one (or more!) rows in the parent table. In other words some parent record must exist for every moment the child record exists. You can declare a temporal foreign key between two tables that both have PERIODs, e.g.:
CREATE TABLE ch (
id INTEGER,
valid_from DATE,
valid_til DATE,
t_id INTEGER,
PERIOD FOR valid_at (valid_from, valid_til),
CONSTRAINT tpk_ch PRIMARY KEY (id, valid_at WITHOUT OVERLAPS),
CONSTRAINT tfk_ch_to_t FOREIGN KEY (t_id, PERIOD valid_at)
REFERENCES t (id, PERIOD valid_at)
);
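To make “covered” concrete (again illustrative; suppose the only parent rows for id 5 span all of 2020):

-- Fine: the child's whole lifespan falls inside the parent's coverage:
INSERT INTO ch (id, valid_from, valid_til, t_id)
VALUES (1, DATE '2020-02-01', DATE '2020-03-01', 5);
-- Violates the temporal foreign key: the child outlives the parent's coverage:
INSERT INTO ch (id, valid_from, valid_til, t_id)
VALUES (2, DATE '2020-06-01', DATE '2022-01-01', 5);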
A PERIOD is not included in the projection when you SELECT * FROM t. It is questionable whether you can project it explicitly with SELECT *, valid_at FROM t, but since it’s not a full-fledged data type I’d say probably not.
Also you can’t reference a PERIOD in most other contexts, e.g. as a function input, or a GROUP BY criterion, or when ORDERing, or joining. You can use it in a “period predicate”, which lets you test these period relationships: CONTAINS, OVERLAPS, EQUALS, PRECEDES, SUCCEEDS, IMMEDIATELY PRECEDES, and IMMEDIATELY SUCCEEDS. Either side of the relationship can use a previously-named PERIOD or an anonymous dynamically-constructed one, e.g.
x.valid_at OVERLAPS PERIOD(y.valid_from, y.valid_til)
It’s not clear to me where you can use a period predicate, although the standard groups it with other kinds of predicate under the <predicate> object, so maybe anywhere you like? This browsable BNF grammar makes it easy to see that a <predicate> can go anywhere that accepts a boolean expression, which can be used in a <search condition>, which is what you put into your WHERE clause, or a join’s ON, or a CASE WHEN, or lots of other places. If you have a firmer read of the standard here, let me know!
Also there is a special syntax for querying based on system-time. The standard doesn’t mention using it for valid-time, although you could imagine doing it:
SELECT * FROM t FOR SYSTEM_TIME AS OF t1
SELECT * FROM t FOR SYSTEM_TIME BETWEEN t1 AND t2
SELECT * FROM t FOR SYSTEM_TIME BETWEEN SYMMETRIC t1 AND t2
SELECT * FROM t FOR SYSTEM_TIME FROM t1 TO t2
If you ask for a limited time range, the start/end columns do not get truncated to match your request. In other words, if you query FOR SYSTEM_TIME BETWEEN '2000-01-01' AND '2020-01-01', your result records’ sys_til attributes are still 3000-01-01 (or whatever your sentinel is).
In UPDATE and DELETE commands you can restrict the timespan you want changed:
UPDATE t
FOR PORTION OF valid_at FROM t1 TO t2
SET ...
...
and
DELETE FROM t
FOR PORTION OF valid_at FROM t1 TO t2
...
These commands may require special transformations if they “hit” only part of an existing record. For example if you delete the middle of a longer timespan, then you need to replace the old big record with your new version plus two short records (one on each end). An update is the same: after changing the targeted portion, you’d have to insert new records to preserve each end of the original. The standard gives careful instructions here: the RDBMS should include these extra inserts within the “primary effect” of the operation.
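For instance (illustrative; assume t also has a foo column and one row for id 1 covering 2010 through 2020):

UPDATE t
FOR PORTION OF valid_at FROM DATE '2015-01-01' TO DATE '2016-01-01'
SET foo = 'bar'
WHERE id = 1;

-- Afterwards the single row has become three:
--   [2010-01-01, 2015-01-01)  foo unchanged  (implicit INSERT)
--   [2015-01-01, 2016-01-01)  foo = 'bar'    (the UPDATE itself)
--   [2016-01-01, 2020-01-01)  foo unchanged  (implicit INSERT)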
There is no need for any special syntax for INSERT, nor for special transformations.
The standard doesn’t have anything to say about a MERGE statement (in Postgres ON CONFLICT DO UPDATE), except in the case of system-time tables, where there is no new syntax and it does what you’d expect.
Since a PERIOD is attached to a table and isn’t part of the relational model, it isn’t part of a result set. It gets lost when you query a table. That makes it hard to query non-table temporal data, like views, subqueries, CTEs, and set-returning functions. (This was a major criticism of the original TSQL2 proposal from the 90s.) Nonetheless I can imagine how SQL:2011 leaves open some workarounds, e.g. by letting you use anonymous PERIODs inside period predicates, and letting you use period predicates as widely as possible. Also you could argue that projecting a PERIOD is unnecessary since you already have the start and end columns. So if an RDBMS gave you deep support for period predicates, composing temporal results would still be possible—albeit awkward. In practice though, no one does this, as we will see.
SQL:2011 also has no support for joining temporal results. You can effect an inner join with the OVERLAPS operator, but not the other kinds.
Snodgrass suggested that temporal databases should “coalesce” results before presenting them or at least before saving them to a table. Coalescing means that when two rows have adjacent or overlapping timespans and all other attributes are identical, they get merged to become just one row. Duplicates are removed. This seems like good behavior, both for clarity and to avoid cutting up your data more and more finely as time goes on, but SQL:2011 doesn’t mention it.
There is also no explicit mention of how triggers combine with the new temporal DML operations.
MySQL doesn’t support any temporal features, but recent versions of MariaDB have started to add support. Version 10.3.4 (released Jan 2018) included system-time support; Version 10.4.3 (Feb 2019), valid-time.
MariaDB supports the normal syntax for declaring system-time tables, but you can also use this abbreviated syntax if you like:
CREATE TABLE t (
id INT
) WITH SYSTEM VERSIONING;
That will automatically add pseudo-columns named ROW_START and ROW_END (which also don’t appear in SELECT * FROM t).
Or the standard syntax works too:
CREATE TABLE t (
id INTEGER,
sys_from TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
sys_til TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (sys_from, sys_til)
) WITH SYSTEM VERSIONING;
Either way, for a timestamp(6) column (which is what the docs use) it looks like the max future date is 2038:
MariaDB [temporal]> insert into t (id) values (2);
Query OK, 1 row affected (0.008 sec)
MariaDB [temporal]> select * from t2;
+------+----------------------------+----------------------------+
| id | valid_from | valid_til |
+------+----------------------------+----------------------------+
| 2 | 2019-07-27 17:07:51.849190 | 2038-01-18 19:14:07.999999 |
+------+----------------------------+----------------------------+
1 row in set (0.004 sec)
That seems awfully soon to me.
You can use these three ways of asking for system-time filters:
SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-01';
SELECT * FROM t FOR SYSTEM_TIME FROM '2020-01-01' TO '2030-01-01';
SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-01' AND '2030-01-01';
MariaDB doesn’t know about BETWEEN SYMMETRIC.
You can also say FOR SYSTEM_TIME ALL, which is useful because the default (with no FOR SYSTEM_TIME at all) is to filter AS OF NOW().
MariaDB partially addresses the composability problem by letting you say FOR SYSTEM_TIME against a view, which “pushes down” the filter to the underlying tables. This even works if the view queries some non-system-time tables. Since every system-time PERIOD is named the same thing, the database can sensibly interpret FOR SYSTEM_TIME against your view.
To prevent tables getting too large, you can automatically partition a table by its SYSTEM_TIME:
CREATE TABLE t (
id INT
) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME (
PARTITION p_hist HISTORY,
PARTITION p_curr CURRENT
);
That will keep current records in one partition and historical records in another. You can also have multiple historical partitions and ask the system to switch to the next one every n rows. You can also drop older partitions to keep your data growth under control.
To further economize on disk, you can qualify specific columns as WITHOUT SYSTEM VERSIONING to exclude them from history.
Declaring an application-time PERIOD works, but you can’t include a temporal PRIMARY KEY:
CREATE TABLE t (
id INTEGER,
valid_from DATE,
valid_til DATE,
PERIOD FOR valid_at (valid_from, valid_til),
-- This next line breaks!:
CONSTRAINT tpk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
Naturally you can’t create temporal foreign keys either.
If you omit the NOT NULL for the PERIOD source columns (as above), they become NOT NULL automatically.
In UPDATE and DELETE statements you can use FOR PORTION OF valid_at, per the standard. You can’t use an anonymous period:
UPDATE t
FOR PORTION OF PERIOD (valid_from, valid_til)
SET ...
(It’s hard to imagine why you’d want to though.)
You can’t SELECT a period, named or anonymous:
SELECT * FROM t;
SELECT *, valid_at FROM t;
SELECT *, PERIOD (valid_from, valid_til) FROM t;
In a SELECT you can’t use FOR valid_at to filter things. That’s a little sad but perhaps understandable since arguably the standard only requires FOR SYSTEM_TIME. But period predicates don’t work either. These were all errors for me:
SELECT * FROM t WHERE valid_at CONTAINS '2020-01-01';
SELECT * FROM t WHERE valid_at OVERLAPS PERIOD('2020-01-01', '2030-01-01');
So if you want to ask questions about your valid-time history, you need to query against the scalar date columns.
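For instance, to find the rows valid on a given day you would write something like this (assuming the usual half-open [valid_from, valid_til) interpretation):

SELECT *
FROM t
WHERE valid_from <= '2020-01-01'
  AND valid_til  >  '2020-01-01';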
You can declare triggers on valid-time tables, and the triggers do fire for the extra inserts. Here is what I did to test things:
CREATE TABLE thist (
id INTEGER,
old_valid_from DATE,
old_valid_til DATE,
new_valid_from DATE,
new_valid_til DATE, op CHAR(1)
);
CREATE TRIGGER tins AFTER INSERT ON t
FOR EACH ROW
INSERT INTO thist VALUES
(NEW.id, NULL, NULL, NEW.valid_from, NEW.valid_til, 'i');
CREATE TRIGGER tupd AFTER UPDATE ON t
FOR EACH ROW
INSERT INTO thist VALUES
(NEW.id, OLD.valid_from, OLD.valid_til, NEW.valid_from, NEW.valid_til, 'u');
CREATE TRIGGER tdel AFTER DELETE ON t
FOR EACH ROW
INSERT INTO thist VALUES
(OLD.id, OLD.valid_from, OLD.valid_til, NULL, NULL, 'd');
If you UPDATE in the middle of a larger record, you get two INSERTs for the unaltered ends followed by an UPDATE of the middle. (The INSERTs come first.) The NEW.valid_from and NEW.valid_til mark the part that is being inserted/updated, as you’d expect.
If you DELETE in the middle of a larger record, you also get two INSERTs followed by a DELETE of the part you touched. In the delete trigger the OLD.valid_{from,til} columns have their actual old values, not the slice you’re deleting. (This probably makes sense, but it feels a little too mechanical/literal. It means your DELETE trigger doesn’t know what slice of history you’re actually removing.)
You can also define bitemporal tables!
DB2 has the fullest temporal support of all the databases I examined. My tests used version 11.5.0.0 on Linux.
System-time works with a few syntax differences:
CREATE TABLE t (
id INTEGER NOT NULL PRIMARY KEY,
sys_from TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
sys_til TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
PERIOD SYSTEM_TIME (sys_from, sys_til)
);
You have to omit WITH SYSTEM VERSIONING, and you have to explicitly make the period source columns NOT NULL. Also you say GENERATED ALWAYS AS ROW BEGIN, not GENERATED ALWAYS AS ROW START. Finally it is PERIOD SYSTEM_TIME, not PERIOD FOR SYSTEM_TIME.
The sentinel for “forever” is 9999-12-30-00.00.00.000000000000.
DB2 supports many valid-time features—but only if you name the period BUSINESS_TIME. At IBM, it’s always business time! (I am shamelessly stealing this joke from my audience at PGCon 2019.)
Valid-time periods have the same syntax quirks as system-time.
You can define temporal primary keys!
According to the docs you can define temporal foreign keys, but I couldn’t make it work:
db2 => create table t2 (id integer not null, valid_from date not null, valid_til date not null, \
db2 (cont.) => t_id integer, period business_time (valid_from, valid_til), \
db2 (cont.) => constraint t2pk primary key (id, business_time without overlaps), \
db2 (cont.) => constraint tfk foreign key (t_id, period business_time) \
db2 (cont.) => references t (id, period business_time));
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0104N An unexpected token "business_time" was found following "gn key
(t_id, period". Expected tokens may include: "<space>". SQLSTATE=42601
Someone else can’t make it work either, according to this forum thread.
ALTER TABLE failed for me too:
db2 => create table t2 (id integer not null, valid_from date not null, valid_til date not null, \
db2 (cont.) => t_id integer, period business_time (valid_from, valid_til), \
db2 (cont.) => constraint t2pk primary key (id, business_time without overlaps));
DB20000I The SQL command completed successfully.
db2 => alter table t2 add constraint tfk foreign key (t_id, period business_time) \
db2 (cont.) => references t (id, period business_time);
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0104N An unexpected token "business_time" was found following "gn key
(t_id, period". Expected tokens may include: "<space>". SQLSTATE=42601
If I learn a way to make it work, I’ll update this article.
As usual SELECT * FROM t does not give you the period, and SELECT *, valid_at FROM t is an error. Periods are not first-class types.
DB2 generously interprets the standard and lets you use the system-time SELECT syntax for application-time too:
SELECT * FROM t FOR business_time FROM t1 TO t2
SELECT * FROM t FOR business_time AS OF t1
but not:
SELECT * FROM t FOR business_time BETWEEN t1 AND t2
I couldn’t get any of the period predicates to work, e.g.:
db2 => select * from t where business_time contains '2015-01-01';
SQL0104N An unexpected token "contains" was found following "where
business_time". Expected tokens may include: "CONCAT". SQLSTATE=42601
I also couldn’t do anything creative with anonymous periods, e.g.:
db2 => select * from t for period(valid_from, valid_til) as of '2015-01-01';
SQL0104N An unexpected token "period" was found following "select * from t
for". Expected tokens may include: "<space>". SQLSTATE=42601
IBM doesn’t even care if you call it business_time:
db2 => select * from t for period business_time(valid_from, valid_til) as of '2015-01-01';
SQL0104N An unexpected token "period business_time" was found following
"select * from t for". Expected tokens may include: "<space>".
SQLSTATE=42601
That means temporal features are going to break down when used with views, subqueries, CTEs, and set-returning functions. A period is tied to a table, but not a result set.
IBM DML is pretty standard. You can UPDATE or DELETE FOR PORTION OF BUSINESS_TIME FROM '2010-06-01' TO '2010-06-15'. The extra INSERTs happen as expected.
Like MariaDB, DB2 does call triggers for the derived INSERTs. Here is some setup to add a row to thist whenever a trigger gets called:
create table thist (id integer, old_valid_from date, old_valid_til date, new_valid_from date, new_valid_til date, op char(1));
create trigger tins after insert on t referencing new as new \
for each row insert into thist values \
(NEW.id, null, null, NEW.valid_from, NEW.valid_til, 'i');
create trigger tupd after update on t referencing old as old new as new \
for each row insert into thist values \
(NEW.id, OLD.valid_from, OLD.valid_til, NEW.valid_from, NEW.valid_til, 'u');
create trigger tdel after delete on t referencing old as old \
for each row insert into thist values \
(OLD.id, OLD.valid_from, OLD.valid_til, null, null, 'd');
If we UPDATE FOR PORTION OF in the middle of a larger record, our INSERT trigger is called twice:
db2 => update t \
db2 (cont.) => for portion of business_time \
db2 (cont.) => from '2015-01-01' to '2016-01-01' \
db2 (cont.) => set foo = 'bar';
DB20000I The SQL command completed successfully.
db2 => select * from t;
ID VALID_FROM VALID_TIL FOO
1 01/01/2015 01/01/2016 bar
1 01/01/2020 01/01/2030 -
1 01/01/2010 01/01/2015 -
1 01/01/2016 01/01/2020 -
4 record(s) selected.
db2 => select * from thist;
ID OLD_VALID_FROM OLD_VALID_TIL NEW_VALID_FROM NEW_VALID_TIL OP
1 - - 01/01/2010 01/01/2015 i
1 - - 01/01/2016 01/01/2020 i
1 01/01/2010 01/01/2020 01/01/2015 01/01/2016 u
3 record(s) selected.
Bitemporal works too!
For my tests I used Oracle 19c (version 19.3) for Linux and ran it on CentOS 7.
Oracle has its own way of tracking table history, so it doesn’t bother with SQL:2011 system-time.
Oracle lets you declare a PERIOD, but like MariaDB you can’t define a temporal primary key:
CREATE TABLE t (
id INTEGER,
valid_from DATE,
valid_til DATE,
PERIOD FOR valid_at (valid_from, valid_til),
-- This next line breaks!:
CONSTRAINT tpk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);
Of course that means no foreign keys either.
One interesting thing is that a PERIOD doesn’t force your columns to NOT NULL:
SQL> desc t;
Name Null? Type
ID NUMBER(38)
VALID_FROM DATE
VALID_TIL DATE
And that’s because nulls are allowed in PERIOD-source columns:
SQL> insert into t values (6, null, null);
1 row created.
SQL> select * from t where id = 6;
ID VALID_FRO VALID_TIL
6
When you say SELECT * FROM t you don’t get the period. You also can’t say this either, but in Oracle’s case it’s a parser error:
SELECT *, valid_at FROM t;
This doesn’t work either:
SELECT *, 1+1 FROM t;
But if you avoid the * you can select it!:
SQL> SELECT id, valid_from, valid_til, valid_at FROM t;
ID VALID_FRO VALID_TIL VALID_AT
1 01-JAN-00 01-JAN-30 33426
2 01-JAN-10 01-JAN-30 33426
3 01-JAN-20 01-JAN-30 33426
4 01-JAN-00 01-JAN-10 33426
The result doesn’t mean much to me though. Anyone have any ideas?
Like in DB2 you are able to filter by a valid-time period, although the syntax is a little non-standard (and wordy):
SQL> SELECT * FROM t
2 AS OF PERIOD FOR valid_at DATE '2005-01-01';
ID VALID_FRO VALID_TIL
1 01-JAN-00 01-JAN-30
4 01-JAN-00 01-JAN-10
6
Incidentally, you can see here that NULL in a period means “unbounded”. You can also make just one of the bounds NULL, and AS OF queries give the expected results. This is just like Postgres ranges! If Oracle does this for PERIODs, perhaps Postgres should too?
You can use BETWEEN too, but its syntax is similarly garbled:
SQL> SELECT * FROM t
2 VERSIONS PERIOD FOR valid_at
3 BETWEEN DATE '2025-01-01' AND DATE '2035-01-01';
ID VALID_FRO VALID_TIL
2 01-JAN-10 01-JAN-30
1 01-JAN-00 01-JAN-30
3 01-JAN-20 01-JAN-30
6
Anonymous periods don’t seem to work though:
SQL> SELECT * FROM t
2 AS OF PERIOD FOR (valid_from, valid_til) DATE '2005-01-01';
AS OF PERIOD FOR (valid_from, valid_til) DATE '2005-01-01'
*
ERROR at line 2:
ORA-00904: : invalid identifier
You also can’t use standard period predicates:
SQL> SELECT * FROM t WHERE valid_at CONTAINS DATE '2015-01-01';
SELECT * FROM t WHERE valid_at CONTAINS DATE '2015-01-01'
*
ERROR at line 1:
ORA-00920: invalid relational operator
SQL> SELECT * FROM t WHERE valid_at OVERLAPS PERIOD('2015-01-01', '2020-01-01');
SELECT * FROM t WHERE valid_at OVERLAPS PERIOD('2015-01-01', '2020-01-01')
*
ERROR at line 1:
ORA-00920: invalid relational operator
Oracle doesn’t understand FOR PORTION OF:
SQL> UPDATE t FOR PORTION OF valid_at
2 FROM DATE '2005-01-01' TO DATE '2006-01-01'
3 SET id = 8 WHERE id = 1;
UPDATE t FOR PORTION OF valid_at
*
ERROR at line 1:
ORA-00905: missing keyword
SQL> DELETE FROM t FOR PORTION OF valid_at
2 FROM DATE '2005-01-01' TO DATE '2006-01-01'
3 WHERE id = 1;
DELETE FROM t FOR PORTION OF valid_at
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
In Oracle you can define triggers on tables with a valid-time period, but without temporal DML there are no interesting questions about how they should behave. Nonetheless here are the same triggers as above but in Oracle syntax (in case I ever want to test this in the future):
CREATE TABLE thist (
id INTEGER,
old_valid_from DATE,
old_valid_til DATE,
new_valid_from DATE,
new_valid_til DATE, op CHAR(1)
);
CREATE TRIGGER tins AFTER INSERT ON t
FOR EACH ROW
BEGIN
INSERT INTO thist VALUES
(:NEW.id, NULL, NULL, :NEW.valid_from, :NEW.valid_til, 'i');
END;
/
CREATE TRIGGER tupd AFTER UPDATE ON t
REFERENCING OLD AS OLD NEW AS NEW
FOR EACH ROW
BEGIN
INSERT INTO thist VALUES
(:NEW.id, :OLD.valid_from, :OLD.valid_til, :NEW.valid_from, :NEW.valid_til, 'u');
END;
/
CREATE TRIGGER tdel AFTER DELETE ON t
REFERENCING OLD AS OLD
FOR EACH ROW
BEGIN
INSERT INTO thist VALUES
(:OLD.id, :OLD.valid_from, :OLD.valid_til, NULL, NULL, 'd');
END;
/
I tested an evaluation copy of MS SQL Server 2017 (version 14.0.1000.169, RTM).
SQL Server doesn’t support application-time periods at all, just system-time.
The syntax for system-time tables is just a little non-standard:
CREATE TABLE dbo.t (
id INTEGER PRIMARY KEY,
valid_from datetime2 GENERATED ALWAYS AS ROW START,
valid_til datetime2 GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (valid_from, valid_til)
) WITH (
SYSTEM_VERSIONING = ON
);
Note the parens, the underscore, and the = ON.
The history is stored in a separate invisible table with a generated name. But you can query that table like any other, so if you want to give it a nicer name you can:
WITH (
SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.thist)
);
The valid_til sentinel will be 9999-12-31 23:59:59.9999999.
To ask about a certain time you can say any of these:
SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-01';
SELECT * FROM t FOR SYSTEM_TIME BETWEEN '2020-01-01' AND '2030-01-01';
SELECT * FROM t FOR SYSTEM_TIME FROM '2020-01-01' TO '2030-01-01';
but not BETWEEN SYMMETRIC.
So basically everyone has at least one kind of PERIOD.
Everyone but Oracle has system-time (and they have another older approach).
The only database with temporal primary keys is DB2. They claim to have temporal foreign keys too, but I couldn’t make it work.
I was pleased that two databases let you select with FOR and a valid-time period. No one lets you build anonymous periods (in FOR, FOR PORTION OF, or elsewhere), and no one supports period predicates.
With temporal DML, the extra inserts seem to be consistent (between MariaDB and DB2), and both databases fire triggers on them the same way.
I hope this helps the Postgres community work out their own temporal behavior with respect to the standard. I think it was an interesting study in its own right, too. One thing I learned is that “every other RDBMS supports SQL:2011” is only sort of true, at least as of today. :-)
Having been spoiled by the Rails Prime Stack for nearly a decade, in my React projects I prefer using Pug (née Jade) instead of JSX for its Haml-like syntax. A lot of people praise Pug and Haml for saving them typing, and while that’s nice, the real appeal to me is how easy they are to read. You spend a lot more time reading code than writing it, and Pug/Haml make the document structure immediately obvious. Making closing-tag errors obsolete is pretty nice, too.
With the babel-plugin-transform-react-pug package, you can replace your JSX with something like this:
class MyApp extends React.Component {
render() {
return pug`
Provider(store=configureStore(this.props))
table
tbody
tr
td table
td layout
td anyone?
`;
}
}
But Pug is definitely not as widely adopted within React as Haml is within Rails, and it shows. I ran into a tricky issue using Pug to render FieldArrays with Redux Form and React Bootstrap. To combine those two packages I’m basically following John Bennett’s advice, except with Pug.
Here is how typical scalar fields work:
const renderInput = ({input, label, type, meta}) => pug`
FormGroup(controlId=input.name validationState=${validState(meta)})
Col(componentClass=${ControlLabel} sm=2)
= label || humanizeString(input.name)
Col(sm=5)
FormControl(...input type=type)
`
class EditClientPage extends React.Component {
render() {
return pug`
Form(horizontal onSubmit=${this.props.handleSubmit})
fieldset
Field(name='name' component=renderInput type='text')
Field(name='paymentDue' component=renderInput type='number')
`
}
}
That’s from my time-tracking app where I wrote about using FieldArray with redux-form-validators. The Field component is from Redux Form, and everything else is from React-Bootstrap (Form, FormGroup, Col, ControlLabel, and FormControl). You can see that Field expects a custom component to draw its details, giving you total flexibility in how you structure your form’s DOM. Most of the Bootstrap components go inside the custom component used by Field.
So far that’s pretty nice, but if you have a FieldArray you need more nesting. A FieldArray is also part of Redux Form, and is used to draw a list of child records with their sub-fields. In my case I want the page to have one or more “work categories”, each with a name and hourly rate, e.g. “Design” and “Development”.
Like Field, a FieldArray delegates rendering to a custom component. Then that component will render the individual Fields (each with their own custom component in turn). If you adapted the Redux Form docs’ example, you might try something like this:
const renderSimpleInput = ({input, placeholder, type, meta}) => pug`
span(class=${validClass(meta)})
FormControl(...input placeholder=placeholder type=type)
`
const renderWorkCategories = ({fields, meta}) => pug`
.noop
= fields.map((wc, index) => pug`
FormGroup(key=${index})
Col(sm=2)
Field(name=${`${wc}.name`} component=renderSimpleInput type='text' placeholder='name')
Col(sm=5)
Field(name=${`${wc}.rate`} component=renderSimpleInput type='number' placeholder='rate')
Col(sm=2)
a.btn.btn-default(onClick=${()=>fields.remove(index)}) remove
`)
FormGroup
Col(smOffset=2 sm=5)
a.btn.btn-default(onClick=${()=>fields.push({})}) add rate
`
class EditClientPage extends React.Component {
render() {
return pug`
...
FieldArray(name='workCategories' component=${renderWorkCategories})
...
`
}
}
The problem is that you can’t nest pug strings like that. I’m not sure if the problem is with the Babel transformer or the pug parser itself, but you get an error. Of course that’s not idiomatic Pug anyway, but surprisingly, you can’t use Pug’s each command either:
const renderWorkCategories = ({fields, meta}) => pug`
.noop
each wc, index in fields
FormGroup(key=${index})
Col(sm=2)
Field(name=${`${wc}.name`} component=renderSimpleInput type='text' placeholder='name')
Col(sm=5)
Field(name=${`${wc}.rate`} component=renderSimpleInput type='number' placeholder='rate')
Col(sm=2)
a.btn.btn-default(onClick=${()=>fields.remove(index)}) remove
...
`
This gives you the error Expected "fields" to be an array because it was passed to each. Apparently Redux Form is not using a normal array here, but its own special object.
The trick is to call getAll, like this:
each wc, index in fields.getAll()
FormGroup(key=${index})
Col(sm=2)
Field(name=${`workCategories[${index}].name`} component=renderSimpleInput type='text' placeholder='name')
Col(sm=5)
Field(name=${`workCategories[${index}].rate`} component=renderSimpleInput type='number' placeholder='rate')
Col(sm=2)
a.btn.btn-default(onClick=${()=>fields.remove(index)}) remove
`
Note that we also had to stop using ${wc} and are building the field name “by hand”. Personally I think we can stop here and be done, but if that feels like breaking encapsulation to you, or if you want something more generic that doesn’t need to “know” its FieldArray name, there is another way to do it. Even if it’s a bit too much for a real project, it’s interesting enough that it’s maybe worth seeing.
To start, we need to call fields.map with another custom component. This almost works:
const renderWorkCategory = (wc, index) => pug`
FormGroup(key=${index})
Col(sm=2)
Field(name=${`${wc}.name`} component=renderSimpleInput type='text' placeholder='name')
Col(sm=5)
Field(name=${`${wc}.rate`} component=renderSimpleInput type='number' placeholder='rate')
Col(sm=2)
a.btn.btn-default(onClick=${()=>fields.remove(index)}) remove
`
const renderWorkCategories = ({fields, meta}) => pug`
.noop
= fields.map(renderWorkCategory)
...
The only problem is the remove button: fields is no longer in scope!
The solution is to use currying. The component we hand to fields.map will be a partially-applied function, generated by passing in fields early. ES6 syntax makes it really easy. The full code looks like this:
const renderWorkCategory = (fields) => (wc, index) => pug`
FormGroup(key=${index})
Col(sm=2)
Field(name=${`${wc}.name`} component=renderSimpleInput type='text' placeholder='name')
Col(sm=5)
Field(name=${`${wc}.rate`} component=renderSimpleInput type='number' placeholder='rate')
Col(sm=2)
a.btn.btn-default(onClick=${()=>fields.remove(index)}) remove
`
const renderWorkCategories = ({fields, meta}) => pug`
.noop
= fields.map(renderWorkCategory(fields))
...
`
You may recall we also used currying to combine redux-form-validators with FieldArray. It can really come in handy!
As I’ve said before, programming one thing is usually easy; it gets hard when we try to do many things at once. Here I show how to use Pug with a FieldArray from Redux Form, on a page styled with React Bootstrap. I hope you found it useful if like me you are trying to have your cake and eat it too. :-)
UPDATE: It turns out I was making things too complicated: fields.map takes an optional third parameter, fields. That means there is no need to curry renderWorkCategory and pass in fields early. Instead of this:
const renderWorkCategory = (fields) => (wc, index) => ...
you can just say this:
const renderWorkCategory = (wc, index, fields) => ...
I guess it pays to read the documentation! :-)