Temporal Ops

2024-07-17

One silver lining of temporal primary & foreign keys getting reverted is I got to meet Hettie Dombrovskaya and Boris Novikov.

I’ve been working with them to write SQL for various temporal operations not covered by the SQL:2011 standard. There is no support there for outer joins, semijoins, antijoins, aggregates, or set operations (UNION, INTERSECT, EXCEPT). As far as I know, no one has ever shown how to implement those operations in SQL. I have queries so far for outer join, semijoin, and antijoin, and I’m planning to include aggregates based on this article by Boris. The set operations look pretty easy to me, so hopefully I’ll have those soon too.
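
To give a taste, here is one way a temporal semijoin can be phrased with ranges. This is just a sketch against hypothetical tables a(id, valid_at) and b(id, valid_at); the queries in the repo may differ:

-- Temporal semijoin: keep each row of a, but only for the spans
-- where at least one b row with the same id overlaps it.
SELECT a.id,
       unnest(multirange(a.valid_at) * range_agg(b.valid_at)) AS valid_at
FROM a
JOIN b ON a.id = b.id AND a.valid_at && b.valid_at
GROUP BY a.id, a.valid_at;

Here range_agg unions the matching b spans into a multirange, intersecting with a.valid_at keeps just the covered portions, and unnest turns the result back into rows.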

If you’re interested, the repo is on GitHub.

Debugging the Sprinkler System

2024-07-14

Saturday I debugged the sprinklers.

I thought I had turned them on two weeks ago, and that next Monday morning at 5 a.m. I heard someone’s sprinklers outside my window, but after a week of 100-degree days it was clear ours weren’t doing their job. I had skipped my usual routine of checking each line, unearthing the sunken heads, and replacing what had failed. So now I had to deal with it.

Somehow after living here for ten years I still found two new heads I had never seen before. Here is a map I’ve kept for years, maybe since our first summer:

sprinkler map

It has every sprinkler head I’ve seen. Going by the rate I charge clients, that map is worth thousands of dollars.

In the bottom corner is the box where the water comes in from the street. There are more boxes where valves let water into each line.

One year I came across a buried water spigot in the middle of the grass. Then I lost it again.

But this was a valuable spigot. It was over by our raised beds, where there is no other convenient water. You have to drag a hose from across the yard to water there. In 2022 I borrowed a neighbor’s metal detector. I still couldn’t find it. Finally I tore up the grass with a shovel, probing what must have been a 20’ x 20’ area, until I heard a metal clink. I extended the pipe and topped it with a copper rabbit spigot I won as a kid at the Redlands Garden Show for a potted cactus garden. I’ve carried that rabbit with me for 35 years, waiting for a chance to use it.

rabbit spigot

That was two years ago. It’s on my map.

So why is our grass dying?

Naturally I run our sprinklers off a Raspberry Pi. I set it up back in 2016. The controller that came with the house was dying. Two-thirds of the time when I tried to water line 12 or 13, line 4 or 5 would turn on instead. (Yes, we have 13 sprinkler lines. It’s a big yard.) Almost always it was off by 8, or sometimes 4: pretty clearly some loose wires. Why spend fifty bucks to replace it when I could spend days building my own? Look, at least there is no Kubernetes or CI pipeline, okay?

There were raspi sprinkler products you could buy, and I think I saw an open source project, but that didn’t seem like fun. I wanted control and flexibility. I wanted power. I wanted Raspbian, Python, and cron.

Here is my script, called sprinkle:

#!/usr/bin/env python

# sprinkle - Raspberry Pi sprinkler controller

import time
import RPi.GPIO as GPIO
import sys, signal

# Your sprinkler lines:
# Your sprinkler line 1 goes in array position 0,
# then sprinkler line 2 goes in array position 1,
# etc.
# Each value is the Raspi GPIO pin
# you will connect to that line.
# So if you say
#   sprinkler_lines = [6, 19]
# then you should connect pin 6 to sprinkler line 1,
# and pin 19 to sprinkler line 2.
# sprinkler_lines = [21, 20, 16, 12, 25, 24, 23, 26, 19, 13, 6, 5, 22]
sprinkler_lines = [23, 24, 25, 16, 12, 20, 21, 22, 5, 6, 13, 19, 26]

def usage(err_code):
    print("USAGE: sprinkle <sprinkler_line> <number_of_minutes>")
    sys.exit(err_code)

def int_or_usage(s):
    try:
        return int(s)
    except ValueError:
        usage(1)

if len(sys.argv) != 3:
    usage(1)

sprinkler_line    = int_or_usage(sys.argv[1])
number_of_minutes = int_or_usage(sys.argv[2])

if sprinkler_line < 1 or sprinkler_line > len(sprinkler_lines):
    print("I only know about sprinkler lines 1 to %d." % len(sprinkler_lines))
    sys.exit(1)

if number_of_minutes < 1 or number_of_minutes > 30:
    print("I don't want to run the sprinklers for %d minutes." % number_of_minutes)
    sys.exit(1)


def exit_gracefully(signum, frame):
    GPIO.cleanup()
    sys.exit(0)
signal.signal(signal.SIGINT, exit_gracefully)


active_pin = sprinkler_lines[sprinkler_line - 1]
GPIO.setmode(GPIO.BCM)
for pin in sprinkler_lines:
    GPIO.setup(pin, GPIO.OUT)
    GPIO.output(pin, False)
GPIO.output(active_pin, True)
time.sleep(60 * number_of_minutes)
GPIO.output(active_pin, False)
exit_gracefully(None, None)

That is a lot of code, but all it does is turn on one GPIO pin, sleep a while, then turn it off. Near the top you can see an array that maps sprinkler lines to GPIO pins. I kept the old sprinkler numbering, so it matches the notes the old owners left us. Array position n holds sprinkler line n+1.

Then I have a higher-level script I run each morning out of cron, which does the front on even days and the back on odd. It logs when it starts and finishes, which has helped me a lot:

#!/usr/bin/env python

# do-yard - Run sprinklers for the whole yard.
# We do the front yard on even days and the back yard on odd days.

import time
from subprocess import call

t = time.localtime()
if t.tm_yday % 2:
    print("%s: Starting the back" % time.strftime("%Y-%m-%d %H:%M:%S", t))
    # odd days we do the back yard:
    for line in [4, 5, 6, 7, 8, 12]:
        call(["/home/pi/sprinkle", str(line), "5"])
        time.sleep(60)
    print("%s: Finished the back" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

else:
    print("%s: Starting the front" % time.strftime("%Y-%m-%d %H:%M:%S", t))
    # even days we do the front yard (and a little bit of the back):
    for line in [1, 2, 3, 9, 10, 11, 13]:
        call(["/home/pi/sprinkle", str(line), "5"])
        time.sleep(60)
    print("%s: Finished the front" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

The hard part was figuring out the wiring. I’ve never gone much further than Ohm’s Law. For a long time I was stuck working out how to drive the sprinkler valves. Sprinkler valves use a solenoid to open and shut. In my garage, 13 colored wires come out of the ground, along with one neutral white wire to complete the circuit. Then plugged into the wall is an adapter that produces 24-volt AC, and two wires come out of that. In between used to be the old controller. It would send 24 VAC down whichever wire matched the sprinkler line (& ~(1 << 3)).

The pi outputs 3.3 volts DC. At first I thought there was an integrated circuit that could convert the signal for me, but eventually I resigned myself to using a bank of relays:

raspi sprinklers

Oh also I never learned how to solder.

A relay is a mechanical system. The AC power goes through, but it’s blocked by an open switch. The DC power is on another circuit, and it activates an electromagnet that closes the switch. When you turn on the signal, you see a red light, and the switch closing makes a loud click.

A bank of 16 relays cost $12, almost as much as a sprinkler controller, so I really wanted my ICs to work out. Oh well.

So today I started by checking the log. Well, no: the pi wasn’t responding to ssh again.

It has always been temperamental. After a few hours the wifi dies, sometimes sooner. Pulling the plug for a moment fixes it, but then you have to wait while it boots. So I have to bring a laptop down to the garage, even just to check on things. Today I thought I would finally fix that.

Other people have the same problem. One reported culprit is power-saving mode. I checked and mine was running that way:

pi@raspberrypi:~ $ iw dev wlan0 get power_save
Power save: on

The nicest advice I found was to disable it at boot with systemd. Just run this:

sudo systemctl --full --force edit wifi_powersave@.service

and in your editor enter—ugh, nano? That had to be fixed.

Setting EDITOR in root’s ~/.profile should do it.

No? ~/.bashrc then?

Still no? Back to Stack Overflow…

No clues. I guess I’m on my own.

What is this .selected_editor file in root’s home directory? Hmm, it already says vim.

Is sudo even launching its command through a shell? Probably not, now that I think of it. If it just execs the command directly, no wonder ~/.profile does nothing.

More Stack Overflow. Most questions are about visudo, and I see something called sudoedit, and people are asking how to control which editor that launches. (Why not just run the editor you want? The man page says it lets you keep your own editor configuration. Like my own ~/.vimrc? That’s cool. Really? How does that work?) But in my case the editor is getting launched by systemd. Surely we would have all been happier if we’d just gone with runit?

Does root have $SYSTEMD_EDITOR set? Surely not—no, too bad.

Of course I could just edit the file myself, but it’s the principle of the thing.

Okay, I give up:

sudo visudo -f /etc/sudoers.d/20_editor

I typed this:

Defaults env_keep += "editor EDITOR"

So now when I run sudo, it will pass along my own $EDITOR choice.

Is this a security hole? I can imagine some possible issues on a server, but for the pi in my garage it seems okay.

Now systemd launches vim! Shamelessly I copy-pasted:

[Unit]
Description=Set WiFi power save %i
After=sys-subsystem-net-devices-wlan0.device

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/iw dev wlan0 set power_save %i

[Install]
WantedBy=sys-subsystem-net-devices-wlan0.device

I had never seen this %i thing before. It expands to the instance name, whatever you put after the @, so one unit file lets you do this:

sudo systemctl disable wifi_powersave@off.service
sudo systemctl enable wifi_powersave@on.service

or this:

sudo systemctl disable wifi_powersave@on.service
sudo systemctl enable wifi_powersave@off.service

That’s cool.

Oh, better not forget to run it now too:

sudo iw dev wlan0 set power_save off

So I turned off power saving. Maybe that will fix the wifi.

Let’s check the log file. Have the sprinklers been running?

2024-06-30 06:00:01: Starting the front
2024-06-30 06:42:03: Finished the front
2024-07-01 06:00:01: Starting the back
2024-07-01 06:36:03: Finished the back
2024-07-02 06:00:01: Starting the front
2024-07-02 06:42:02: Finished the front
2024-07-03 06:00:01: Starting the back
2024-07-03 06:36:02: Finished the back
2024-07-04 06:00:01: Starting the front
2024-07-04 06:42:03: Finished the front
2024-07-05 06:00:01: Starting the back
2024-07-05 06:36:03: Finished the back
2024-07-06 06:00:01: Starting the front
2024-07-06 06:42:03: Finished the front
2024-07-07 06:00:01: Starting the back
2024-07-07 06:36:02: Finished the back
2024-07-08 06:00:01: Starting the front
2024-07-08 06:42:03: Finished the front
2024-07-09 06:00:01: Starting the back
2024-07-09 06:36:03: Finished the back
2024-07-10 06:00:01: Starting the front
2024-07-10 06:42:02: Finished the front
2024-07-11 06:00:02: Starting the back
2024-07-11 06:36:03: Finished the back
2024-07-12 06:00:01: Starting the front
2024-07-12 06:42:03: Finished the front
2024-07-13 06:00:01: Starting the back
2024-07-13 06:36:03: Finished the back

They’ve been running all along! 42 minutes for the front, 36 for the back.

But clearly they’re doing nothing. The pi is turning on a pin then just sitting there.

So there must be a loose connection.

I tried line 3: ./sprinkle 3 10. No red light, no click. Line 10. No red light, no click. Line 2. No red light, no click.

I went upstairs to fetch my multimeter. Time to test continuity and voltage.

multimeter

How in the world did I wire this thing anyway?

Then I noticed a couple of red wire loops that had connected GPIO pins to the breadboard power rail but were now detached from it. The pins both said 5V. (That tiny text was easier to read in 2016.) So those came loose? What if I put them back in again? I think I remember . . . wasn’t this supposed to power the relay?

Trying my sprinkle command again made the light come on! I must have missed the click though. Were the sprinklers running? No? What if I try a few lines? I’m really not hearing the click. But the light is on.

relay with red light

How does each relay work again? I set the multimeter to continuity and probed each pair of posts. They were more connected than I expected. Was that bad? Okay, I remember: the white neutral wire runs from one relay to another in series. And the colored wires go out and into the ground, one per relay.

I remember something about those two little red wire loops. They really looked disconnected on purpose. They weren’t just loose; they were completely out of the breadboard.

Is anything else loose? A bit, but when I fix it nothing changes.

I remember those two red wires. They were supposed to give 5 volts to power the relay, but it never worked, did it? It was supposed to, but it didn’t. Like the pi just didn’t have enough oomph. Or was the board supposed to power the pi?

What are these other two thin black wires leaving the relay board? Where do they go? Off to the right, oh, to a power adapter! Two weeks ago I plugged in the adapter for the pi, and I plugged in the 24 VAC adapter, but the relays need power too, and they get it from the power strip over by the garage freezer.

I guess this is why phone support asks if you’ve plugged it in.

sprinklers running

Temporal Reverted

2024-07-05

My work adding temporal primary keys and foreign keys to Postgres was reverted from v17. The problem is empty ranges (and multiranges). An empty range doesn’t overlap anything, including another empty range. So 'empty' && 'empty' is false. But temporal PKs are essentially an exclusion constraint using (id WITH =, valid_at WITH &&). Therefore you can insert duplicates, as long as the range is empty:

INSERT INTO t (id, valid_at, name) VALUES (5, 'empty', 'foo');
INSERT INTO t (id, valid_at, name) VALUES (5, 'empty', 'bar');
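
For reference, here is a sketch of how a table like t might be declared with the new temporal primary key syntax (the column list is my guess from the example above):

CREATE TABLE t (
  id int,
  valid_at daterange,
  name text,
  -- A temporal PK: no two rows may share an id during overlapping valid_at spans.
  PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);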

That might be okay for some users, but it surely breaks expectations for others. And it’s a questionable enough thing to do that we should probably just forbid it. The SQL standard forbids empty PERIODs, so we should make sure that using plain ranges does the same. Adding a record with an empty application time doesn’t really have a meaning in the temporal model.

I think this is a pretty small bump in the road. At the Postgres developers conference we found a good solution to excluding empty ranges. My original attempt used CHECK constraints, but that had a lot of complications. Forbidding them in the executor is a lot simpler. I’ve already sent in a new set of patches for v18 that implement that change.

k3s Behind Nginx

2024-04-06

Here on illuminatedcomputing.com I’ve got a bunch of sites served by nginx, but I’d like to run a little k3s cluster as well. The main benefit would be isolation. That is always helpful, but it especially matters for staging sites for some customers who don’t update very often.

Instead of migrating everything all at once, I want to keep my host nginx but let it reverse proxy to k3s for sites running there. Then I will block direct traffic to k3s, so that there is only one way to get there. I realize this is not really a “correct” way to do k8s, but for a tiny setup like mine it makes sense. Maybe I should have just bought a separate box for k3s, but I find that pushing tools a bit like this is a good way to learn how they really work, and that’s what happened here.

It was harder than I thought. I found one or two people online seeking to do the same thing, but there were no good answers. I had to figure it out on my own, and now maybe this post will help someone else.

The first step was to run k3s on other ports. I’m using the ingress-nginx ingress controller via a Helm chart. In my values.yaml I have it bind to 8080 and 8443 instead:

ingress-nginx:
  controller:
    enableHttp: true
    enableHttps: true
    service:
      ports:
        http: 8080
        https: 8443

Then I can see the Service is using those ports:

paul@tal:~/src/illuminatedcomputing/k8s$ k get services -A
NAMESPACE         NAME                                         TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                         AGE
ingress           ingress-ingress-nginx-controller             LoadBalancer   10.43.91.109    107.150.34.82   8080:31333/TCP,8443:30702/TCP   7d20h
...

Setting up nginx to reverse proxy was also no problem. For example here is a private docker registry I’m running:

server {
  listen 443 ssl;
  server_name docker.illuminatedcomputing.com;

  ssl_certificate ssl/docker.illuminatedcomputing.com.crt;
  ssl_certificate_key ssl/docker.illuminatedcomputing.com.key;

  location / {
    proxy_pass https://127.0.0.1:8443;
    proxy_set_header Host "docker.illuminatedcomputing.com";
  }
}

server {
  listen 80;
  server_name docker.illuminatedcomputing.com;

  location / {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host "docker.illuminatedcomputing.com";
  }
}

The only tricky part is the ssl cert. I had already built the cluster to get certs from Let’s Encrypt with cert-manager. So I have a little cron script that pulls out the k8s Secret and puts it where the host nginx can find it:

#!/bin/bash

exec > >(tee /var/log/update-k3s-ssl-certs.log) 2>&1

echo "$(date -Iseconds) starting"

set -eu

# Everything running in k8s needs to be proxied by nginx,
# so pull the ssl certs and drop them where nginx can find them.
# Do this every day so that we pick up LetsEncrypt renewals.

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml

# docker.illuminatedcomputing.com
kubectl get secret -n docker-registry docker-registry-tls -o json | jq -r '.data["tls.crt"] | @base64d' > /etc/nginx/ssl/docker.illuminatedcomputing.com.crt
kubectl get secret -n docker-registry docker-registry-tls -o json | jq -r '.data["tls.key"] | @base64d' > /etc/nginx/ssl/docker.illuminatedcomputing.com.key

# need to reload nginx to see new certs
systemctl reload nginx

echo "$(date -Iseconds) finished"

Probably it would be easier to run certbot on the host and push the cert into k8s (or just terminate TLS), but using cert-manager is what I’d do for a customer, and I’m hopeful that eventually I’ll drop the reverse proxy altogether.

So at this point connecting works:

curl -v https://docker.illuminatedcomputing.com/v2/_catalog

(Of course it will be a 401 without the credentials, but you are still getting through to the service.)

The problem is that this works too:

curl -v https://docker.illuminatedcomputing.com:8443/v2/_catalog

So how can I block that port from everything but the host nginx? I tried making the controller bind to just 127.0.0.1, e.g. with this config:

ingress-nginx:
  controller:
    config:
      bind-address: "127.0.0.1"
    enableHttp: true
    enableHttps: true
    service:
      externalIPs:
        - "127.0.0.1"
      ports:
        http: 8080
        https: 8443

The bind-address line adds to a ConfigMap used to generate the nginx.conf. It doesn’t work, though: the 127.0.0.1 is from the perspective of the controller pod, not the host.

Using externalIPs (with or without bind-address) also fails. When I add those two lines k3s gives this error:

Error: UPGRADE FAILED: cannot patch "ingress-ingress-nginx-controller" with kind Service: Service "ingress-ingress-nginx-controller" is invalid: spec.externalIPs[0]: Invalid value: "127.0.0.1": may not be in the loopback range (127.0.0.0/8, ::1/128)

So I gave up on that approach.

But what about using iptables to block 8443 and 8080 from the outside? That’s probably simpler anyway—although k3s adds a big pile of its own iptables rules, and diving into that was a bit intimidating.

The first thing I tried was putting a rule at the top of the INPUT chain. I tried all these:

iptables -I INPUT -p tcp \! -s 127.0.0.1 --dport 8443 -j DROP
iptables -I INPUT -p tcp \! -i lo --dport 8443 -j DROP
iptables -I INPUT -p tcp -i enp2s0 --dport 8443 -j DROP

But none of those worked. I could still get through.

At this point a friend asked ChatGPT for advice, but it wasn’t very helpful. It told me:

Instead of having the ingress controller listen on an external IP or trying to make it listen only on 127.0.0.1, configure your host’s nginx to proxy_pass to your k3s services.

Yes, I had explained I was doing that. Also:

You could create a network policy that only allows traffic to the ingress-nginx pods from within the cluster itself.

But that will block the reverse proxy too.

So the cyber Pythia was not coming through for me. I was going to have to figure it out on my own. That meant coming to grips with all the rules k3s was installing.

I started with adding some logging, for example:

iptables -I INPUT -p tcp -d 107.150.34.82 -j LOG --log-prefix '[PJPJPJ] '

Tailing /var/log/syslog, I could see messages for 443 requests, but nothing for 8443!

So I took a closer look at the nat table (which is processed before the filter table), and I found some relevant rules:

-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES

-A KUBE-EXT-2ZARXDYICCJUF4UZ -m comment --comment "masquerade traffic for ingress/ingress-ingress-nginx-controller:https external destinations" -j KUBE-MARK-MASQ
-A KUBE-EXT-2ZARXDYICCJUF4UZ -j KUBE-SVC-2ZARXDYICCJUF4UZ
-A KUBE-EXT-DBDMS67BVV2C2LTP -m comment --comment "masquerade traffic for ingress/ingress-ingress-nginx-controller:http external destinations" -j KUBE-MARK-MASQ
-A KUBE-EXT-DBDMS67BVV2C2LTP -j KUBE-SVC-DBDMS67BVV2C2LTP

-A KUBE-SEP-RQCBIXXO7M53R2WC -s 10.42.0.42/32 -m comment --comment "ingress/ingress-ingress-nginx-controller:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-RQCBIXXO7M53R2WC -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:https" -m tcp -j DNAT --to-destination 10.42.0.42:443
-A KUBE-SEP-TXLMBMTNQTOOKDI3 -s 10.42.0.42/32 -m comment --comment "ingress/ingress-ingress-nginx-controller:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-TXLMBMTNQTOOKDI3 -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:http" -m tcp -j DNAT --to-destination 10.42.0.42:80

-A KUBE-SERVICES -d 107.150.34.82/32 -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:https loadbalancer IP" -m tcp --dport 8443 -j KUBE-EXT-2ZARXDYICCJUF4UZ
-A KUBE-SERVICES -d 107.150.34.82/32 -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:http loadbalancer IP" -m tcp --dport 8080 -j KUBE-EXT-DBDMS67BVV2C2LTP

-A KUBE-SVC-2ZARXDYICCJUF4UZ ! -s 10.42.0.0/16 -d 10.43.91.109/32 -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:https cluster IP" -m tcp --dport 8443 -j KUBE-MARK-MASQ
-A KUBE-SVC-2ZARXDYICCJUF4UZ -m comment --comment "ingress/ingress-ingress-nginx-controller:https -> 10.42.0.42:443" -j KUBE-SEP-RQCBIXXO7M53R2WC
-A KUBE-SVC-DBDMS67BVV2C2LTP ! -s 10.42.0.0/16 -d 10.43.91.109/32 -p tcp -m comment --comment "ingress/ingress-ingress-nginx-controller:http cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SVC-DBDMS67BVV2C2LTP -m comment --comment "ingress/ingress-ingress-nginx-controller:http -> 10.42.0.42:80" -j KUBE-SEP-TXLMBMTNQTOOKDI3

If you follow how that bounces around, it eventually gets rerouted to 10.42.0.42, either :443 or :80. So that’s why a connection to 8443 never hits the INPUT chain.

So the solution was to drop the traffic in the nat table instead:

root@www:~# iptables -I PREROUTING -t nat -p tcp -i enp2s0 --dport 8443 -j DROP
iptables v1.8.4 (legacy):
The "nat" table is not intended for filtering, the use of DROP is therefore inhibited.

Oops, just kidding!

But instead I can just tell 8080 & 8443 to skip all the k3s rewriting:

iptables -I PREROUTING -t nat -p tcp -i enp2s0 --dport 8443 -j RETURN
iptables -I PREROUTING -t nat -p tcp -i enp2s0 --dport 8080 -j RETURN

Now those do show up on the INPUT chain, but I don’t even need to DROP them there. There is nothing actually listening on those ports. The controller is still binding to 443 and 80, and k3s is using iptables trickery to reroute connections to those ports. So those two lines above are sufficient, and someone connecting directly gets a Connection refused.

To make this run each time the machine boots, I wrote a script at /usr/local/bin/iptables-custom.sh:

#!/bin/bash

# Installs some rules to prevent 8443 and 8080 from getting routed to k8s from the outside world,
# so that you must access them via our nginx reverse proxy.

(iptables -L -n -t nat | grep '^RETURN.*8443$' >/dev/null) || iptables -t nat -I PREROUTING -p tcp -i enp2s0 --dport 8443 -j RETURN
(iptables -L -n -t nat | grep '^RETURN.*8080$' >/dev/null) || iptables -t nat -I PREROUTING -p tcp -i enp2s0 --dport 8080 -j RETURN

Then I put this unit file at /etc/systemd/system/iptables-custom.service:

[Unit]
Description=adds custom iptables rules after k3s has started
After=k3s.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/iptables-custom.sh

[Install]
WantedBy=default.target

Then I ran systemctl daemon-reload and systemctl enable iptables-custom.

That’s it! I hope this is helpful, or that you at least enjoyed the story.

Cozy Toes

2024-03-03

The kids and I love to play games on the weekend. My favorite is Agricola, a board game about farming in the Middle Ages. We also play a lot of matching games. Saturday morning Elsa invented her own matching game using the cards from Abandon All Artichokes. Then I made up a matching game too. I called it “Cozy Toes”.

cozy toes

It was a money game. Whoever got the most pairs of matching socks won, and everyone else owed that person five cents for each pair they were short.

I explained how you had to compare each sock to every other sock, which meant the work to find one sock’s match grew with the number of socks still in the pile. The work for all the socks—say there were n of them—was n times n. So if you let the pile get too big, you have a lot of work to do.

I suppose with several workers—say m of them—the work was less, but I don’t know for sure what it was. In practice it is a hard job for many workers to share. In our own game there was a lot of contention.

Later when they thought they had matched all the socks they could, and only odd singles remained, I asked how they could be sure. They decided to sort the socks from longest to shortest. Then they could see that there were no more matches. But they still compared every sock to every other.

unmatched socks

If we keep playing this game maybe I will teach them how to bubble sort.

Temporal PKs Merged!

2024-01-24

UPDATE: My temporal patches were reverted from v17. Hopefully they will be accepted for v18 instead.

Today first thing in the morning I saw that the first part of my temporal tables work for Postgres got merged. It was two patches actually: a little one to add a new GiST support function and then the main patch adding support for temporal primary keys and unique constraints based on range types. The support for SQL:2011 PERIODs comes later; for now you must use ranges—although in my opinion that is better anyway. Also this patch allows multiranges or, in keeping with Postgres’s long history of extensibility, any type with an overlaps operator. So unless some big problem appears, PKs and UNIQUE constraints are on track to be released in Postgres 17.
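
For example, a temporal unique constraint on a multirange column looks like this (a sketch; the table and column names are invented):

CREATE TABLE reservations (
  room int,
  during datemultirange,
  -- No two rows may claim the same room during overlapping times.
  UNIQUE (room, during WITHOUT OVERLAPS)
);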

Probably I can get (basic) foreign keys into v17 too. Temporal update/delete, foreign keys with CASCADE, and PERIODs will more likely take ’til v18.

If you are interested in temporal features, early testing is always appreciated! :-)

Getting this into Postgres has been a ten-year journey, and the rest of this post is going to be a self-indulgent history of that work. You’ve been warned. :-)

It started in 2013 when I kept noticing my clients needed a better way to track the history of things that change over time, and I discovered Richard Snodgrass’s book Developing Time-Oriented Database Applications in SQL. He offered a rigorous, systematic approach, with working SQL solutions for everything. This was exactly what I needed. His approach was vastly better than the ad hoc history-tracking I’d seen so far. But no one had implemented any of it!

My first Postgres patch in 2015 was motivated by temporal databases: I added UUID support to the btree_gist extension. A temporal primary key is basically an exclusion constraint on (id WITH =, valid_at WITH &&), and I had a project with UUID ids. But that exclusion constraint requires a GiST index that knows how to perform equality comparisons against the id column and overlaps comparisons against the valid_at column. Out-of-the-box GiST indexes can’t do that (unless your ids are something weird like range types). If your ids are integers, you can install btree_gist to create a GiST opclass that knows what integer = means, but at the time UUIDs were not supported. So I started there. I liked that temporal databases had a manageable feature set and a manageable body of literature, so that even a working programmer like me could break new ground (not like Machine Learning or even Time Series databases). Nonetheless that patch took a year and a half to get committed, and it was really other people like Chris Bandy who finished it.
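
Concretely, that workaround looks something like this (a sketch with made-up names; it needs btree_gist so GiST understands uuid equality):

CREATE EXTENSION btree_gist;

CREATE TABLE shifts (
  id uuid,
  valid_at tsrange,
  -- Forbid two rows with the same id and overlapping spans:
  EXCLUDE USING gist (id WITH =, valid_at WITH &&)
);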

I kept reading about temporal databases, and in 2017 I wrote a proof-of-concept for temporal foreign keys, mostly at AWS re:Invent. I happened to be given a free registration & hotel room, but it was too late to register for any of the good talks. But all that time with nothing to do was fantastically productive, and I remember that by the flight home I was adding tons of tests, trying to cover every feature permutation—ha, as if. A few days after I returned I also published my annotated bibliography, which I’ve updated many times since.

In Snodgrass a temporal foreign key is a page-and-a-half of SQL, mostly because a referencing row may need more than one referenced row to completely cover its time span. But I realized we could make the check much simpler if we used an aggregate function to combine all the relevant rows in the referenced table first. So I wrote range_agg, first as an extension, then as a core patch. Jeff Davis (who laid the foundation for temporal support with range types and exclusion constraints) said my function was too narrow and pushed me to implement multiranges, a huge improvement. Again it took a year and a half, and I had trouble making consistent progress. There was a lot of work at the end by Alvaro Herrera and Alexander Korotkov (and I’m sure others) to get it committed. That was a few days before Christmas 2020.
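
The heart of that simpler check looks something like this (a hypothetical referenced table and span; the real check is fussier about bounds):

-- Is every day of 2020 covered by some referenced row for id 5?
SELECT range_agg(valid_at) @> daterange('2020-01-01', '2021-01-01') AS covered
FROM parent
WHERE id = 5;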

Although the Postgres review process can take a long time, I cherish how it pushes me to do better. As a consultant/freelancer I encounter codebases of, hmm, varying quality, and Postgres gives me an example of what high standards look like.

One thing I still remember from reading Programmers at Work many years ago was how many interviewees said they tried to build things at a higher level of abstraction than they thought they’d need. I’ve seen enough over-engineered tangles and inner-platform effects that my own bias is much closer to YAGNI and keeping things concrete, but the advice in those interviews still prods me to discover good abstractions. The Postgres codebase is full of things like that, and really it’s such a huge project that strong organizing ideas are essential. Multiranges was a great example of how to take a concrete need and convert it into something more general-purpose. And I thought I was doing that already with range_agg! I think one thing that makes an abstraction good is a kind of definiteness, something opinionated. So it is not purely general, but really adds something new. It always requires an act of creation.

The coolest thing I’ve heard of someone doing with multiranges was using them in astronomy to search for neutrinos, gravitational waves, and gamma-ray bursts. By using multiranges, they were able to compare observations with maps of the night sky “orders of magnitude faster” than with other implementations. (Hopefully I’ve got that right: I read a pre-print of the paper, but not all of it was easy for me to understand!)

My first patch for an actual temporal feature was primary keys back in 2018. Then foreign keys followed in 2019, just a couple weeks before I gave a talk at PgCon about temporal databases. By the end of the year I had FOR PORTION OF as well. At first FOR PORTION OF was implemented in the Executor Phase, but when I gave a progress report for PgCon 2020 I was already working on a trigger-based reimplementation, though it wasn’t submitted until June 2021. I also pulled in work by Vik Fearing from 2018 to support ADD/DROP PERIOD.

Soon after that, progress got harder: my wife and I had our sixth baby in August, and somehow he seemed to be more work than the others. I took over daily math lessons (we homeschool), and I had to let go of my biggest client, who needed more hours than I could give. (I’m proud to have given them an orderly transition over several months though.) In January 2022 Peter Eisentraut gave me a thorough review, but I went silent. Still, I had a lot of encouragement from the community, especially Corey Huinker, and eventually doing Postgres got easier again. I had a talk accepted for PgCon 2023, and I worked hard to submit new patches, which I did only weeks before the conference.

The best part of PgCon was getting everyone who cared about my work together in the hallway to agree on the overall approach. I had worried for years about using ranges as well as PERIODs, since the standard doesn’t know anything about ranges. The second-best part was when someone told me I should stop calling myself a Postgres newbie.

At PgCon Peter asked me to re-organize the patches, essentially implementing PERIODs as GENERATED range columns. It made the code much nicer. I also went back to an Executor Phase approach for FOR PORTION OF. Using triggers had some problems around updateable views and READ COMMITTED transaction isolation.

Since May I’ve felt more consistent than during my other Postgres work. I’ve been kept busy by excellent feedback from a meticulous reviewer, Jian He, who has caught many bugs. Often as soon as I get caught up, before I’ve written the email with the new patch files, he finds more things!

Another thing that’s helped is going out once a week (for nearly a year now) to get an early dinner and then work on Postgres at a local bar. Somehow it’s much easier to do Postgres from somewhere besides my home office, where I do all my normal work. Getting dinner lets me read something related (lately Designing Data-Intensive Applications by Martin Kleppmann and PostgreSQL 14 Internals by Egor Rogov), and it’s fun. Doing just a little every week helps me keep momentum, so that fitting in further progress here and there seems easy. I’m lucky to have a wife who has supported it so often, even though it leaves her with the kids and dishes.

I think I have years more work of temporal features to add, first finishing SQL:2011 then going beyond (e.g. temporal outer joins, temporal aggregates, temporal upsert). It’s been a great pleasure!
