We have been rolling out CENTOS8 in our lower environments for testing, we use a dedicated vmware virtual server with centos8 minimal install, we only apply hardening techniques to these systems other than the main application, which is pg11 here. These systems use a LVM mounted ext4 filesystem for the data directory.
1 |
/dev/mapper/vg01-data1 /u02/data1 ext4 defaults, nofail 0 2 |
Recently on 3 of the new PG VMS after reboot we noticed that PG did not start, this also seemed intermittent, even though we have enabled the systemd service to start on reboots. So I checked the pg startup log and did not find too much about the issue. So I checked /var/log/messages and found the issue.
1 |
postgresql-11-check-db-dir[1038]: "/u02/data1/pg/data11" is missing or empty. |
I checked the systemd service file and saw that out of the box postgres had the following:
1 2 3 4 5 6 7 8 |
[Unit] Description=PostgreSQL 11 database server Documentation=https://www.postgresql.org/docs/11/static/ After=syslog.target After=network.target [Install] WantedBy=multi-user.target |
After=Syslog.target This is a special target unit in systemd and is the standardized name to pull in a syslog implementation.
After=network.target has very little meaning during start-up. It only indicates that the network management stack is up after it has been reached. Whether any network interfaces are already configured when it is reached is undefined.
WantedBy=multi-user.target normally defines a system state where all network services are started up and the system will accept logins, but a local GUI is not started. This is the typical default system state for server systems, which might be rack-mounted headless systems in a remote server room.
Those options above will not ensure that all filesystems in fstab are mounted before postgres starts. So what we were seeing was a classic race condition where postgres started before the data directory was mounted. As I previously mentioned we use a custom PGDATA location. So after some research I found my option that fixed this. You will need to edit the pg11 service and add the following, then reload systemd and reboot and all should work. You can find your LVM mount by running the following:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
[root@server ~]# systemctl list-units --type=mount UNIT LOAD ACTIVE SUB DESCRIPTION -.mount loaded active mounted Root Mount boot-efi.mount loaded active mounted /boot/efi boot.mount loaded active mounted /boot dev-hugepages.mount loaded active mounted Huge Pages File System dev-mqueue.mount loaded active mounted POSIX Message Queue File System run-user-1328029883.mount loaded active mounted /run/user/1328029883 sys-fs-fuse-connections.mount loaded active mounted FUSE Control File System sys-kernel-config.mount loaded active mounted Kernel Configuration File System sys-kernel-debug.mount loaded active mounted Kernel Debug File System u02-data1.mount loaded active mounted /u02/data1 LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. 10 loaded units listed. Pass --all to see loaded but inactive units, too. To show all installed unit files use 'systemctl list-unit-files'. |
You can see my u02-data1.mount in the output, so edit and add the override file with the following, if you have multiple mounts, you can add them as well.
Edit with: systemctl edit postgresql-11
1 2 3 4 5 |
[Unit] After=local-fs.target u02-data1.mount [Service] Environment=PGDATA=/u02/data1/pg/data11 |
Reload the daemon with: systemctl daemon-reload
After=local-fs.target systemd-fstab-generator(3) automatically adds dependencies of type Before= to all mount units that refer to local mount points for this target unit. In addition, it adds dependencies of type Wants= to this target unit for those mounts listed in /etc/fstab that have the auto mount option set.