OpenVPN systemd CapabilityBoundingSet breaking notifications with exim4


At work we run an OpenVPN server for working remotely and reaching firewall-restricted locations. At some point a colleague of mine started getting disconnected from the server. We tracked the issue down to the protection against SWEET32 attacks introduced in OpenVPN client version 2.4. We therefore decided to upgrade our OpenVPN server as well, pulling version 2.4 from jessie-backports.

When a client successfully connects to the VPN server, a script is executed that emails the VPN session details, such as the remote IP address used, to the LDAP user’s email address:

client-connect  /etc/openvpn/server.client-connect

After the upgrade the notifications stopped working and I started investigating what was going wrong.

The following lines came up in the system’s journal whenever a client connected:

Dec 19 11:32:38 vpn exim[19804]: 2017-10-30 11:32:38 setrlimit(RLIMIT_NPROC) failed: Operation not permitted
Dec 19 11:32:38 vpn exim[19804]: 2017-10-30 11:32:38 Cannot open main log file "/var/log/exim4/mainlog": Permission denied: euid=0 egid=0

So what does exim have to do with openvpn, and what’s ‘setrlimit’ anyway?

The executable script contains a line like this:

printf "some message" | mail -s "Subject" "recipient@noc.grnet.gr"

What is this mail command doing? It’s a symbolic link to heirloom-mailx:

➜ vpn.server /etc/openvpn  # ls -l /usr/bin/mail
lrwxrwxrwx 1 root root 22 Apr 12  2011 /usr/bin/mail -> /etc/alternatives/mail
➜ vpn.server /etc/openvpn  # ls -l /etc/alternatives/mail
lrwxrwxrwx 1 root root 23 Dec 20 12:19 /etc/alternatives/mail -> /usr/bin/heirloom-mailx

The heirloom-mailx manual says: “Normally, mailx invokes sendmail(8) directly to transfer messages.” And on our openvpn server, sendmail points to exim4:

➜ vpn.server /etc/openvpn  # ls -l /usr/sbin/sendmail
lrwxrwxrwx 1 root root 5 Jun 14  2017 /usr/sbin/sendmail -> exim4

meaning openvpn will spawn an exim4 child process.
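The manual ls -l steps above can be condensed into a small helper that prints every hop of a symlink chain. This is only a sketch: link_chain is a hypothetical function name, and the paths in the example calls are the ones from this post.

```shell
#!/bin/sh
# Print each hop of a symlink chain, one "link -> target" pair per line.
# link_chain is a hypothetical helper name, not an existing tool.
link_chain() {
    p=$1
    hops=0
    # Stop after 40 hops so a circular chain cannot loop forever.
    while [ -L "$p" ] && [ "$hops" -lt 40 ]; do
        target=$(readlink "$p")
        case $target in
            /*) ;;                                  # absolute target, use as is
            *)  target=$(dirname "$p")/$target ;;   # relative to the link's directory
        esac
        printf '%s -> %s\n' "$p" "$target"
        p=$target
        hops=$((hops + 1))
    done
}

# Paths taken from the post; they print nothing on a machine where
# these files are not symlinks.
link_chain /usr/bin/mail
link_chain /usr/sbin/sendmail
```

When only the final target matters, readlink -f on the same path gives it in one step.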

Let’s get to ‘setrlimit’. ‘setrlimit’ is a system call made by the exim4 child process: http://man7.org/linux/man-pages/man2/setrlimit.2.html .

Are all system calls always available to userspace processes? No. In fact, the privileged operations a process may perform through system calls can be restricted using Linux capabilities: https://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.2/capfaq-0.2.txt

Which capability does setrlimit need? According to the man page:

The RLIMIT_NPROC limit is not enforced for processes that have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.

We may assume that the child lacks the capabilities it needs to change its resource limits.

Who sets the capabilities of a process? Capabilities may either be set statically on the binary file (not our case) or by systemd, which originally spawned the openvpn process.
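One way to see which capabilities a process actually ended up with is to read the Cap* fields in /proc/<pid>/status and test individual bits of the hex mask. The sketch below assumes the bit numbers from <linux/capability.h> (CAP_SYS_ADMIN is 21, CAP_SYS_RESOURCE is 24); cap_in_mask is a hypothetical helper name, and the example inspects the current shell rather than the openvpn process.

```shell
#!/bin/sh
# Report whether a capability bit is set in a hex mask such as the
# CapBnd field of /proc/<pid>/status (mask given without a 0x prefix).
# cap_in_mask is a hypothetical helper name.
cap_in_mask() {
    mask=$1
    bit=$2
    if [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Bit numbers as defined in <linux/capability.h>.
CAP_SYS_ADMIN=21
CAP_SYS_RESOURCE=24

# Example: check the bounding set of the current shell. To inspect
# openvpn instead, substitute its PID for $$.
status=/proc/$$/status
if [ -r "$status" ]; then
    capbnd=$(awk '/^CapBnd:/ { print $2 }' "$status")
    echo "CAP_SYS_ADMIN in bounding set: $(cap_in_mask "$capbnd" "$CAP_SYS_ADMIN")"
    echo "CAP_SYS_RESOURCE in bounding set: $(cap_in_mask "$capbnd" "$CAP_SYS_RESOURCE")"
fi
```

Child processes inherit the bounding set, so the same check applies to anything a service spawns.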

Taking a look at the openvpn systemd unit we indeed observe:

CapabilityBoundingSet=CAP_IPC_LOCK CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_CHROOT CAP_DAC_READ_SEARCH CAP_AUDIT_WRITE
"CapabilityBoundingSet= , Controls which capabilities to include in the capability bounding set for the executed process. See capabilities(7) for details."

This is in fact among the changes introduced in version 2.4:

https://sources.debian.org/src/openvpn/2.3.4-5+deb8u2/debian/openvpn%40.service/
https://sources.debian.org/src/openvpn/2.4.0-6+deb9u1~bpo8+1/debian/openvpn%40.service/

I went ahead and added “CAP_SYS_RESOURCE” capability in “CapabilityBoundingSet”, issued ‘systemctl daemon-reload’ and restarted the openvpn service.
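As an alternative to editing the packaged unit file, the same change could probably be made with a systemd drop-in. This is a sketch under assumptions: the file name capabilities.conf and the instance name in the comments are made up, and it relies on systemd.exec(5) stating that repeated CapabilityBoundingSet= lines are merged, which is worth verifying on the systemd version in use.

```shell
#!/bin/sh
# Write a drop-in that adds CAP_SYS_RESOURCE to a unit's bounding set.
# write_cap_dropin and capabilities.conf are hypothetical names.
write_cap_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir"
    # A second CapabilityBoundingSet= line is expected to be merged
    # with the one in the packaged unit (assumption, see systemd.exec(5)).
    printf '[Service]\nCapabilityBoundingSet=CAP_SYS_RESOURCE\n' \
        > "$dropin_dir/capabilities.conf"
}

# On the server, as root (the instance name "server" is an assumption):
#   write_cap_dropin /etc/systemd/system/openvpn@.service.d
#   systemctl daemon-reload
#   systemctl restart openvpn@server
```

A drop-in keeps the local change separate from the file shipped by the package, so it is not lost on the next upgrade.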

After this change, the ‘setrlimit operation not permitted’ message vanished on client connection and another one showed up in its place:

Dec 19 20:35:00 vpn exim[24958]: 2017-12-19 20:35:00 1eRMjD-0006UW-Ub Spool error for /var/spool/exim4/input//1eRMjD-0006UW-Ub-D: Permission denied
Dec 19 20:35:00 vpn exim[24958]: 2017-12-19 20:35:00 1eRMjD-0006UW-Ub Cannot open main log file "/var/log/exim4/mainlog": Permission denied: euid=0 egid=113
Dec 19 20:35:00 vpn exim[24958]: exim: could not open panic log - aborting: see message(s) above

Some progress made.

Checking the /var/spool/exim4 directory showed that the process had successfully written the needed files with the correct permissions. But for some reason these files were not processed, which produced the error message above.

➜ vpn.server spool/exim4/input  # ls -l
total 8
-rw-r----- 1 Debian-exim Debian-exim 121 Dec 20 13:55 1eRcxi-0003aS-Ag-D
-rw-r----- 1 Debian-exim Debian-exim 859 Dec 20 13:55 1eRcxi-0003aS-Ag-H
➜ vpn.server spool/exim4/input  # cat 1eRcxi-0003aS-Ag-D
1eRcxi-0003aS-Ag-D
VPN server: vpn.server:1194 (udp)
User: alexaf
From: 10.0.28.253:57085
Address: 10.0.9.123

As these files remained in spool, the running exim4 daemon would eventually process them and send the emails, but with some delay.

This behavior was quite similar to this ancient Debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=72741 , but in our case the exim4 binary had the root suid bit set and the openvpn service had the CAP_SETUID capability.

Still, it seemed that this was caused by capability restrictions too, since the problem went away when the restriction was disabled in the systemd unit.

I finally decided to simply change the mail command’s behavior within the script: submit the message over SMTP to the exim4 daemon listening on localhost instead of letting the child process deliver it.

printf "some message" | mail -s "Subject"  -S smtp=smtp://127.0.0.1:25 "recipient@noc.grnet.gr"
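Put together, the notification part of the client-connect script might look like the sketch below. It is not our actual script: build_message and notify are hypothetical names, the subject line is made up, and the body is built from environment variables that OpenVPN documents for client-connect scripts (common_name, trusted_ip, trusted_port, ifconfig_pool_remote_ip).

```shell
#!/bin/sh
# Sketch of the notification part of a client-connect script.
# build_message and notify are hypothetical names.

# Build the mail body from variables OpenVPN exports to the script.
build_message() {
    printf 'User: %s\n' "${common_name:-unknown}"
    printf 'From: %s:%s\n' "${trusted_ip:-unknown}" "${trusted_port:-unknown}"
    printf 'Address: %s\n' "${ifconfig_pool_remote_ip:-unknown}"
}

# Hand the message to the local exim4 daemon over SMTP, so no exim4
# child process runs under openvpn's restricted capabilities.
notify() {
    recipient=$1
    build_message | mail -s "VPN session" -S smtp=smtp://127.0.0.1:25 "$recipient"
}

# In the real script the last step would be something like:
#   notify "recipient@noc.grnet.gr"
```

Since OpenVPN treats a non-zero exit status from a client-connect script as a reason to reject the client, it may be worth making sure a failed mail submission does not become the script's exit status.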

This way there is no need to mess with the capabilities or diverge from the packaged openvpn systemd service unit.
