perl

Running a program with a specific environment, uid/gid and arguments: withidenvargs

In my Tomcat under daemontools hack, I alluded to withidenvargs, a small utility I wrote to run a program in a fully specified environment. It gives you more control than djb's envdir (no newline limitations or '\0' translation hacks in environment variables, plus the ability to specify uid/gid), though in principle I love djb's philosophy of using the filesystem as a parser. Unlike envdir, withidenvargs doesn't inherit the parent's exports.

withidenvargs takes 3 filenames as arguments:

  1. an ids file (newline-separated list of real uid, effective uid, real gid, and effective gid + sgid's; see below), as produced by perl -le '$, = "\n"; print $<, $>, 0+$(, $)'
  2. an environment file (null-separated list of assignments as produced by env -0)
  3. an arguments file (null-separated list of arguments starting with the program name, as in C's argv)

Here is the script:

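#!/usr/bin/env perl
use strict; use warnings;
use File::Slurp;

# usage: withidenvargs IDS_FILE ENV_FILE ARGS_FILE
@ARGV == 3 or die "usage: $0 ids-file env-file args-file\n";
my ($fid_name, $fenv_name, $farg_name) = @ARGV;

# ids file: uid, euid, gid, "egid [sgid ...]", one per line
my @ids = read_file ($fid_name);
@ids >= 4 or die "$fid_name: expected at least 4 lines\n";
chomp @ids;
my ($uid, $euid, $gid, $egid) = @ids;

$/ = chr (0);  # the remaining files are null-separated
%ENV = ();     # start from an empty environment
# "or", not "||": "||" would bind to the filename and never check open
open my $fenv, "<", $fenv_name or die "$fenv_name: $!";
while (<$fenv>) {
  chomp;  # strip the trailing "\0"
  my ($k, $v) = split /=/, $_, 2; $ENV {$k} = $v;
  chdir ($v) if $k eq "PWD";
}
close ($fenv);
open my $farg, "<", $farg_name or die "$farg_name: $!";
my @args = ();
while (<$farg>) { chomp; push @args, $_ }
close ($farg);
@args or die "$farg_name: no program to run\n";
# handle empty sgid list -- setgroups ([])
$egid = "$egid $egid" unless $egid =~ / /;
if ($uid =~ /[^0-9]/) {  # translate to numeric id's if necessary
  # scalar context: in list context getpwnam returns the whole passwd entry
  ($uid, $euid) = (scalar getpwnam ($uid), scalar getpwnam ($euid));
  $gid = getgrnam ($gid);
  $egid = join " ", map { scalar getgrnam ($_) } (split / /, $egid);
}
# gid's first, while we still have the privileges to change them
$( = "$gid"; $) = "$egid";
$< = $uid; $> = $euid;
#system ("id");  # check
# the block form never goes through the shell, even with a single argument
exec { $args[0] } @args or die "exec $args[0]: $!";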
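
To make the three file formats concrete, here is a minimal sketch that builds them by hand in a scratch directory. The ids (1000), the variable GREETING and the /bin/echo command are made-up values for illustration, not anything the original setup uses.

```shell
#!/bin/sh
# Hypothetical example values; adjust ids and paths for a real service.
dir=$(mktemp -d)

# ids file: real uid, effective uid, real gid, then "egid sgid..." on one line
# (the egid is repeated to request an effectively empty supplementary list)
printf '%s\n' 1000 1000 1000 '1000 1000' >"$dir/ids.txt"

# environment file: NUL-terminated assignments; embedded newlines are fine
printf '%s\0' 'PATH=/usr/bin:/bin' 'GREETING=hello
world' "PWD=$dir" >"$dir/env.txt"

# arguments file: NUL-terminated argv, program name first
printf '%s\0' /bin/echo 'two words' >"$dir/args.txt"

# count the records in each file
ids_lines=$(wc -l <"$dir/ids.txt")
env_records=$(tr -cd '\0' <"$dir/env.txt" | wc -c)
arg_records=$(tr -cd '\0' <"$dir/args.txt" | wc -c)
echo "ids: $ids_lines lines, env: $env_records records, args: $arg_records records"
```

With withidenvargs on the PATH, passing it these three files would run /bin/echo in that environment, provided the caller is allowed to switch to the listed ids.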

Note: It is important to change the real and effective gid's before the uid's, because otherwise the process may “lose root” and be unable to perform privileged operations (including the gid change itself).

Also interesting is Perl's way of setting the supplementary group IDs. A process can have, in addition to its real and effective gid, several supplementary gid's (sgid's); at system level they are controlled using the POSIX getgroups() and the non-POSIX setgroups(). A lesser-known form of $)-assignment accepts a space-separated list of gid's, the first of which sets the egid, while the rest make up the sgid list. To specify an empty supplementary list, however, we must repeat the egid (leaving out the second egid changes the meaning of the $)-assignment to egid-only, which would leave the inherited sgid's unchanged, possibly leaking root privileges to the child process).
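
The repeat-the-egid rule is easy to forget when writing an ids file by hand. As a small sketch, the same padding that the script applies can be expressed as a shell helper; the function name pad_egid and the gid values are made up for this example.

```shell
#!/bin/sh
# pad_egid: hypothetical helper mirroring the script's padding rule.
# A lone gid becomes "gid gid" (the egid plus an effectively empty sgid list);
# a list that already names supplementary groups is passed through unchanged.
pad_egid() {
  case "$1" in
    *" "*) printf '%s\n' "$1" ;;
    *)     printf '%s %s\n' "$1" "$1" ;;
  esac
}

lone=$(pad_egid 33)
full=$(pad_egid "33 4 24")
echo "lone gid -> $lone"
echo "gid list -> $full"
```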

As a quirk, withidenvargs takes the working directory in which it launches the program from the PWD variable in the environment file. This may or may not be what you want.

Unwrapping control scripts part II: restoring the complete environment (Tomcat)

In the previous episode we dealt with restoring only a few variables, though there was the complication of two levels of indirection (service apache2 start and apache2ctl). When placing Tomcat under the control of daemontools, there is a single indirection (service tomcat7 start calls catalina.sh) but the environment has more complex variable values and includes running under a different UNIX uid / gid as well.

The first step is to set up the fake init.d script and replace the simple /usr/bin/env environment dumper with something less easily fooled. The output of env -0 stays unambiguous even when variable values contain newlines or "=". We also record the real and effective user and group id's:
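
The difference is easy to see with a value that contains a newline. In this sketch the variable A is an artificial example, and env -0 is assumed to be available (it is an option of GNU coreutils env).

```shell
#!/bin/sh
# One variable whose value contains a newline followed by something
# that looks like a second assignment.
tricky='A=one
B=two'

# Plain env prints two lines, indistinguishable from two variables.
plain_lines=$(/usr/bin/env -i "$tricky" /usr/bin/env | wc -l)
# env -0 terminates each assignment with NUL, so there is exactly one record.
nul_records=$(/usr/bin/env -i "$tricky" /usr/bin/env -0 | tr -cd '\0' | wc -c)

echo "env:    $plain_lines lines"
echo "env -0: $nul_records record(s)"
```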

NAME=tomcat7
# rewrite /etc/init.d/ script
cp "/etc/init.d/${NAME}" "/tmp/initd_${NAME}_fake"
perl -pi.bak -e 's@(CATALINA_SH=).*@$1"/tmp/catalina_fake"@;s@(CATALINA_PID=")/var/run/@$1/tmp/@' \
  "/tmp/initd_${NAME}_fake"
# create stub
cat <<"EOF" | perl -pe "s@NAME@$NAME@g" >/tmp/catalina_fake
#!/bin/sh
# real / effective uid, real / effective gid + sgid's
perl -le '$, = "\n"; print $<, $>, 0+$(, $)'>"/tmp/NAME_id.txt"
# args, including program name ($0 is passed explicitly: inside perl -e, Perl's own $0 is just "-e")
perl -e '$\ = chr (0); print for @ARGV' "$0" "$@" >"/tmp/NAME_args.txt"
# finally env
/usr/bin/env -0 >"/tmp/NAME_env.txt"
EOF
# execute fake init.d script
chmod a+x /tmp/catalina_fake "/tmp/initd_${NAME}_fake"
"/tmp/initd_${NAME}_fake" start >/dev/null 2>&1

We saved catalina's arguments (including the full path to the real catalina.sh) to /tmp/tomcat7_args.txt for demonstration purposes; we will actually overwrite this file, because unlike the init.d script, we want to invoke catalina.sh run (which runs in the foreground), not catalina.sh start (which daemonizes). The last part of the “unwrapped” script extracts catalina's path from the saved CLI arguments and calls it in the appropriate environment, with the help of a little utility (withidenvargs) that I will describe in my next post:

CATALINA=$(perl -0e '$_ = <>; chomp; print' "/tmp/${NAME}_args.txt")
printf "%s\0%s\0" "$CATALINA" run >"/tmp/${NAME}_args.txt"
exec withidenvargs "/tmp/${NAME}_id.txt" "/tmp/${NAME}_env.txt" "/tmp/${NAME}_args.txt"
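
Before handing the rewritten arguments file to withidenvargs, a quick sanity check can catch a bad capture. This is only a sketch: the scratch file stands in for /tmp/${NAME}_args.txt, the catalina.sh path is a placeholder, and the check assumes no argument contains a newline.

```shell
#!/bin/sh
# Stand-in for the rewritten arguments file (placeholder program path).
dir=$(mktemp -d)
args="$dir/args.txt"
printf '%s\0%s\0' /path/to/catalina.sh run >"$args"

# Expect exactly two NUL-terminated records: the program and "run".
records=$(tr -cd '\0' <"$args" | wc -c)
program=$(tr '\0' '\n' <"$args" | sed -n 1p)
action=$(tr '\0' '\n' <"$args" | sed -n 2p)
if [ "$records" -eq 2 ] && [ "$action" = run ]; then
  status="ok: would exec $program $action"
else
  status="unexpected contents in $args"
fi
echo "$status"
```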

Unwrapping control scripts: Apache under daemontools

In Debian, if you start apache The Right Way, you're actually going through two indirection layers: /etc/init.d/apache2 start sets up some environment variables (e.g. by reading /etc/default/apache2) and eventually runs apache2ctl start — which again sets up some stuff and eventually runs apache2. You can't really safely skip either of them.

This poses some problems in case you want to run apache2 non-daemonized (in the foreground), say under the watchful eye of a process supervisor like daemontools (or runit, or s6, or any of the other clones / enhancements). We all know that apache never crashes and never segfaults, so there's no need to auto-restart it, but still.

We want to run apache2 in the exact environment that /etc/init.d/apache2 start and apache2ctl start create. You could stare at the scripts and extract environment variables by hand, but this is time-consuming and error-prone. The elegant way to replicate the actions of the scripts is to replace the final call to apache2 with a stub that saves the complete environment, and then exec /usr/sbin/apache2 in that environment from the daemontools run script. To achieve this, one can rewrite apache2ctl (call it apache2ctl_fake) to invoke our stub, then rewrite /etc/init.d/apache2 to invoke apache2ctl_fake instead of the real apache2ctl. The stub itself can simply use env to dump the environment into a file. Putting all this together, we get

#!/bin/sh
exec 2>&1
NAME=apache2
# rewrite /etc/init.d/ script
cp "/etc/init.d/${NAME}" "/tmp/initd_${NAME}_fake"
perl -pi.bak -e \
  's@APACHE2CTL( start)@ENV /tmp/apache2ctl_fake$1@' \
  "/tmp/initd_${NAME}_fake"
# rewrite apache2ctl
{ echo '#!/bin/sh'; echo "APACHE_HTTPD=/tmp/${NAME}_fake";
  cat "$(command -v apache2ctl)"; } >"/tmp/${NAME}ctl_fake"
# create stub
cat <<EOF >"/tmp/${NAME}_fake"
#!/bin/sh
/usr/bin/env >"/tmp/${NAME}_env.txt"
EOF
# execute fake init.d script
chmod a+x "/tmp/${NAME}_fake" "/tmp/${NAME}ctl_fake"
chmod a+x "/tmp/initd_${NAME}_fake"
"/tmp/initd_${NAME}_fake" start >/dev/null 2>&1
# prefix all environment assignments with export
perl -ni.bak -e 's/^/export /; print unless /^export PWD=/' \
  "/tmp/${NAME}_env.txt"
# load environment
. "/tmp/${NAME}_env.txt"
# call the real apache2
exec /usr/sbin/apache2 -k start -DNO_DETACH -DNO_DAEMONIZE

Note that the final crude “environment reload” trick only works for environment variables whose values contain no whitespace or shell metacharacters, because env does not quote assignments or escape quotes, i.e. it doesn't output VAR="value with \"nasty\" stuff". For more thorough handling one could generate output in the style of daemontools' envdir and use that tool to exec apache2.
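
Both halves of that remark can be sketched in a few lines. The variable GREETING is an invented example; the envdir call is guarded because daemontools may not be installed, and the directory layout follows envdir's convention of one file per variable, named after the variable and holding its value on the first line.

```shell
#!/bin/sh
dir=$(mktemp -d)

# 1. The crude reload: an unquoted value is split at the space.
printf 'export GREETING=hello world\n' >"$dir/env.sh"
sourced=$( . "$dir/env.sh"; printf '%s' "$GREETING" )
echo "after sourcing: GREETING=$sourced"   # only the first word survives

# 2. envdir style: the file name is the variable, the first line its value.
mkdir "$dir/env.d"
printf '%s\n' 'hello world' >"$dir/env.d/GREETING"
stored=$(cat "$dir/env.d/GREETING")
echo "stored for envdir: GREETING=$stored"

# Run a child under envdir only if the tool is present.
if command -v envdir >/dev/null 2>&1; then
  envdir "$dir/env.d" sh -c 'printf "child sees: %s\n" "$GREETING"'
fi
```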