[pacman-dev,PATCHv2] Add optional 'SandboxUser' option to drop privileges before downloading files

Message ID 20210905124248.5354-1-rgacogne@archlinux.org
State New
Headers show
Series [pacman-dev,PATCHv2] Add optional 'SandboxUser' option to drop privileges before downloading files | expand

Commit Message

Remi Gacogne Sept. 5, 2021, 12:42 p.m. UTC
This second version contains only the portable part of the sandboxing,
switching to a different user when a new option, 'SandboxUser', is set.

To sum up the changes from the last version:
- The non-portable parts (preventing new privileges from being gained, dropping
  capabilities, syscalls filtering and filesystem protection) are gone for now,
  I'll propose them via smaller patches if this change is accepted.
- The value returned by the sub-process is now passed by the exit status only,
  instead of using a pipe.
- The 'UseSandbox' option has been removed.
- The alpm_sandbox.{c,h} files have been renamed to sandbox.{c,h}.
- A short description of the 'SandboxUser' option has been added to
  pacman.conf.5.

Cheers,

Remi

Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
---
 doc/pacman.conf.5.asciidoc |   5 ++
 lib/libalpm/alpm.h         |   5 ++
 lib/libalpm/dload.c        | 102 ++++++++++++++++++++++++++++++++++++-
 lib/libalpm/handle.c       |  13 +++++
 lib/libalpm/handle.h       |   1 +
 lib/libalpm/meson.build    |   1 +
 lib/libalpm/sandbox.c      |  63 +++++++++++++++++++++++
 lib/libalpm/sandbox.h      |  31 +++++++++++
 src/pacman/conf.c          |  19 ++++++-
 src/pacman/conf.h          |   1 +
 src/pacman/pacman-conf.c   |   3 ++
 11 files changed, 241 insertions(+), 3 deletions(-)
 create mode 100644 lib/libalpm/sandbox.c
 create mode 100644 lib/libalpm/sandbox.h

Comments

Allan McRae Sept. 5, 2021, 1:59 p.m. UTC | #1
On 5/9/21 10:42 pm, Remi Gacogne wrote:
> This second version contains only the portable part of the sandboxing,
> switching to a different user when a new option, 'SandboxUser', is set.
> 
> To sum up the changes from the last version:
> - The non-portable parts (preventing new privileges from being gained, dropping
>   capabilities, syscalls filtering and filesystem protection) are gone for now,
>   I'll propose them via smaller patches if this change is accepted.
> - The value returned by the sub-process is now passed by the exit status only,
>   instead of using a pipe.
> - The 'UseSandbox' option has been removed.
> - The alpm_sandbox.{c,h} files have been renamed to sandbox.{c,h}.
> - A short description of the 'SandboxUser' option has been added to
>   pacman.conf.5.
> 
> Cheers,
> 
> Remi
> 
> Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>

I'm going to do a very quick review so I can get a good overview of the
patch given it is still quite large.  This will likely not be the final
comments as I drill down into this in more detail later.

Edit from Allan from 1 hour in the future!
After reading the patch, I am fairly happy with its content. More
time/eyes are needed on curl_download_internal_sandboxed().

> ---
>  doc/pacman.conf.5.asciidoc |   5 ++
>  lib/libalpm/alpm.h         |   5 ++
>  lib/libalpm/dload.c        | 102 ++++++++++++++++++++++++++++++++++++-
>  lib/libalpm/handle.c       |  13 +++++
>  lib/libalpm/handle.h       |   1 +
>  lib/libalpm/meson.build    |   1 +
>  lib/libalpm/sandbox.c      |  63 +++++++++++++++++++++++
>  lib/libalpm/sandbox.h      |  31 +++++++++++
>  src/pacman/conf.c          |  19 ++++++-
>  src/pacman/conf.h          |   1 +
>  src/pacman/pacman-conf.c   |   3 ++
>  11 files changed, 241 insertions(+), 3 deletions(-)
>  create mode 100644 lib/libalpm/sandbox.c
>  create mode 100644 lib/libalpm/sandbox.h
> 
> diff --git doc/pacman.conf.5.asciidoc doc/pacman.conf.5.asciidoc
> index 77a3907f..5b20b177 100644
> --- doc/pacman.conf.5.asciidoc
> +++ doc/pacman.conf.5.asciidoc
> @@ -207,6 +207,11 @@ Options
>  	positive integer. If this config option is not set then only one download
>  	stream is used (i.e. downloads happen sequentially).
>  
> +*SandboxUser =* username::
> +	Specifies the user to switch to for sensitive operations, like downloading
> +	files. That user should exist on the system and have the permissions to
> +	write to the files located in `DBPath`. If this config option is not set
> +	then these operations are done as the user running pacman.

What are other process that you think we can add later?  I think GPG
checking would be a good addition.

>  Repository Sections
>  -------------------
> diff --git lib/libalpm/alpm.h lib/libalpm/alpm.h
> index a5f4a6ae..5473558a 100644
> --- lib/libalpm/alpm.h
> +++ lib/libalpm/alpm.h
> @@ -1837,6 +1837,11 @@ int alpm_option_set_gpgdir(alpm_handle_t *handle, const char *gpgdir);
>  /* End of gpdir accessors */
>  /** @} */
>  
> +/** Sets the user to switch to for sensitive operations like downloading a file.
> + * @param handle the context handle
> + * @param sandboxuser the user to set
> + */
> +int alpm_option_set_sandboxuser(alpm_handle_t *handle, const char *sandboxuser);

I'd like a matching "get" option too.  That is likely useful for
configuration being done in graphical alpm frontends.

>  
>  /** @name Accessors for use syslog
>   *
> diff --git lib/libalpm/dload.c lib/libalpm/dload.c
> index 4322318b..03e74ae2 100644
> --- lib/libalpm/dload.c
> +++ lib/libalpm/dload.c
> @@ -28,6 +28,7 @@
>  #include <sys/time.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> +#include <sys/wait.h>
>  #include <signal.h>
>  
>  #ifdef HAVE_NETINET_IN_H
> @@ -46,6 +47,7 @@
>  #include "alpm_list.h"
>  #include "alpm.h"
>  #include "log.h"
> +#include "sandbox.h"
>  #include "util.h"
>  #include "handle.h"
>  

OK

> @@ -937,6 +939,100 @@ static int curl_download_internal(alpm_handle_t *handle,
>  
>  #endif
>  
> +/* Download the requested files by launching a process inside a sandbox.
> + * Returns -1 if an error happened for a required file
> + * Returns 0 if a payload was actually downloaded
> + * Returns 1 if no files were downloaded and all errors were non-fatal
> + */
> +static int curl_download_internal_sandboxed(alpm_handle_t *handle,
> +		alpm_list_t *payloads /* struct dload_payload */,
> +		const char *localpath)
> +{
> +	int pid, err = 0, ret = -1;
> +	sigset_t oldblock;
> +	struct sigaction sa_ign = { .sa_handler = SIG_IGN }, oldint, oldquit;
> +
> +	sigaction(SIGINT, &sa_ign, &oldint);
> +	sigaction(SIGQUIT, &sa_ign, &oldquit);
> +	sigaddset(&sa_ign.sa_mask, SIGCHLD);
> +	sigprocmask(SIG_BLOCK, &sa_ign.sa_mask, &oldblock);
> +
> +	pid = fork();
> +
> +	/* child */
> +	if(pid == 0) {
> +		/* restore signal handling for the child to inherit */
> +		sigaction(SIGINT, &oldint, NULL);
> +		sigaction(SIGQUIT, &oldquit, NULL);
> +		sigprocmask(SIG_SETMASK, &oldblock, NULL);
> +
> +		/* cwd to the download directory */
> +		ret = chdir(localpath);
> +		if(ret != 0) {
> +			_alpm_log(handle, ALPM_LOG_WARNING, _("could not chdir to download directory %s\n"), localpath);
> +			ret = -1;

Seem error is more appropriate than warning here.

> +		} else {
> +			ret = alpm_sandbox_child(handle->sandboxuser);
> +			if (ret != 0) {
> +				_alpm_log(handle, ALPM_LOG_WARNING, _("sandboxing failed!\n"));
> +			}
> +
> +			ret = curl_download_internal(handle, payloads, localpath);

If we fail to sandbox, we should definitely not continue.  Error!

> +		}
> +
> +		/* pass the result back to the parent */
> +		if(ret == 0) {
> +			/* a payload was actually downloaded */
> +			_Exit(0);

Hrm... exit() or _Exit()?

> +		}
> +		else if(ret == 1) {
> +			/* no files were downloaded and all errors were non-fatal */
> +			_Exit(1);
> +		}
> +		else {
> +			/* an error happened for a required file */
> +			_Exit(2);
> +		}
> +	}
> +
> +	/* parent */
> +
> +	if(pid != -1)  {
> +		int wret;
> +		while((wret = waitpid(pid, &ret, 0)) == -1 && errno == EINTR);
> +		if(wret > 0) {
> +			if(!WIFEXITED(ret)) {
> +				/* the child did not terminate normally */
> +				ret = -1;
> +			}
> +			else {
> +				ret = WIFEXITED(ret);
> +				if(ret != 0 && ret != 1) {
> +					/* an error happened for a required file, or unexpected exit status */
> +					ret = -1;
> +				}
> +			}
> +		}
> +		else {
> +			/* waitpid failed */
> +			err = errno;
> +		}
> +	} else {
> +		/* fork failed, make sure errno is preserved after cleanup */
> +		err = errno;
> +	}
> +
> +	sigaction(SIGINT, &oldint, NULL);
> +	sigaction(SIGQUIT, &oldquit, NULL);
> +	sigprocmask(SIG_SETMASK, &oldblock, NULL);
> +
> +	if(err) {
> +		errno = err;
> +		ret = -1;
> +	}
> +	return ret;
> +}
> +

That looks fine on first glance.

>  /* Returns -1 if an error happened for a required file
>   * Returns 0 if a payload was actually downloaded
>   * Returns 1 if no files were downloaded and all errors were non-fatal
> @@ -947,7 +1043,11 @@ int _alpm_download(alpm_handle_t *handle,
>  {
>  	if(handle->fetchcb == NULL) {
>  #ifdef HAVE_LIBCURL
> -		return curl_download_internal(handle, payloads, localpath);
> +		if(handle->sandboxuser) {
> +			return curl_download_internal_sandboxed(handle, payloads, localpath);
> +		} else {
> +			return curl_download_internal(handle, payloads, localpath);
> +		}
>  #else
>  		RET_ERR(handle, ALPM_ERR_EXTERNAL_DOWNLOAD, -1);
>  #endif

OK

> diff --git lib/libalpm/handle.c lib/libalpm/handle.c
> index e6b683cb..5c914faa 100644
> --- lib/libalpm/handle.c
> +++ lib/libalpm/handle.c
> @@ -573,6 +573,19 @@ int SYMEXPORT alpm_option_set_gpgdir(alpm_handle_t *handle, const char *gpgdir)
>  	return 0;
>  }
>  
> +int SYMEXPORT alpm_option_set_sandboxuser(alpm_handle_t *handle, const char *sandboxuser)
> +{
> +	CHECK_HANDLE(handle, return -1);
> +	if(handle->sandboxuser) {
> +		FREE(handle->sandboxuser);
> +	}
> +
> +	STRDUP(handle->sandboxuser, sandboxuser, RET_ERR(handle, ALPM_ERR_MEMORY, -1));
> +
> +	_alpm_log(handle, ALPM_LOG_DEBUG, "option 'sandboxuser' = %s\n", handle->sandboxuser);
> +	return 0;
> +}
> +

OK

>  int SYMEXPORT alpm_option_set_usesyslog(alpm_handle_t *handle, int usesyslog)
>  {
>  	CHECK_HANDLE(handle, return -1);
> diff --git lib/libalpm/handle.h lib/libalpm/handle.h
> index b1526c67..e0587fc7 100644
> --- lib/libalpm/handle.h
> +++ lib/libalpm/handle.h
> @@ -90,6 +90,7 @@ struct __alpm_handle_t {
>  	char *logfile;           /* Name of the log file */
>  	char *lockfile;          /* Name of the lock file */
>  	char *gpgdir;            /* Directory where GnuPG files are stored */
> +	char *sandboxuser;       /* User to switch to for sensitive operations like downloading files */
>  	alpm_list_t *cachedirs;  /* Paths to pacman cache directories */
>  	alpm_list_t *hookdirs;   /* Paths to hook directories */
>  	alpm_list_t *overwrite_files; /* Paths that may be overwritten */

OK

> diff --git lib/libalpm/meson.build lib/libalpm/meson.build
> index 607e91a3..224fbf5f 100644
> --- lib/libalpm/meson.build
> +++ lib/libalpm/meson.build
> @@ -24,6 +24,7 @@ libalpm_sources = files('''
>    pkghash.h pkghash.c
>    rawstr.c
>    remove.h remove.c
> +  sandbox.h sandbox.c
>    signing.c signing.h
>    sync.h sync.c
>    trans.h trans.c

OK

> diff --git lib/libalpm/sandbox.c lib/libalpm/sandbox.c
> new file mode 100644
> index 00000000..3c491176
> --- /dev/null
> +++ lib/libalpm/sandbox.c
> @@ -0,0 +1,63 @@
> +/*
> + *  sandbox.c
> + *
> + *  Copyright (c) 2021 Pacman Development Team <pacman-dev@archlinux.org>
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <grp.h>
> +#include <pwd.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#include "sandbox.h"
> +
> +static int switch_to_user(const char *user)
> +{
> +	struct passwd const *pw = NULL;
> +	if(getuid() != 0) {
> +		return 1;
> +	}
> +	pw = getpwnam(user);

Does getpwnam handle a NULL user?

> +	if(pw == NULL) {
> +		return errno;
> +	}
> +	if(setgid(pw->pw_gid) != 0) {
> +		return errno;
> +	}
> +	if(setgroups(0, NULL)) {
> +		return errno;
> +	}
> +	if(setuid(pw->pw_uid) != 0) {
> +		return errno;
> +	}

this could make use of our ASSERT define

ASSERT(getuid() != 0, return 1);
ASSERT((pw = getpwnam(user)), return 1);
...

> +	return 0;
> +}
> +
> +/* check exported library symbols with: nm -C -D <lib> */
> +#define SYMEXPORT __attribute__((visibility("default")))
> +
> +int SYMEXPORT alpm_sandbox_child(const char* sandboxuser)
> +{
> +	int result = 0;
> +
> +	if(sandboxuser != NULL) {
> +		result = switch_to_user(sandboxuser);
> +	}
> +
> +	return result;
> +}

OK

> diff --git lib/libalpm/sandbox.h lib/libalpm/sandbox.h
> new file mode 100644
> index 00000000..47611575
> --- /dev/null
> +++ lib/libalpm/sandbox.h
> @@ -0,0 +1,31 @@
> +/*
> + *  sandbox.h
> + *
> + *  Copyright (c) 2021 Pacman Development Team <pacman-dev@archlinux.org>
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef ALPM_SANDBOX_H
> +#define ALPM_SANDBOX_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +

No need for that - this header is not installed

> +int alpm_sandbox_child(const char *sandboxuser);
> +

If this is needed externally to libalpm, it should be defined in alpm.h

> +#ifdef __cplusplus
> +}
> +#endif
> +#endif /* ALPM_SANDBOX_H */
> diff --git src/pacman/conf.c src/pacman/conf.c
> index 12fee64c..15d05e8c 100644
> --- src/pacman/conf.c
> +++ src/pacman/conf.c
> @@ -33,6 +33,8 @@
>  #include <unistd.h>
>  #include <signal.h>
>  
> +#include <sandbox.h>
> +

All public functions are available via alpm.h

>  /* pacman */
>  #include "conf.h"
>  #include "ini.h"
> @@ -215,7 +217,7 @@ static char *get_tempfile(const char *path, const char *filename)
>   * - not thread-safe
>   * - errno may be set by fork(), pipe(), or execvp()
>   */
> -static int systemvp(const char *file, char *const argv[])
> +static int systemvp(const char *file, char *const argv[], const char *sandboxuser)
>  {
>  	int pid, err = 0, ret = -1, err_fd[2];
>  	sigset_t oldblock;
> @@ -242,6 +244,13 @@ static int systemvp(const char *file, char *const argv[])
>  		sigaction(SIGQUIT, &oldquit, NULL);
>  		sigprocmask(SIG_SETMASK, &oldblock, NULL);
>  
> +		if (sandboxuser) {
> +			ret = alpm_sandbox_child(sandboxuser);
> +			if (ret != 0) {
> +				pm_printf(ALPM_LOG_WARNING, _("Switching to sandbox user %s failed!\n"), sandboxuser);

Error.

> +			}
> +		}
> +
>  		execvp(file, argv);
>  
>  		/* execvp failed, pass the error back to the parent */
> @@ -352,7 +361,7 @@ static int download_with_xfercommand(void *ctx, const char *url,
>  			free(cmd);
>  		}
>  	}
> -	retval = systemvp(argv[0], (char**)argv);
> +	retval = systemvp(argv[0], (char**)argv, config->sandboxuser);

OK

Which reminds me to split XferCommand support out of pacman/conf.h...
Why is it there!

>  
>  	if(retval == -1) {
>  		pm_printf(ALPM_LOG_WARNING, _("running XferCommand: fork failed!\n"));
> @@ -668,6 +677,11 @@ static int _parse_options(const char *key, char *value,
>  				config->logfile = strdup(value);
>  				pm_printf(ALPM_LOG_DEBUG, "config: logfile: %s\n", value);
>  			}
> +		} else if(strcmp(key, "SandboxUser") == 0) {
> +			if(!config->sandboxuser) {
> +				config->sandboxuser = strdup(value);
> +				pm_printf(ALPM_LOG_DEBUG, "config: sandboxuser: %s\n", value);
> +			}
>  		} else if(strcmp(key, "XferCommand") == 0) {
>  			char **c;
>  			if((config->xfercommand_argv = wordsplit(value)) == NULL) {

OK

> @@ -904,6 +918,7 @@ static int setup_libalpm(void)
>  	alpm_option_set_architectures(handle, config->architectures);
>  	alpm_option_set_checkspace(handle, config->checkspace);
>  	alpm_option_set_usesyslog(handle, config->usesyslog);
> +	alpm_option_set_sandboxuser(handle, config->sandboxuser);
>  
>  	alpm_option_set_ignorepkgs(handle, config->ignorepkg);
>  	alpm_option_set_ignoregroups(handle, config->ignoregrp);
> diff --git src/pacman/conf.h src/pacman/conf.h
> index 04350d39..675e06fb 100644
> --- src/pacman/conf.h
> +++ src/pacman/conf.h
> @@ -67,6 +67,7 @@ typedef struct __config_t {
>  	char *logfile;
>  	char *gpgdir;
>  	char *sysroot;
> +	char *sandboxuser;
>  	alpm_list_t *hookdirs;
>  	alpm_list_t *cachedirs;
>  	alpm_list_t *architectures;
> diff --git src/pacman/pacman-conf.c src/pacman/pacman-conf.c
> index 600f1622..05a6bcd8 100644
> --- src/pacman/pacman-conf.c
> +++ src/pacman/pacman-conf.c
> @@ -251,6 +251,7 @@ static void dump_config(void)
>  	show_list_str("HookDir", config->hookdirs);
>  	show_str("GPGDir", config->gpgdir);
>  	show_str("LogFile", config->logfile);
> +	show_str("SandboxUser", config->sandboxuser);
>  
>  	show_list_str("HoldPkg", config->holdpkg);
>  	show_list_str("IgnorePkg", config->ignorepkg);
> @@ -349,6 +350,8 @@ static int list_directives(void)
>  			show_str("GPGDir", config->gpgdir);
>  		} else if(strcasecmp(i->data, "LogFile") == 0) {
>  			show_str("LogFile", config->logfile);
> +		} else if(strcasecmp(i->data, "SandboxUser") == 0) {
> +			show_str("SandboxUser", config->sandboxuser);
>  
>  		} else if(strcasecmp(i->data, "HoldPkg") == 0) {
>  			show_list_str("HoldPkg", config->holdpkg);
> 

OK
Andrew Gregory Sept. 6, 2021, 12:42 a.m. UTC | #2
On 09/05/21 at 02:42pm, Remi Gacogne wrote:
> This second version contains only the portable part of the sandboxing,
> switching to a different user when a new option, 'SandboxUser', is set.
> 
> To sum up the changes from the last version:
> - The non-portable parts (preventing new privileges from being gained, dropping
>   capabilities, syscalls filtering and filesystem protection) are gone for now,
>   I'll propose them via smaller patches if this change is accepted.
> - The value returned by the sub-process is now passed by the exit status only,
>   instead of using a pipe.
> - The 'UseSandbox' option has been removed.
> - The alpm_sandbox.{c,h} files have been renamed to sandbox.{c,h}.
> - A short description of the 'SandboxUser' option has been added to
>   pacman.conf.5.
> 
> Cheers,
> 
> Remi
> 
> Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
> ---

Put notes here to avoid including them in the commit message.

>  doc/pacman.conf.5.asciidoc |   5 ++
>  lib/libalpm/alpm.h         |   5 ++
>  lib/libalpm/dload.c        | 102 ++++++++++++++++++++++++++++++++++++-
>  lib/libalpm/handle.c       |  13 +++++
>  lib/libalpm/handle.h       |   1 +
>  lib/libalpm/meson.build    |   1 +
>  lib/libalpm/sandbox.c      |  63 +++++++++++++++++++++++
>  lib/libalpm/sandbox.h      |  31 +++++++++++
>  src/pacman/conf.c          |  19 ++++++-
>  src/pacman/conf.h          |   1 +
>  src/pacman/pacman-conf.c   |   3 ++
>  11 files changed, 241 insertions(+), 3 deletions(-)
>  create mode 100644 lib/libalpm/sandbox.c
>  create mode 100644 lib/libalpm/sandbox.h
 
...

> +/* Download the requested files by launching a process inside a sandbox.
> + * Returns -1 if an error happened for a required file
> + * Returns 0 if a payload was actually downloaded
> + * Returns 1 if no files were downloaded and all errors were non-fatal
> + */
> +static int curl_download_internal_sandboxed(alpm_handle_t *handle,
> +		alpm_list_t *payloads /* struct dload_payload */,
> +		const char *localpath)
> +{
> +	int pid, err = 0, ret = -1;
> +	sigset_t oldblock;
> +	struct sigaction sa_ign = { .sa_handler = SIG_IGN }, oldint, oldquit;
> +
> +	sigaction(SIGINT, &sa_ign, &oldint);
> +	sigaction(SIGQUIT, &sa_ign, &oldquit);
> +	sigaddset(&sa_ign.sa_mask, SIGCHLD);
> +	sigprocmask(SIG_BLOCK, &sa_ign.sa_mask, &oldblock);
> +
> +	pid = fork();
> +
> +	/* child */
> +	if(pid == 0) {
> +		/* restore signal handling for the child to inherit */
> +		sigaction(SIGINT, &oldint, NULL);
> +		sigaction(SIGQUIT, &oldquit, NULL);
> +		sigprocmask(SIG_SETMASK, &oldblock, NULL);
> +
> +		/* cwd to the download directory */
> +		ret = chdir(localpath);
> +		if(ret != 0) {
> +			_alpm_log(handle, ALPM_LOG_WARNING, _("could not chdir to download directory %s\n"), localpath);
> +			ret = -1;
> +		} else {
> +			ret = alpm_sandbox_child(handle->sandboxuser);
> +			if (ret != 0) {
> +				_alpm_log(handle, ALPM_LOG_WARNING, _("sandboxing failed!\n"));
> +			}
> +
> +			ret = curl_download_internal(handle, payloads, localpath);
> +		}
> +
> +		/* pass the result back to the parent */
> +		if(ret == 0) {
> +			/* a payload was actually downloaded */
> +			_Exit(0);
> +		}
> +		else if(ret == 1) {
> +			/* no files were downloaded and all errors were non-fatal */
> +			_Exit(1);
> +		}
> +		else {
> +			/* an error happened for a required file */
> +			_Exit(2);
> +		}
> +	}
> +
> +	/* parent */
> +
> +	if(pid != -1)  {
> +		int wret;
> +		while((wret = waitpid(pid, &ret, 0)) == -1 && errno == EINTR);
> +		if(wret > 0) {
> +			if(!WIFEXITED(ret)) {
> +				/* the child did not terminate normally */
> +				ret = -1;
> +			}
> +			else {
> +				ret = WIFEXITED(ret);
> +				if(ret != 0 && ret != 1) {
> +					/* an error happened for a required file, or unexpected exit status */
> +					ret = -1;
> +				}
> +			}
> +		}
> +		else {
> +			/* waitpid failed */
> +			err = errno;
> +		}
> +	} else {
> +		/* fork failed, make sure errno is preserved after cleanup */
> +		err = errno;
> +	}
> +
> +	sigaction(SIGINT, &oldint, NULL);
> +	sigaction(SIGQUIT, &oldquit, NULL);
> +	sigprocmask(SIG_SETMASK, &oldblock, NULL);
> +
> +	if(err) {
> +		errno = err;
> +		ret = -1;
> +	}
> +	return ret;
> +}
> +

After thinking about this some more, I think this is far too simple.  Just
running download_internal in an unprivileged fork will break anything that
relies on side effects.  download_internal sets pm_errno, tracks server errors,
and calls a number of front-end callbacks.  Losing server error tracking across
multiple downloads isn't a big deal, but losing pm_errno is significant and we
have no way of knowing what kind of state changes the front-end callbacks might
be making.  I suspect this would massively break GUI front-ends.

apg
Remi Gacogne Sept. 6, 2021, 7:34 a.m. UTC | #3
On 9/6/21 2:42 AM, Andrew Gregory wrote:
> Put notes here to avoid including them in the commit message.

Understood, thanks!

> After thinking about this some more, I think this is far too simple.  Just
> running download_internal in an unprivileged fork will break anything that
> relies on side effects.  download_internal sets pm_errno, tracks server errors,
> and calls a number of front-end callbacks.  Losing server error tracking across
> multiple downloads isn't a big deal, but losing pm_errno is significant and we
> have no way of knowing what kind of state changes the front-end callbacks might
> be making.  I suspect this would massively break GUI front-ends.

Right, that was my main worry when I started working on this, but since 
the 'XferCommand' option existed I was hopeful the download code was not 
too tightly coupled with the rest of the code.
Do you think it would still make sense to keep working on this? It looks 
like we could pass the value of pm_errno back to the main process, using 
a pipe if needed. As for the front-end callbacks I guess we could detect 
that these are set and disable the sandboxing in that case, while 
documenting that this option does not play well with GUI front-ends. I'm 
guessing that's already the case with 'XferCommand?', since the same 
issues likely apply?

Remi
Remi Gacogne Sept. 18, 2021, 3:07 p.m. UTC | #4
On 9/6/21 09:34, Remi Gacogne wrote:
> Right, that was my main worry when I started working on this, but since 
> the 'XferCommand' option existed I was hopeful the download code was not 
> too tightly coupled with the rest of the code.
> Do you think it would still make sense to keep working on this? It looks 
> like we could pass the value of pm_errno back to the main process, using 
> a pipe if needed. As for the front-end callbacks I guess we could detect 
> that these are set and disable the sandboxing in that case, while 
> documenting that this option does not play well with GUI front-ends. I'm 
> guessing that's already the case with 'XferCommand?', since the same 
> issues likely apply?

I looked a bit more into this. Passing pm_errno back is easy, and can 
even be done using the exit status without setting up a pipe.
Unfortunately I realize now that I under-estimated the criticality of 
the front-ends callbacks. Reading that code more in depth quickly 
uncovered issues I did not notice earlier, like 'on_progress' being set 
during the downloads and leading to cb_log adding messages to a list 
that is never flushed when we fork.
I'm not sure how to deal with that. Serializing the content of the dlcb 
events to pass them over a pipe could be done, but that would be a 
rather big patch, and would be cumbersome to maintain. I'll think about 
it a bit more, and in the meantime any ideas are welcome :)

Remi

Patch

diff --git doc/pacman.conf.5.asciidoc doc/pacman.conf.5.asciidoc
index 77a3907f..5b20b177 100644
--- doc/pacman.conf.5.asciidoc
+++ doc/pacman.conf.5.asciidoc
@@ -207,6 +207,11 @@  Options
 	positive integer. If this config option is not set then only one download
 	stream is used (i.e. downloads happen sequentially).
 
+*SandboxUser =* username::
+	Specifies the user to switch to for sensitive operations, like downloading
+	files. That user should exist on the system and have the permissions to
+	write to the files located in `DBPath`. If this config option is not set
+	then these operations are done as the user running pacman.
 
 Repository Sections
 -------------------
diff --git lib/libalpm/alpm.h lib/libalpm/alpm.h
index a5f4a6ae..5473558a 100644
--- lib/libalpm/alpm.h
+++ lib/libalpm/alpm.h
@@ -1837,6 +1837,11 @@  int alpm_option_set_gpgdir(alpm_handle_t *handle, const char *gpgdir);
 /* End of gpdir accessors */
 /** @} */
 
+/** Sets the user to switch to for sensitive operations like downloading a file.
+ * @param handle the context handle
+ * @param sandboxuser the user to set
+ */
+int alpm_option_set_sandboxuser(alpm_handle_t *handle, const char *sandboxuser);
 
 /** @name Accessors for use syslog
  *
diff --git lib/libalpm/dload.c lib/libalpm/dload.c
index 4322318b..03e74ae2 100644
--- lib/libalpm/dload.c
+++ lib/libalpm/dload.c
@@ -28,6 +28,7 @@ 
 #include <sys/time.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/wait.h>
 #include <signal.h>
 
 #ifdef HAVE_NETINET_IN_H
@@ -46,6 +47,7 @@ 
 #include "alpm_list.h"
 #include "alpm.h"
 #include "log.h"
+#include "sandbox.h"
 #include "util.h"
 #include "handle.h"
 
@@ -937,6 +939,100 @@  static int curl_download_internal(alpm_handle_t *handle,
 
 #endif
 
+/* Download the requested files by launching a process inside a sandbox.
+ * Returns -1 if an error happened for a required file
+ * Returns 0 if a payload was actually downloaded
+ * Returns 1 if no files were downloaded and all errors were non-fatal
+ */
+static int curl_download_internal_sandboxed(alpm_handle_t *handle,
+		alpm_list_t *payloads /* struct dload_payload */,
+		const char *localpath)
+{
+	int pid, err = 0, ret = -1;
+	sigset_t oldblock;
+	struct sigaction sa_ign = { .sa_handler = SIG_IGN }, oldint, oldquit;
+
+	sigaction(SIGINT, &sa_ign, &oldint);
+	sigaction(SIGQUIT, &sa_ign, &oldquit);
+	sigaddset(&sa_ign.sa_mask, SIGCHLD);
+	sigprocmask(SIG_BLOCK, &sa_ign.sa_mask, &oldblock);
+
+	pid = fork();
+
+	/* child */
+	if(pid == 0) {
+		/* restore signal handling for the child to inherit */
+		sigaction(SIGINT, &oldint, NULL);
+		sigaction(SIGQUIT, &oldquit, NULL);
+		sigprocmask(SIG_SETMASK, &oldblock, NULL);
+
+		/* cwd to the download directory */
+		ret = chdir(localpath);
+		if(ret != 0) {
+			_alpm_log(handle, ALPM_LOG_WARNING, _("could not chdir to download directory %s\n"), localpath);
+			ret = -1;
+		} else {
+			ret = alpm_sandbox_child(handle->sandboxuser);
+			if (ret != 0) {
+				_alpm_log(handle, ALPM_LOG_WARNING, _("sandboxing failed!\n"));
+			}
+
+			ret = curl_download_internal(handle, payloads, localpath);
+		}
+
+		/* pass the result back to the parent */
+		if(ret == 0) {
+			/* a payload was actually downloaded */
+			_Exit(0);
+		}
+		else if(ret == 1) {
+			/* no files were downloaded and all errors were non-fatal */
+			_Exit(1);
+		}
+		else {
+			/* an error happened for a required file */
+			_Exit(2);
+		}
+	}
+
+	/* parent */
+
+	if(pid != -1)  {
+		int wret;
+		while((wret = waitpid(pid, &ret, 0)) == -1 && errno == EINTR);
+		if(wret > 0) {
+			if(!WIFEXITED(ret)) {
+				/* the child did not terminate normally */
+				ret = -1;
+			}
+			else {
+				ret = WIFEXITED(ret);
+				if(ret != 0 && ret != 1) {
+					/* an error happened for a required file, or unexpected exit status */
+					ret = -1;
+				}
+			}
+		}
+		else {
+			/* waitpid failed */
+			err = errno;
+		}
+	} else {
+		/* fork failed, make sure errno is preserved after cleanup */
+		err = errno;
+	}
+
+	sigaction(SIGINT, &oldint, NULL);
+	sigaction(SIGQUIT, &oldquit, NULL);
+	sigprocmask(SIG_SETMASK, &oldblock, NULL);
+
+	if(err) {
+		errno = err;
+		ret = -1;
+	}
+	return ret;
+}
+
 /* Returns -1 if an error happened for a required file
  * Returns 0 if a payload was actually downloaded
  * Returns 1 if no files were downloaded and all errors were non-fatal
@@ -947,7 +1043,11 @@  int _alpm_download(alpm_handle_t *handle,
 {
 	if(handle->fetchcb == NULL) {
 #ifdef HAVE_LIBCURL
-		return curl_download_internal(handle, payloads, localpath);
+		if(handle->sandboxuser) {
+			return curl_download_internal_sandboxed(handle, payloads, localpath);
+		} else {
+			return curl_download_internal(handle, payloads, localpath);
+		}
 #else
 		RET_ERR(handle, ALPM_ERR_EXTERNAL_DOWNLOAD, -1);
 #endif
diff --git lib/libalpm/handle.c lib/libalpm/handle.c
index e6b683cb..5c914faa 100644
--- lib/libalpm/handle.c
+++ lib/libalpm/handle.c
@@ -573,6 +573,19 @@  int SYMEXPORT alpm_option_set_gpgdir(alpm_handle_t *handle, const char *gpgdir)
 	return 0;
 }
 
+int SYMEXPORT alpm_option_set_sandboxuser(alpm_handle_t *handle, const char *sandboxuser)
+{
+	CHECK_HANDLE(handle, return -1);
+	if(handle->sandboxuser) {
+		FREE(handle->sandboxuser);
+	}
+
+	STRDUP(handle->sandboxuser, sandboxuser, RET_ERR(handle, ALPM_ERR_MEMORY, -1));
+
+	_alpm_log(handle, ALPM_LOG_DEBUG, "option 'sandboxuser' = %s\n", handle->sandboxuser);
+	return 0;
+}
+
 int SYMEXPORT alpm_option_set_usesyslog(alpm_handle_t *handle, int usesyslog)
 {
 	CHECK_HANDLE(handle, return -1);
diff --git lib/libalpm/handle.h lib/libalpm/handle.h
index b1526c67..e0587fc7 100644
--- lib/libalpm/handle.h
+++ lib/libalpm/handle.h
@@ -90,6 +90,7 @@  struct __alpm_handle_t {
 	char *logfile;           /* Name of the log file */
 	char *lockfile;          /* Name of the lock file */
 	char *gpgdir;            /* Directory where GnuPG files are stored */
+	char *sandboxuser;       /* User to switch to for sensitive operations like downloading files */
 	alpm_list_t *cachedirs;  /* Paths to pacman cache directories */
 	alpm_list_t *hookdirs;   /* Paths to hook directories */
 	alpm_list_t *overwrite_files; /* Paths that may be overwritten */
diff --git lib/libalpm/meson.build lib/libalpm/meson.build
index 607e91a3..224fbf5f 100644
--- lib/libalpm/meson.build
+++ lib/libalpm/meson.build
@@ -24,6 +24,7 @@  libalpm_sources = files('''
   pkghash.h pkghash.c
   rawstr.c
   remove.h remove.c
+  sandbox.h sandbox.c
   signing.c signing.h
   sync.h sync.c
   trans.h trans.c
diff --git lib/libalpm/sandbox.c lib/libalpm/sandbox.c
new file mode 100644
index 00000000..3c491176
--- /dev/null
+++ lib/libalpm/sandbox.c
@@ -0,0 +1,63 @@ 
+/*
+ *  sandbox.c
+ *
+ *  Copyright (c) 2021 Pacman Development Team <pacman-dev@archlinux.org>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <grp.h>
+#include <pwd.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "sandbox.h"
+
+static int switch_to_user(const char *user)
+{
+	struct passwd const *pw = NULL;
+	if(getuid() != 0) {
+		return 1;
+	}
+	pw = getpwnam(user);
+	if(pw == NULL) {
+		return errno;
+	}
+	if(setgid(pw->pw_gid) != 0) {
+		return errno;
+	}
+	if(setgroups(0, NULL)) {
+		return errno;
+	}
+	if(setuid(pw->pw_uid) != 0) {
+		return errno;
+	}
+	return 0;
+}
+
+/* check exported library symbols with: nm -C -D <lib> */
+#define SYMEXPORT __attribute__((visibility("default")))
+
+int SYMEXPORT alpm_sandbox_child(const char* sandboxuser)
+{
+	int result = 0;
+
+	if(sandboxuser != NULL) {
+		result = switch_to_user(sandboxuser);
+	}
+
+	return result;
+}
diff --git lib/libalpm/sandbox.h lib/libalpm/sandbox.h
new file mode 100644
index 00000000..47611575
--- /dev/null
+++ lib/libalpm/sandbox.h
@@ -0,0 +1,31 @@ 
+/*
+ *  sandbox.h
+ *
+ *  Copyright (c) 2021 Pacman Development Team <pacman-dev@archlinux.org>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef ALPM_SANDBOX_H
+#define ALPM_SANDBOX_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int alpm_sandbox_child(const char *sandboxuser);
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* ALPM_SANDBOX_H */
diff --git src/pacman/conf.c src/pacman/conf.c
index 12fee64c..15d05e8c 100644
--- src/pacman/conf.c
+++ src/pacman/conf.c
@@ -33,6 +33,8 @@ 
 #include <unistd.h>
 #include <signal.h>
 
+#include <sandbox.h>
+
 /* pacman */
 #include "conf.h"
 #include "ini.h"
@@ -215,7 +217,7 @@  static char *get_tempfile(const char *path, const char *filename)
  * - not thread-safe
  * - errno may be set by fork(), pipe(), or execvp()
  */
-static int systemvp(const char *file, char *const argv[])
+static int systemvp(const char *file, char *const argv[], const char *sandboxuser)
 {
 	int pid, err = 0, ret = -1, err_fd[2];
 	sigset_t oldblock;
@@ -242,6 +244,13 @@  static int systemvp(const char *file, char *const argv[])
 		sigaction(SIGQUIT, &oldquit, NULL);
 		sigprocmask(SIG_SETMASK, &oldblock, NULL);
 
+		if (sandboxuser) {
+			ret = alpm_sandbox_child(sandboxuser);
+			if (ret != 0) {
+				pm_printf(ALPM_LOG_WARNING, _("Switching to sandbox user %s failed!\n"), sandboxuser);
+			}
+		}
+
 		execvp(file, argv);
 
 		/* execvp failed, pass the error back to the parent */
@@ -352,7 +361,7 @@  static int download_with_xfercommand(void *ctx, const char *url,
 			free(cmd);
 		}
 	}
-	retval = systemvp(argv[0], (char**)argv);
+	retval = systemvp(argv[0], (char**)argv, config->sandboxuser);
 
 	if(retval == -1) {
 		pm_printf(ALPM_LOG_WARNING, _("running XferCommand: fork failed!\n"));
@@ -668,6 +677,11 @@  static int _parse_options(const char *key, char *value,
 				config->logfile = strdup(value);
 				pm_printf(ALPM_LOG_DEBUG, "config: logfile: %s\n", value);
 			}
+		} else if(strcmp(key, "SandboxUser") == 0) {
+			if(!config->sandboxuser) {
+				config->sandboxuser = strdup(value);
+				pm_printf(ALPM_LOG_DEBUG, "config: sandboxuser: %s\n", value);
+			}
 		} else if(strcmp(key, "XferCommand") == 0) {
 			char **c;
 			if((config->xfercommand_argv = wordsplit(value)) == NULL) {
@@ -904,6 +918,7 @@  static int setup_libalpm(void)
 	alpm_option_set_architectures(handle, config->architectures);
 	alpm_option_set_checkspace(handle, config->checkspace);
 	alpm_option_set_usesyslog(handle, config->usesyslog);
+	alpm_option_set_sandboxuser(handle, config->sandboxuser);
 
 	alpm_option_set_ignorepkgs(handle, config->ignorepkg);
 	alpm_option_set_ignoregroups(handle, config->ignoregrp);
diff --git src/pacman/conf.h src/pacman/conf.h
index 04350d39..675e06fb 100644
--- src/pacman/conf.h
+++ src/pacman/conf.h
@@ -67,6 +67,7 @@  typedef struct __config_t {
 	char *logfile;
 	char *gpgdir;
 	char *sysroot;
+	char *sandboxuser;
 	alpm_list_t *hookdirs;
 	alpm_list_t *cachedirs;
 	alpm_list_t *architectures;
diff --git src/pacman/pacman-conf.c src/pacman/pacman-conf.c
index 600f1622..05a6bcd8 100644
--- src/pacman/pacman-conf.c
+++ src/pacman/pacman-conf.c
@@ -251,6 +251,7 @@  static void dump_config(void)
 	show_list_str("HookDir", config->hookdirs);
 	show_str("GPGDir", config->gpgdir);
 	show_str("LogFile", config->logfile);
+	show_str("SandboxUser", config->sandboxuser);
 
 	show_list_str("HoldPkg", config->holdpkg);
 	show_list_str("IgnorePkg", config->ignorepkg);
@@ -349,6 +350,8 @@  static int list_directives(void)
 			show_str("GPGDir", config->gpgdir);
 		} else if(strcasecmp(i->data, "LogFile") == 0) {
 			show_str("LogFile", config->logfile);
+		} else if(strcasecmp(i->data, "SandboxUser") == 0) {
+			show_str("SandboxUser", config->sandboxuser);
 
 		} else if(strcasecmp(i->data, "HoldPkg") == 0) {
 			show_list_str("HoldPkg", config->holdpkg);