[pacman-dev] libalpm: Give -U downloads a random .part name if needed

Message ID 20210903120640.462593-1-allan@archlinux.org
State New
Headers show
Series [pacman-dev] libalpm: Give -U downloads a random .part name if needed | expand

Commit Message

Allan McRae Sept. 3, 2021, 12:06 p.m. UTC
From: morganamilo <morganamilo@archlinux.org>

archweb's download links all ended in /download. This cause all the temp
files to be named download.part. With parallel downloads this results in
multiple downloads to go to the same temp file and breaks the transaction.

Assign random temporary filenames to downloads from URLs that are either
missing a filename, or if the filename does not contain at least three
hyphens (as a well formed package filename does).

While this approach to determining when to use a temporary filename is
not 100% foolproof, it does keep nice looking download progress bar names
when a proper package filename is given. The only downside of not using
temporary files when provided with a filename  with three or more hyphens
is URLs created specifically to bypass temporary filename usage can not
be downloaded in parallel. We probably do not want to download packages
from such URLs anyway.

Fixes FS#71464

Modified-by: Allan McRae (do not use temporary files for realish URLs)
Signed-off-by: Allan McRae <allan@archlinux.org>
---
 lib/libalpm/dload.c | 26 ++++++++++++++++++++++----
 lib/libalpm/dload.h |  1 +
 2 files changed, 23 insertions(+), 4 deletions(-)

Comments

Allan McRae Sept. 4, 2021, 12:33 a.m. UTC | #1
On 3/9/21 10:06 pm, Allan McRae wrote:
> From: morganamilo <morganamilo@archlinux.org>
> 
> archweb's download links all ended in /download. This cause all the temp
> files to be named download.part. With parallel downloads this results in
> multiple downloads to go to the same temp file and breaks the transaction.
> 
> Assign random temporary filenames to downloads from URLs that are either
> missing a filename, or if the filename does not contain at least three
> hyphens (as a well formed package filename does).
> 
> While this approach to determining when to use a temporary filename is
> not 100% foolproof, it does keep nice looking download progress bar names
> when a proper package filename is given. The only downside of not using
> temporary files when provided with a filename  with three or more hyphens
> is URLs created specifically to bypass temporary filename usage can not
> be downloaded in parallel. We probably do not want to download packages
> from such URLs anyway.
> 
> Fixes FS#71464
> 

<snip>

		STRDUP(payload->fileurl, url, FREE(payload); GOTO_ERR(handle,
ALPM_ERR_MEMORY, err));
> -			payload->allow_resume = 1;
> +
> +			c = strrchr(url, '/');
> +			while(*c) {
> +				if(*c == '-') {
> +					hyphen_count++;
> +				}
> +				c++;
> +			}
> +
> +			if(hyphen_count > 2) {


Changed all this to:

c = strrchr(url, '/');
if(strstr(c, ".pkg")) {
	/* we probably have a usable package filename to download to */

which is consistent with the hacky crap we do URL redirections...

Patch

diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c
index ca6be7b6..d2448349 100644
--- a/lib/libalpm/dload.c
+++ b/lib/libalpm/dload.c
@@ -767,7 +767,7 @@  static int curl_add_payload(alpm_handle_t *handle, CURLM *curlm,
 		GOTO_ERR(handle, ALPM_ERR_SERVER_BAD_URL, cleanup);
 	}
 
-	if(payload->remote_name && strlen(payload->remote_name) > 0) {
+	if(!payload->random_partfile && payload->remote_name && strlen(payload->remote_name) > 0) {
 		if(!payload->destfile_name) {
 			payload->destfile_name = get_fullpath(localpath, payload->remote_name, "");
 		}
@@ -776,8 +776,9 @@  static int curl_add_payload(alpm_handle_t *handle, CURLM *curlm,
 			goto cleanup;
 		}
 	} else {
-		/* URL doesn't contain a filename, so make a tempfile. We can't support
-		 * resuming this kind of download; partial transfers will be destroyed */
+		/* We want a random filename or the URL does not contain a filename, so download to a
+                 * temporary location. We can not support resuming this kind of download; any partial
+                 * transfers will be destroyed */
 		payload->unlink_on_fail = 1;
 
 		payload->localf = create_tempfile(payload, localpath);
@@ -986,11 +987,28 @@  int SYMEXPORT alpm_fetch_pkgurl(alpm_handle_t *handle, const alpm_list_t *urls,
 			alpm_list_append(fetched, filepath);
 		} else {
 			struct dload_payload *payload = NULL;
+			size_t hyphen_count = 0;
+			char *c;
 
 			ASSERT(url, GOTO_ERR(handle, ALPM_ERR_WRONG_ARGS, err));
 			CALLOC(payload, 1, sizeof(*payload), GOTO_ERR(handle, ALPM_ERR_MEMORY, err));
 			STRDUP(payload->fileurl, url, FREE(payload); GOTO_ERR(handle, ALPM_ERR_MEMORY, err));
-			payload->allow_resume = 1;
+
+			c = strrchr(url, '/');
+			while(*c) {
+				if(*c == '-') {
+					hyphen_count++;
+				}
+				c++;
+			}
+
+			if(hyphen_count > 2) {
+				/* we probably have a usable package filename to download to */
+				payload->allow_resume = 1;
+			} else {
+				payload->random_partfile = 1;
+			}
+
 			payload->handle = handle;
 			payload->trust_remote_name = 1;
 			payload->download_signature = (handle->siglevel & ALPM_SIG_PACKAGE);
diff --git a/lib/libalpm/dload.h b/lib/libalpm/dload.h
index 8f3d17b4..dfeb7902 100644
--- a/lib/libalpm/dload.h
+++ b/lib/libalpm/dload.h
@@ -44,6 +44,7 @@  struct dload_payload {
 	off_t prevprogress;
 	int force;
 	int allow_resume;
+	int random_partfile;
 	int errors_ok;
 	int unlink_on_fail;
 	int trust_remote_name;