Message ID | 20210812035914.39053-1-softwaresale01@gmail.com |
---|---|
State | Accepted, archived |
Headers | show |
Series | [pacman-dev] Order downloads by descending max_size | expand |
On Wed, Aug 11, 2021 at 11:59 PM Charlie Sale <softwaresale01@gmail.com> wrote: > > When downloading in parallel, sort by package size so that the larger > packages are queued first to fully leverage parallelism. > Addresses FS#70172 > > Signed-off-by: Charlie Sale <softwaresale01@gmail.com> > --- > lib/libalpm/dload.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c > index ca6be7b6..58d2122f 100644 > --- a/lib/libalpm/dload.c > +++ b/lib/libalpm/dload.c > @@ -825,6 +825,19 @@ cleanup: > return ret; > } > > +/* > + * Use to sort payloads by max size in decending order (largest -> smallest) > + */ > +static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr) > +{ > + struct dload_payload *left, *right; > + > + left = (struct dload_payload *) left_ptr; > + right = (struct dload_payload *) right_ptr; > + > + return right->max_size - left->max_size; > +} > + > /* Returns -1 if an error happened for a required file > * Returns 0 if a payload was actually downloaded > * Returns 1 if no files were downloaded and all errors were non-fatal > @@ -838,6 +851,10 @@ static int curl_download_internal(alpm_handle_t *handle, > int max_streams = handle->parallel_downloads; > int updated = 0; /* was a file actually updated */ > CURLM *curlm = handle->curlm; > + size_t payloads_size = alpm_list_count(payloads); > + > + /* Sort payloads by package size */ > + payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes); > > while(active_downloads_num > 0 || payloads) { > CURLMcode mc; > -- > 2.32.0 > I'll also add that this is my first contribution to pacman. Please inform me of anything I did incorrectly regarding the patch process. Thank you in advance for your patience :) Cheers! ~Charlie
On 12/08/2021 05:12, Charlie Sale wrote: > On Wed, Aug 11, 2021 at 11:59 PM Charlie Sale <softwaresale01@gmail.com> wrote: >> >> When downloading in parallel, sort by package size so that the larger >> packages are queued first to fully leverage parallelism. >> Addresses FS#70172 >> >> Signed-off-by: Charlie Sale <softwaresale01@gmail.com> >> --- >> lib/libalpm/dload.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c >> index ca6be7b6..58d2122f 100644 >> --- a/lib/libalpm/dload.c >> +++ b/lib/libalpm/dload.c >> @@ -825,6 +825,19 @@ cleanup: >> return ret; >> } >> >> +/* >> + * Use to sort payloads by max size in decending order (largest -> smallest) >> + */ >> +static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr) >> +{ >> + struct dload_payload *left, *right; >> + >> + left = (struct dload_payload *) left_ptr; >> + right = (struct dload_payload *) right_ptr; >> + >> + return right->max_size - left->max_size; >> +} >> + >> /* Returns -1 if an error happened for a required file >> * Returns 0 if a payload was actually downloaded >> * Returns 1 if no files were downloaded and all errors were non-fatal >> @@ -838,6 +851,10 @@ static int curl_download_internal(alpm_handle_t *handle, >> int max_streams = handle->parallel_downloads; >> int updated = 0; /* was a file actually updated */ >> CURLM *curlm = handle->curlm; >> + size_t payloads_size = alpm_list_count(payloads); >> + >> + /* Sort payloads by package size */ >> + payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes); >> >> while(active_downloads_num > 0 || payloads) { >> CURLMcode mc; >> -- >> 2.32.0 >> > > I'll also add that this is my first contribution to pacman. Please > inform me of anything I did incorrectly regarding the patch process. > Thank you in advance for your patience :) > > Cheers! > ~Charlie > In my totally untested theory this would slow things down. The ideal situation is to have 1 large download running to soak up bandwidth and then many small downloads running so they can all be set up and handshake. Have you got any benchmarks to comfirm/deny the ones in the bug report?
On 12/8/21 3:38 pm, Morgan Adamiec wrote: > > > On 12/08/2021 05:12, Charlie Sale wrote: >> On Wed, Aug 11, 2021 at 11:59 PM Charlie Sale <softwaresale01@gmail.com> wrote: >>> >>> When downloading in parallel, sort by package size so that the larger >>> packages are queued first to fully leverage parallelism. >>> Addresses FS#70172 >>> >>> Signed-off-by: Charlie Sale <softwaresale01@gmail.com> >>> --- >>> lib/libalpm/dload.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c >>> index ca6be7b6..58d2122f 100644 >>> --- a/lib/libalpm/dload.c >>> +++ b/lib/libalpm/dload.c >>> @@ -825,6 +825,19 @@ cleanup: >>> return ret; >>> } >>> >>> +/* >>> + * Use to sort payloads by max size in decending order (largest -> smallest) >>> + */ >>> +static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr) >>> +{ >>> + struct dload_payload *left, *right; >>> + >>> + left = (struct dload_payload *) left_ptr; >>> + right = (struct dload_payload *) right_ptr; >>> + >>> + return right->max_size - left->max_size; >>> +} >>> + >>> /* Returns -1 if an error happened for a required file >>> * Returns 0 if a payload was actually downloaded >>> * Returns 1 if no files were downloaded and all errors were non-fatal >>> @@ -838,6 +851,10 @@ static int curl_download_internal(alpm_handle_t *handle, >>> int max_streams = handle->parallel_downloads; >>> int updated = 0; /* was a file actually updated */ >>> CURLM *curlm = handle->curlm; >>> + size_t payloads_size = alpm_list_count(payloads); >>> + >>> + /* Sort payloads by package size */ >>> + payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes); >>> >>> while(active_downloads_num > 0 || payloads) { >>> CURLMcode mc; >>> -- >>> 2.32.0 >>> >> >> I'll also add that this is my first contribution to pacman. Please >> inform me of anything I did incorrectly regarding the patch process. >> Thank you in advance for your patience :) >> >> Cheers! >> ~Charlie >> > > In my totally untested theory this would slow things down. The ideal > situation is to have 1 large download running to soak up bandwidth and > then many small downloads running so they can all be set up and handshake. > > Have you got any benchmarks to comfirm/deny the ones in the bug report? I have done rough numbers... Compared to random order, there is a net gain. Not a big gain over the range of parameters I tried, but a gain. This sort is unlikely to optimal, but optimality would be much harder to achieve. Allan
diff --git a/lib/libalpm/dload.c b/lib/libalpm/dload.c index ca6be7b6..58d2122f 100644 --- a/lib/libalpm/dload.c +++ b/lib/libalpm/dload.c @@ -825,6 +825,19 @@ cleanup: return ret; } +/* + * Use to sort payloads by max size in decending order (largest -> smallest) + */ +static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr) +{ + struct dload_payload *left, *right; + + left = (struct dload_payload *) left_ptr; + right = (struct dload_payload *) right_ptr; + + return right->max_size - left->max_size; +} + /* Returns -1 if an error happened for a required file * Returns 0 if a payload was actually downloaded * Returns 1 if no files were downloaded and all errors were non-fatal @@ -838,6 +851,10 @@ static int curl_download_internal(alpm_handle_t *handle, int max_streams = handle->parallel_downloads; int updated = 0; /* was a file actually updated */ CURLM *curlm = handle->curlm; + size_t payloads_size = alpm_list_count(payloads); + + /* Sort payloads by package size */ + payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes); while(active_downloads_num > 0 || payloads) { CURLMcode mc;
When downloading in parallel, sort by package size so that the larger packages are queued first to fully leverage parallelism. Addresses FS#70172 Signed-off-by: Charlie Sale <softwaresale01@gmail.com> --- lib/libalpm/dload.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)