Message ID | 20191126212957.29031-1-e5ten.arch@gmail.com |
---|---|
State | Rejected, archived |
Headers | show |
Series | [pacman-dev,1/3] libmakepkg: use extraction commands instead of file to find archive type | expand |
On 11/26/19 4:29 PM, Ethan Sommer wrote: > Previously, to determine which command we should use to extract an > archive, we would run file and match the output against our list of > possible extraction commands > > Instead, run the archive through each extraction command's -t (--test) > flag, if this succeeds then we know that the command is able to extract > the file and is the one to use Missing rationale why we care. > scripts/libmakepkg/source/file.sh.in | 39 ++++++++-------------------- > 1 file changed, 11 insertions(+), 28 deletions(-) > > diff --git a/scripts/libmakepkg/source/file.sh.in b/scripts/libmakepkg/source/file.sh.in > index 7297a1c6..faace79b 100644 > --- a/scripts/libmakepkg/source/file.sh.in > +++ b/scripts/libmakepkg/source/file.sh.in > @@ -96,35 +96,18 @@ extract_file() { > fi > > # do not rely on extension for file type > - local file_type=$(@FILECMD@ -bizL -- "$file") > - local ext=${file##*.} > local cmd='' > - case "$file_type" in > - *application/x-tar*|*application/zip*|*application/x-zip*|*application/x-cpio*) > - cmd="bsdtar" ;; > - *application/x-gzip*|*application/gzip*) > - case "$ext" in > - gz|z|Z) cmd="gzip" ;; > - *) return;; > - esac ;; > - *application/x-bzip*) > - case "$ext" in > - bz2|bz) cmd="bzip2" ;; > - *) return;; > - esac ;; > - *application/x-xz*) > - case "$ext" in > - xz) cmd="xz" ;; > - *) return;; > - esac ;; > - *) > - # See if bsdtar can recognize the file > - if bsdtar -tf "$file" -q '*' &>/dev/null; then > - cmd="bsdtar" > - else > - return 0 > - fi ;; > - esac > + if bsdtar -tf "$file" -q '*'; then > + cmd='bsdtar' > + elif gzip -t "$file"; then > + cmd='gzip' > + elif bzip2 -t "$file"; then > + cmd='bzip2' > + elif xz -t "$file"; then > + cmd='xz' > + else > + return 0 > + fi &>/dev/null I had to look at this three times before I realized you were redirecting both stdout and stderr of the entire if/elif block. Because these commands check to see if the file is "uncorrupted", not to see if they are "formatted in this compression format", I imagine this would falsely fail files if they have some class of recoverable error, but I'm not sure if we care. What we do care about, I think, is that you're silencing stderr due to the fact that it will noisily print diagnostic messages when the file is not compressed or is in a different compression format, and bsdtar will print listed filenames to stdout. So now, if these commands fail for *any* reason, we treat this as a successful analysis that the file is in a different format. One reason for it failing might be that xz is not installed... this is hardly a successful analysis, and printing an "error: command not found" message during the decompression, then dying with "Failed to extract %s" is a desirable feature. > local ret=0 > msg2 "$(gettext "Extracting %s with %s")" "$file" "$cmd" >
On 27/11/19 7:29 am, Ethan Sommer wrote: > Previously, to determine which command we should use to extract an > archive, we would run file and match the output against our list of > possible extraction commands > > Instead, run the archive through each extraction command's -t (--test) > flag, if this succeeds then we know that the command is able to extract > the file and is the one to use > Apart from the issues Eli brought up, there is a large performance regression on large source files. $ time file -bizL libreoffice-6.3.4.1.tar.xz application/x-tar; charset=binary compressed-encoding=application/x-xz; charset=binary real 0m0.034s user 0m0.031s sys 0m0.004s $ time xz -t libreoffice-6.3.4.1.tar.xz real 0m11.970s user 0m11.898s sys 0m0.057s Allan
diff --git a/scripts/libmakepkg/source/file.sh.in b/scripts/libmakepkg/source/file.sh.in index 7297a1c6..faace79b 100644 --- a/scripts/libmakepkg/source/file.sh.in +++ b/scripts/libmakepkg/source/file.sh.in @@ -96,35 +96,18 @@ extract_file() { fi # do not rely on extension for file type - local file_type=$(@FILECMD@ -bizL -- "$file") - local ext=${file##*.} local cmd='' - case "$file_type" in - *application/x-tar*|*application/zip*|*application/x-zip*|*application/x-cpio*) - cmd="bsdtar" ;; - *application/x-gzip*|*application/gzip*) - case "$ext" in - gz|z|Z) cmd="gzip" ;; - *) return;; - esac ;; - *application/x-bzip*) - case "$ext" in - bz2|bz) cmd="bzip2" ;; - *) return;; - esac ;; - *application/x-xz*) - case "$ext" in - xz) cmd="xz" ;; - *) return;; - esac ;; - *) - # See if bsdtar can recognize the file - if bsdtar -tf "$file" -q '*' &>/dev/null; then - cmd="bsdtar" - else - return 0 - fi ;; - esac + if bsdtar -tf "$file" -q '*'; then + cmd='bsdtar' + elif gzip -t "$file"; then + cmd='gzip' + elif bzip2 -t "$file"; then + cmd='bzip2' + elif xz -t "$file"; then + cmd='xz' + else + return 0 + fi &>/dev/null local ret=0 msg2 "$(gettext "Extracting %s with %s")" "$file" "$cmd"
Previously, to determine which command we should use to extract an archive, we would run file and match the output against our list of possible extraction commands Instead, run the archive through each extraction command's -t (--test) flag, if this succeeds then we know that the command is able to extract the file and is the one to use Signed-off-by: Ethan Sommer <e5ten.arch@gmail.com> --- scripts/libmakepkg/source/file.sh.in | 39 ++++++++-------------------- 1 file changed, 11 insertions(+), 28 deletions(-)