Message ID | 20180215194946.20820-2-lukeshu@lukeshu.com |
---|---|
State | Rejected |
Headers | show |
Series | Better handling of PKGEXT | expand |
On Thu, Feb 15, 2018 at 02:49:45PM -0500, Luke Shumaker wrote: > From: Luke Shumaker <lukeshu@parabola.nu> > > This partially reverts commit b61a7148eaf546a31597b74d9dd8948e4a39dea1. > > In dbscripts, PKGEXT is a glob pattern--it needs to be "unquoted"; and > The magic-quoting done by `[[ ... ]]` breaks that. The closing-quote > coming before ${PKGEXT} was quite intentional. > --- > db-functions | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/db-functions b/db-functions > index e8949d7..93a5af3 100644 > --- a/db-functions > +++ b/db-functions > @@ -374,8 +374,8 @@ check_pkgrepos() { > local pkgver="$(getpkgver ${pkgfile})" || return 1 > local pkgarch="$(getpkgarch ${pkgfile})" || return 1 > > - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT} ]] && return 1 > - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT}.sig ]] && return 1 > + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} ] && return 1 > + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT}.sig ] && return 1 Rather than making this stand out like a sore thumb for the next person to trip over, why don't we just define a "file_exists" function? file_exists() { [[ -f $1 ]] } Now you're free to do this: file_exists "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} && return 1 Expansion is taken care of by the shell before you pass to exists, and the check does what you'd expect. > [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 > [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1 > > -- > 2.16.1
On 02/15/2018 03:11 PM, Dave Reisner wrote: > Rather than making this stand out like a sore thumb for the next person > to trip over, why don't we just define a "file_exists" function? > > file_exists() { > [[ -f $1 ]] > } > > Now you're free to do this: > > file_exists "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} && return 1 > > Expansion is taken care of by the shell before you pass to exists, and > the check does what you'd expect. That was my first inclination, my second was to stop allowing people to upload *.kz compressed packages for giggles. If we absolutely need to glob anything there, we should use bash extended globs to match @(g|z) and whatever else we actually want. This would work in [[ ]]
On Thu, Feb 15, 2018 at 03:14:19PM -0500, Eli Schwartz via arch-projects wrote: > On 02/15/2018 03:11 PM, Dave Reisner wrote: > > Rather than making this stand out like a sore thumb for the next person > > to trip over, why don't we just define a "file_exists" function? > > > > file_exists() { > > [[ -f $1 ]] > > } > > > > Now you're free to do this: > > > > file_exists "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} && return 1 > > > > Expansion is taken care of by the shell before you pass to exists, and > > the check does what you'd expect. > > That was my first inclination, my second was to stop allowing people to > upload *.kz compressed packages for giggles. > > If we absolutely need to glob anything there, we should use bash > extended globs to match @(g|z) and whatever else we actually want. This > would work in [[ ]] > Nope, changing the kind of glob doesn't work here. There's simply no glob expansion of any kind inside [[ -f ]] (or any other stat-like check).
On 02/15/2018 03:48 PM, Dave Reisner wrote: > Nope, changing the kind of glob doesn't work here. There's simply no > glob expansion of any kind inside [[ -f ]] (or any other stat-like > check). I was thinking maybe something like the way makepkg compares filenames to various extended globs, but actually that doesn't help since [[ = ]] will only expand the right side... But if we do globbing at all, I want to reliably enforce actual limits on what .pkg.tar.?z is meant to convey. ... find(1) doesn't support the FNM_EXTGLOB flag from fnmatch(3) and we use PKGEXT as a pattern a couple times for that. We could replace `find -name` altogether, using globstar.
On Thu, 15 Feb 2018 15:11:13 -0500, Dave Reisner wrote: > > On Thu, Feb 15, 2018 at 02:49:45PM -0500, Luke Shumaker wrote: > > - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT} ]] && return 1 > > - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT}.sig ]] && return 1 > > + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} ] && return 1 > > + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT}.sig ] && return 1 > > Rather than making this stand out like a sore thumb for the next person > to trip over, why don't we just define a "file_exists" function? > > file_exists() { > [[ -f $1 ]] > } > > Now you're free to do this: > > file_exists "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} && return 1 > > Expansion is taken care of by the shell before you pass to exists, and > the check does what you'd expect. That's not a bad idea. But then someone reading the code might wonder "why does such a trivial function exist?". I think it would be silly, and ultimately hurt readability to go through and replace all "[[ -f ... ]]" instances with "file_exists", and if we don't do that, then there's a weird magic question of "when should I use [[ -f ]] and when should I used file_exists?" Ultimately, this way doesn't hide anything, and has everything the next person needs to know right there. Sure, they'll have to be careful not to trip. The next patch I sent changes it to PKGEXT_glob, to make it stand out a little more. I'm also working on a `make lint`/shellcheck patchset that will add `# shellcheck disable=SC2086` directives to each line to avoid shellcheck complaining about PKGEXT_glob being unquoted, drawing even more attention to it, to avoid tripping.
On 02/15/2018 04:43 PM, Luke Shumaker wrote: > That's not a bad idea. But then someone reading the code might wonder > "why does such a trivial function exist?". I think it would be silly, > and ultimately hurt readability to go through and replace all > > "[[ -f ... ]]" instances with "file_exists", and if we don't do that, > then there's a weird magic question of "when should I use [[ -f ]] and > when should I used file_exists?" Why would it hurt readability? Why won't people be able to read the comments I wrote documenting the function in my working tree? > Ultimately, this way doesn't hide anything, and has everything the > next person needs to know right there. Sure, they'll have to be > careful not to trip. The next patch I sent changes it to PKGEXT_glob, > to make it stand out a little more. I'm also working on a `make > lint`/shellcheck patchset that will add `# shellcheck disable=SC2086` > directives to each line to avoid shellcheck complaining about > PKGEXT_glob being unquoted, drawing even more attention to it, to > avoid tripping. I want to make dbscripts more readable, not less. Just reverting every new thing when *both* are broken, is not something I want to do. I'd like to just remove this time bomb properly.
On Thu, 15 Feb 2018 16:57:26 -0500, Eli Schwartz wrote: > > On 02/15/2018 04:43 PM, Luke Shumaker wrote: > > That's not a bad idea. But then someone reading the code might wonder > > "why does such a trivial function exist?". I think it would be silly, > > and ultimately hurt readability to go through and replace all > > > > "[[ -f ... ]]" instances with "file_exists", and if we don't do that, > > then there's a weird magic question of "when should I use [[ -f ]] and > > when should I used file_exists?" > > Why would it hurt readability? > > Why won't people be able to read the comments I wrote documenting the > function in my working tree? Every Bash programmer knows what [[ -f ]] does. Sure - "file_exists" is self-explanatory, and - a comment there might better explain why it exists instead of using [[ -f ]] But then it's another silly little thing that they have to keep in mind everywhere in the codebase. What about just adding a comment: # We can't use [[ ]] for these 2 checks because glob expansion # needs to be performed on $PKGEXT. It avoids someone tripping over it in the future, and avoids adding another unnecessary abstraction to the codebase. > > Ultimately, this way doesn't hide anything, and has everything the > > next person needs to know right there. Sure, they'll have to be > > careful not to trip. The next patch I sent changes it to PKGEXT_glob, > > to make it stand out a little more. I'm also working on a `make > > lint`/shellcheck patchset that will add `# shellcheck disable=SC2086` > > directives to each line to avoid shellcheck complaining about > > PKGEXT_glob being unquoted, drawing even more attention to it, to > > avoid tripping. > > I want to make dbscripts more readable, not less. Just reverting every > new thing when *both* are broken, is not something I want to do. Huh? My version isn't broken. Unless you mean that the glob is more restrictive than the one used by makepkg?
On 02/15/2018 05:17 PM, Luke Shumaker wrote: > Huh? My version isn't broken. Unless you mean that the glob is more > restrictive than the one used by makepkg? As you saw my other message, this should be answered already, but consider this additional perspective on globs: The glob is both *less* restrictive and more restrictive, it accepts any valid unicode character. To be more exact, it's almost completely orthogonal to the one in makepkg. makepkg only accepts .tar.gz, .tar.bz2, .tar.xz, .tar.lzo, .tar.lrz, and .tar.Z and most of those fail to match against a two-char compression type. dbscripts accepts .pkg.tar.💩z which incidentally is what I think of cherry-picking xz and gz as supported methods. AFAIK there should not be any non-xz packages in the official repos, I can verify that there are currently none, and I'm not sure why we should support it anyway (but if we do, we should do it properly).
diff --git a/db-functions b/db-functions index e8949d7..93a5af3 100644 --- a/db-functions +++ b/db-functions @@ -374,8 +374,8 @@ check_pkgrepos() { local pkgver="$(getpkgver ${pkgfile})" || return 1 local pkgarch="$(getpkgarch ${pkgfile})" || return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT} ]] && return 1 - [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}${PKGEXT}.sig ]] && return 1 + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT} ] && return 1 + [ -f "${FTP_BASE}/${PKGPOOL}/${pkgname}-${pkgver}-${pkgarch}"${PKGEXT}.sig ] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/} ]] && return 1 [[ -f ${FTP_BASE}/${PKGPOOL}/${pkgfile##*/}.sig ]] && return 1
From: Luke Shumaker <lukeshu@parabola.nu> This partially reverts commit b61a7148eaf546a31597b74d9dd8948e4a39dea1. In dbscripts, PKGEXT is a glob pattern--it needs to be "unquoted"; and The magic-quoting done by `[[ ... ]]` breaks that. The closing-quote coming before ${PKGEXT} was quite intentional. --- db-functions | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)