Message ID | 20190108234037.20059-3-eschwartz@archlinux.org |
---|---|
State | Superseded, archived |
Headers | show |
Series | Replacing | expand |
On Tue, Jan 08, 2019 at 06:40:37PM -0500, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote: > diff --git a/db-archive b/db-archive > new file mode 100755 > index 00000000..5680b9de > --- /dev/null > +++ b/db-archive > @@ -0,0 +1,21 @@ > +#!/bin/bash > + > +. "$(dirname "$(readlink -e "$0")")/config" This uses $0 (see below). > + > +if (( $# != 1 )); then > + echo "usage: %s <pkgfile>" "${0##*/}" > + exit 1 > +fi > + > +if [[ -n ${ARCHIVEUSER} ]]; then > + exec sudo -u "${ARCHIVEUSER}" bash "${BASH_SOURCE[0]}" "${@}" This uses $BASH_SOURCE instead of $0 as used above. Is this intentional, if so why? I'd argue that this should also use $0, but maybe I'm missing something? > +fi > + > +pkgfile=${1##*/} > +pkgname=${pkgfile%-*-*-*} > +archive_dir="${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" > + > +if [[ ! -f ${archive_dir}/${pkgfile} ]]; then > + mkdir -p "${archive_dir}" > + cp -np "${1}"{,.sig} "${archive_dir}/" > +fi > diff --git a/db-functions b/db-functions > index 7aeedced..b8a00b90 100644 > --- a/db-functions > +++ b/db-functions > @@ -444,4 +447,24 @@ arch_repo_modify() { > REPO_MODIFIED=1 > } > > +# Verify the existence of dependent packages needed by a given pkgfile > +# usage: check_reproducible pkgfile > +check_reproducible() { > + local pkg dir pkgs=() pkgfile pkgfiles=() > + > + mapfile -t pkgs < <(_grep_all_info "${1}" .BUILDINFO installed) > + > + for pkg in "${pkgs[@]}"; do > + local pkgname=${pkg%-*-*-*} > + for dir in "${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" "${STAGING}"/**/; do > + if pkgfile="$(getpkgfile "${dir}/${pkg}"${PKGEXTS} 2>/dev/null)"; then > + pkgfiles+=("${pkgfile}") > + continue 2 > + fi > + done > + error "could not find existing package for %s" "${pkg}" I imagine that I'd be confused if I ever saw this error. How about clarifying it like this? "could not find package for dependency %s in reproducibility archive or your staging directory" > + return 1 > + done > +} > + > . "$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")/db-functions-${VCS}" > diff --git a/db-update b/db-update > index 313fb999..04a29bf3 100755 > --- a/db-update > +++ b/db-update > @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do > if ! check_builddir "${pkg}"; then > die "Package %s was not built in a chroot" "$repo/${pkg##*/}" > fi > + if ! check_reproducible "${pkg}"; then > + die "Package %s is not reproducible" "${pkg}" Same as above. I'd suggest something like this: "Package %s depends on packages that are missing in the reproducibility archive and your staging directory. Ensure that all dependencies either exist in the repositories or reproducibility archive already or that they are added together with the package in a single call to db-update." Florian
On 1/9/19 8:00 AM, Florian Pritz wrote: > On Tue, Jan 08, 2019 at 06:40:37PM -0500, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote: >> diff --git a/db-archive b/db-archive >> new file mode 100755 >> index 00000000..5680b9de >> --- /dev/null >> +++ b/db-archive >> @@ -0,0 +1,21 @@ >> +#!/bin/bash >> + >> +. "$(dirname "$(readlink -e "$0")")/config" > > This uses $0 (see below). > >> + >> +if (( $# != 1 )); then >> + echo "usage: %s <pkgfile>" "${0##*/}" >> + exit 1 >> +fi >> + >> +if [[ -n ${ARCHIVEUSER} ]]; then >> + exec sudo -u "${ARCHIVEUSER}" bash "${BASH_SOURCE[0]}" "${@}" > > This uses $BASH_SOURCE instead of $0 as used above. Is this intentional, > if so why? I'd argue that this should also use $0, but maybe I'm missing > something? BASH_SOURCE explicitly refers to the file it was sourced/executed from, it's like the __file__ macro other languages have. $0 can be anything since programs can modify their argv0 freely, but by default it is the toplevel script name, which is why it gets used so much. It's still not technically correct, and I prefer to use BASH_SOURCE. In short, this is a copy-paste error above. >> +fi >> + >> +pkgfile=${1##*/} >> +pkgname=${pkgfile%-*-*-*} >> +archive_dir="${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" >> + >> +if [[ ! -f ${archive_dir}/${pkgfile} ]]; then >> + mkdir -p "${archive_dir}" >> + cp -np "${1}"{,.sig} "${archive_dir}/" >> +fi >> diff --git a/db-functions b/db-functions >> index 7aeedced..b8a00b90 100644 >> --- a/db-functions >> +++ b/db-functions >> @@ -444,4 +447,24 @@ arch_repo_modify() { >> REPO_MODIFIED=1 >> } >> >> +# Verify the existence of dependent packages needed by a given pkgfile >> +# usage: check_reproducible pkgfile >> +check_reproducible() { >> + local pkg dir pkgs=() pkgfile pkgfiles=() >> + >> + mapfile -t pkgs < <(_grep_all_info "${1}" .BUILDINFO installed) >> + >> + for pkg in "${pkgs[@]}"; do >> + local pkgname=${pkg%-*-*-*} >> + for dir in "${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" "${STAGING}"/**/; do >> + if pkgfile="$(getpkgfile "${dir}/${pkg}"${PKGEXTS} 2>/dev/null)"; then >> + pkgfiles+=("${pkgfile}") >> + continue 2 >> + fi >> + done >> + error "could not find existing package for %s" "${pkg}" > > > I imagine that I'd be confused if I ever saw this error. How about > clarifying it like this? "could not find package for dependency %s in > reproducibility archive or your staging directory" Maybe "existing or staged package for dependency %s"? >> + return 1 >> + done >> +} >> + >> . "$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")/db-functions-${VCS}" >> diff --git a/db-update b/db-update >> index 313fb999..04a29bf3 100755 >> --- a/db-update >> +++ b/db-update >> @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do >> if ! check_builddir "${pkg}"; then >> die "Package %s was not built in a chroot" "$repo/${pkg##*/}" >> fi >> + if ! check_reproducible "${pkg}"; then >> + die "Package %s is not reproducible" "${pkg}" > > Same as above. I'd suggest something like this: > > "Package %s depends on packages that are missing in the reproducibility > archive and your staging directory. Ensure that all dependencies either > exist in the repositories or reproducibility archive already or that > they are added together with the package in a single call to db-update." The two errors will only be called together. I think expanding the message when printing the missing dependency should be enough.
On Wed, Jan 09, 2019 at 09:49:26AM -0500, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote: > >> diff --git a/db-functions b/db-functions > >> index 7aeedced..b8a00b90 100644 > >> --- a/db-functions > >> +++ b/db-functions > >> @@ -444,4 +447,24 @@ arch_repo_modify() { > >> REPO_MODIFIED=1 > >> } > >> > >> +# Verify the existence of dependent packages needed by a given pkgfile > >> +# usage: check_reproducible pkgfile > >> +check_reproducible() { > >> + local pkg dir pkgs=() pkgfile pkgfiles=() > >> + > >> + mapfile -t pkgs < <(_grep_all_info "${1}" .BUILDINFO installed) > >> + > >> + for pkg in "${pkgs[@]}"; do > >> + local pkgname=${pkg%-*-*-*} > >> + for dir in "${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" "${STAGING}"/**/; do > >> + if pkgfile="$(getpkgfile "${dir}/${pkg}"${PKGEXTS} 2>/dev/null)"; then > >> + pkgfiles+=("${pkgfile}") > >> + continue 2 > >> + fi > >> + done > >> + error "could not find existing package for %s" "${pkg}" > > > > > > I imagine that I'd be confused if I ever saw this error. How about > > clarifying it like this? "could not find package for dependency %s in > > reproducibility archive or your staging directory" > > Maybe "existing or staged package for dependency %s"? "could not find existing or staged package for dependency %s" is fine by me. > > >> + return 1 > >> + done > >> +} > >> + > >> . "$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")/db-functions-${VCS}" > >> diff --git a/db-update b/db-update > >> index 313fb999..04a29bf3 100755 > >> --- a/db-update > >> +++ b/db-update > >> @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do > >> if ! check_builddir "${pkg}"; then > >> die "Package %s was not built in a chroot" "$repo/${pkg##*/}" > >> fi > >> + if ! check_reproducible "${pkg}"; then > >> + die "Package %s is not reproducible" "${pkg}" > > > > Same as above. I'd suggest something like this: > > > > "Package %s depends on packages that are missing in the reproducibility > > archive and your staging directory. Ensure that all dependencies either > > exist in the repositories or reproducibility archive already or that > > they are added together with the package in a single call to db-update." > > The two errors will only be called together. I think expanding the > message when printing the missing dependency should be enough. I get that, but I think that a user that sees these two message may not understand that a missing dependency is related to the package being reproducible. To be honest, I actually expected db-update to run all checks and show all errors at once instead of terminating after the first one. I now know that this is not the case. I'm pretty sure that I would have treat these two messages as separate errors and that I'd then be confused as to what the "second error" is actually about. I think that it may save a lot of time and confusion if this error message is clear about what is wrong and how it can be fixed. Most of the other error messages in db-update are rather clear about the actual problem. Maybe not as clear as the message I propose here, but clearer than "Package %s is not reproducible". Apart from that, does it really hurt to have a more verbose error message? It will only be shown if there is an actual error and it doesn't influence normal usage. I'd say we can afford to be more verbose in that case. If you still think that this message should not be made more verbose, I'd argue that it should be removed entirely. If we have just the message about a dependency not being found, it is quite clear to a user what is wrong and how they could fix the error. I'd say that is much less confusing than if there were a second message about reproducibility that some people may or may not consider to be a different, additional error as I've explained above. Florian
On 1/9/19 10:34 AM, Florian Pritz wrote: > On Wed, Jan 09, 2019 at 09:49:26AM -0500, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote: >>>> diff --git a/db-functions b/db-functions >>>> index 7aeedced..b8a00b90 100644 >>>> --- a/db-functions >>>> +++ b/db-functions >>>> @@ -444,4 +447,24 @@ arch_repo_modify() { >>>> REPO_MODIFIED=1 >>>> } >>>> >>>> +# Verify the existence of dependent packages needed by a given pkgfile >>>> +# usage: check_reproducible pkgfile >>>> +check_reproducible() { >>>> + local pkg dir pkgs=() pkgfile pkgfiles=() >>>> + >>>> + mapfile -t pkgs < <(_grep_all_info "${1}" .BUILDINFO installed) >>>> + >>>> + for pkg in "${pkgs[@]}"; do >>>> + local pkgname=${pkg%-*-*-*} >>>> + for dir in "${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" "${STAGING}"/**/; do >>>> + if pkgfile="$(getpkgfile "${dir}/${pkg}"${PKGEXTS} 2>/dev/null)"; then >>>> + pkgfiles+=("${pkgfile}") >>>> + continue 2 >>>> + fi >>>> + done >>>> + error "could not find existing package for %s" "${pkg}" >>> >>> >>> I imagine that I'd be confused if I ever saw this error. How about >>> clarifying it like this? "could not find package for dependency %s in >>> reproducibility archive or your staging directory" >> >> Maybe "existing or staged package for dependency %s"? > > "could not find existing or staged package for dependency %s" is fine by > me. > >> >>>> + return 1 >>>> + done >>>> +} >>>> + >>>> . "$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")/db-functions-${VCS}" >>>> diff --git a/db-update b/db-update >>>> index 313fb999..04a29bf3 100755 >>>> --- a/db-update >>>> +++ b/db-update >>>> @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do >>>> if ! check_builddir "${pkg}"; then >>>> die "Package %s was not built in a chroot" "$repo/${pkg##*/}" >>>> fi >>>> + if ! check_reproducible "${pkg}"; then >>>> + die "Package %s is not reproducible" "${pkg}" >>> >>> Same as above. I'd suggest something like this: >>> >>> "Package %s depends on packages that are missing in the reproducibility >>> archive and your staging directory. Ensure that all dependencies either >>> exist in the repositories or reproducibility archive already or that >>> they are added together with the package in a single call to db-update." >> >> The two errors will only be called together. I think expanding the >> message when printing the missing dependency should be enough. > > I get that, but I think that a user that sees these two message may not > understand that a missing dependency is related to the package being > reproducible. To be honest, I actually expected db-update to run all > checks and show all errors at once instead of terminating after the > first one. I now know that this is not the case. I'm pretty sure that I > would have treat these two messages as separate errors and that I'd then > be confused as to what the "second error" is actually about. > > I think that it may save a lot of time and confusion if this error > message is clear about what is wrong and how it can be fixed. Most of > the other error messages in db-update are rather clear about the actual > problem. Maybe not as clear as the message I propose here, but clearer > than "Package %s is not reproducible". Apart from that, does it really > hurt to have a more verbose error message? It will only be shown if > there is an actual error and it doesn't influence normal usage. I'd say > we can afford to be more verbose in that case. > > If you still think that this message should not be made more verbose, > I'd argue that it should be removed entirely. If we have just the > message about a dependency not being found, it is quite clear to a user > what is wrong and how they could fix the error. I'd say that is much > less confusing than if there were a second message about reproducibility > that some people may or may not consider to be a different, additional > error as I've explained above. Well, db-update check doesn't necessarily know what check_reproducible() cannot find, I guess we could exit directly in there, but... Hmm, what about: Package %s is not reproducible. Ensure that all dependencies are available in the repositories or are added in the same db-update.
On Wed, Jan 09, 2019 at 10:45:12AM -0500, Eli Schwartz via arch-projects <arch-projects@archlinux.org> wrote: > >>>> diff --git a/db-update b/db-update > >>>> index 313fb999..04a29bf3 100755 > >>>> --- a/db-update > >>>> +++ b/db-update > >>>> @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do > >>>> if ! check_builddir "${pkg}"; then > >>>> die "Package %s was not built in a chroot" "$repo/${pkg##*/}" > >>>> fi > >>>> + if ! check_reproducible "${pkg}"; then > >>>> + die "Package %s is not reproducible" "${pkg}" > >>> > >>> Same as above. I'd suggest something like this: > >>> > >>> "Package %s depends on packages that are missing in the reproducibility > >>> archive and your staging directory. Ensure that all dependencies either > >>> exist in the repositories or reproducibility archive already or that > >>> they are added together with the package in a single call to db-update." > > Hmm, what about: > > Package %s is not reproducible. Ensure that all dependencies are > available in the repositories or are added in the same db-update. Fine by me. Florian
diff --git a/config b/config index 1cfe11f4..57a2cc47 100644 --- a/config +++ b/config @@ -1,6 +1,8 @@ #!/hint/bash FTP_BASE="/srv/ftp" +ARCHIVE_BASE="/srv/archive" +ARCHIVEUSER='archive' PKGREPOS=() PKGPOOL='' SRCPOOL='' diff --git a/db-archive b/db-archive new file mode 100755 index 00000000..5680b9de --- /dev/null +++ b/db-archive @@ -0,0 +1,21 @@ +#!/bin/bash + +. "$(dirname "$(readlink -e "$0")")/config" + +if (( $# != 1 )); then + echo "usage: %s <pkgfile>" "${0##*/}" + exit 1 +fi + +if [[ -n ${ARCHIVEUSER} ]]; then + exec sudo -u "${ARCHIVEUSER}" bash "${BASH_SOURCE[0]}" "${@}" +fi + +pkgfile=${1##*/} +pkgname=${pkgfile%-*-*-*} +archive_dir="${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" + +if [[ ! -f ${archive_dir}/${pkgfile} ]]; then + mkdir -p "${archive_dir}" + cp -np "${1}"{,.sig} "${archive_dir}/" +fi diff --git a/db-functions b/db-functions index 7aeedced..b8a00b90 100644 --- a/db-functions +++ b/db-functions @@ -165,20 +165,23 @@ repo_unlock () { #repo_unlock <repo-name> <arch> fi } +# usage: _grep_all_info pkgfile infofile key +_grep_all_info() { + local _ret=() + + mapfile -t _ret < <(/usr/bin/bsdtar -xOqf "$1" "${2}" | grep "^${3} = ") + + printf '%s\n' "${_ret[@]#${3} = }" +} + # usage: _grep_pkginfo pkgfile pattern _grep_pkginfo() { - local _ret - - _ret="$(/usr/bin/bsdtar -xOqf "$1" .PKGINFO | grep "^${2} = " | tail -1)" - echo "${_ret#${2} = }" + _grep_all_info "${1}" .PKGINFO "${2}" | tail -1 } # usage: _grep_buildinfo pkgfile pattern _grep_buildinfo() { - local _ret - - _ret="$(/usr/bin/bsdtar -xOqf "$1" .BUILDINFO | grep "^${2} = " | tail -1)" - echo "${_ret#${2} = }" + _grep_all_info "${1}" .BUILDINFO "${2}" | tail -1 } # Get the package base or name as fallback @@ -444,4 +447,24 @@ arch_repo_modify() { REPO_MODIFIED=1 } +# Verify the existence of dependent packages needed by a given pkgfile +# usage: check_reproducible pkgfile +check_reproducible() { + local pkg dir pkgs=() pkgfile pkgfiles=() + + mapfile -t pkgs < <(_grep_all_info "${1}" .BUILDINFO installed) + + for pkg in "${pkgs[@]}"; do + local pkgname=${pkg%-*-*-*} + for dir in "${ARCHIVE_BASE}/packages/${pkgname:0:1}/${pkgname}" "${STAGING}"/**/; do + if pkgfile="$(getpkgfile "${dir}/${pkg}"${PKGEXTS} 2>/dev/null)"; then + pkgfiles+=("${pkgfile}") + continue 2 + fi + done + error "could not find existing package for %s" "${pkg}" + return 1 + done +} + . "$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")/db-functions-${VCS}" diff --git a/db-update b/db-update index 313fb999..04a29bf3 100755 --- a/db-update +++ b/db-update @@ -61,6 +61,9 @@ for repo in "${repos[@]}"; do if ! check_builddir "${pkg}"; then die "Package %s was not built in a chroot" "$repo/${pkg##*/}" fi + if ! check_reproducible "${pkg}"; then + die "Package %s is not reproducible" "${pkg}" + fi done if ! check_splitpkgs "${repo}" "${pkgs[@]}"; then die "Missing split packages for %s" "$repo" @@ -82,6 +85,7 @@ for repo in "${repos[@]}"; do # any packages might have been moved by the previous run if [[ -f ${pkg} ]]; then mv "${pkg}" "$FTP_BASE/${PKGPOOL}" + "$(dirname "$(readlink -e "$0")")/db-archive" "${FTP_BASE}/${PKGPOOL}/${pkg##*/}" fi ln -s "../../../${PKGPOOL}/${pkgfile}" "$FTP_BASE/$repo/os/${pkgarch}" # also move signatures diff --git a/test/cases/db-update.bats b/test/cases/db-update.bats index 9ee06321..bc978302 100644 --- a/test/cases/db-update.bats +++ b/test/cases/db-update.bats @@ -87,6 +87,12 @@ load ../lib/common checkPackage testing pkg-any-a 1-2 } +@test "archive package when releasing" { + releasePackage extra pkg-any-a + db-update + [[ -f ${ARCHIVE_BASE}/packages/p/pkg-any-a/pkg-any-a-1-1-any${PKGEXT} ]] +} + @test "update same any package to same repository fails" { releasePackage extra pkg-any-a db-update
Whenever adding new package files to the pool of distributed packages, copy the file into a longterm archive. This is the first step to merging the functionality of archivetools, as this implements the shared pool while also guaranteeing that all packages are archived at the time of entry rather than once per day if they still exist. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> --- config | 2 ++ db-archive | 21 +++++++++++++++++++++ db-functions | 39 +++++++++++++++++++++++++++++++-------- db-update | 4 ++++ test/cases/db-update.bats | 6 ++++++ 5 files changed, 64 insertions(+), 8 deletions(-) create mode 100755 db-archive