[pacman-dev,1/1] alpm: use flock() for db lock

Message ID 20170606212628.26466-1-list@eworm.de
State Rejected, archived
Delegated to: Andrew Gregory
Headers show
Series [pacman-dev,1/1] alpm: use flock() for db lock | expand

Commit Message

Christian Hesse June 6, 2017, 9:26 p.m. UTC
From: Christian Hesse <mail@eworm.de>

We used to check for file existens, but that suffers from stale lock
files caused by unexpected events like crash, shutdown, etc.

Instead use flock() to lock the file. It does not matter whether or
not the file exists but whether an exclusive lock can be obtained.

Also remove the hint about removing the file from pacman.

Signed-off-by: Christian Hesse <mail@eworm.de>
---
 lib/libalpm/handle.c | 17 +++++++++++++++--
 src/pacman/util.c    |  5 -----
 2 files changed, 15 insertions(+), 7 deletions(-)

Comments

Andrew Gregory June 6, 2017, 9:56 p.m. UTC | #1
On 06/06/17 at 11:26pm, Christian Hesse wrote:
> From: Christian Hesse <mail@eworm.de>
> 
> We used to check for file existens, but that suffers from stale lock
> files caused by unexpected events like crash, shutdown, etc.
> 
> Instead use flock() to lock the file. It does not matter whether or
> not the file exists but whether an exclusive lock can be obtained.
> 
> Also remove the hint about removing the file from pacman.
> 
> Signed-off-by: Christian Hesse <mail@eworm.de>
> ---
>  lib/libalpm/handle.c | 17 +++++++++++++++--
>  src/pacman/util.c    |  5 -----
>  2 files changed, 15 insertions(+), 7 deletions(-)

Refusing to run when a lock file is leftover from a previous
invocation is intentional.  It serves as an indicator that the user
needs to verify the integrity of their system.  Also, see
https://lists.archlinux.org/pipermail/pacman-dev/2013-August/017733.html
for previous discussion regarding flock().

apg
Allan McRae June 6, 2017, 10:06 p.m. UTC | #2
On 07/06/17 07:56, Christian Hesse wrote:
> Allan McRae <allan@archlinux.org> on Wed, 2017/06/07 07:37:
>> On 07/06/17 07:26, Christian Hesse wrote:
>>> From: Christian Hesse <mail@eworm.de>
>>>
>>> We used to check for file existens, but that suffers from stale lock
>>> files caused by unexpected events like crash, shutdown, etc.
>>>
>>> Instead use flock() to lock the file. It does not matter whether or
>>> not the file exists but whether an exclusive lock can be obtained.
>>>
>>> Also remove the hint about removing the file from pacman.  
>>
>> This does not work with NFS filesystems.
> 
> Uh, did not know NFS has issues here. However, this is from NFS FAQ [0]:
> 
>> The NFS client in 2.6.12 provides support for flock()/BSD locks on NFS files
>> by emulating the BSD-style locks in terms of POSIX byte range locks. Other
>> NFS clients that use the same emulation mechanism, or that use fcntl()/POSIX
>> locks, will then see the same locks that the Linux NFS client sees.
> 
> Given the fact that linux 2.6.12 has been released nearly twelve years ago...
> We still do not want to use this?
> 
> Does any other locking mechanism has a chance to be accepted?
> 
> [0] http://nfs.sourceforge.net/#faq_d10
> 

Bringing back on list.

We has issues with NFS/OSX last time this was run (need to look up last
time this was discussed).  I think out support for OSX is dead, so we
could ignore that...

I would not mind flock being used to lock individual files, but leaving
an overall lock file during a transaction is useful.

A
Christian Hesse June 6, 2017, 10:11 p.m. UTC | #3
Andrew Gregory <andrew.gregory.8@gmail.com> on Tue, 2017/06/06 17:56:
> On 06/06/17 at 11:26pm, Christian Hesse wrote:
> > From: Christian Hesse <mail@eworm.de>
> > 
> > We used to check for file existens, but that suffers from stale lock
> > files caused by unexpected events like crash, shutdown, etc.
> > 
> > Instead use flock() to lock the file. It does not matter whether or
> > not the file exists but whether an exclusive lock can be obtained.
> > 
> > Also remove the hint about removing the file from pacman.
> > 
> > Signed-off-by: Christian Hesse <mail@eworm.de>
> > ---
> >  lib/libalpm/handle.c | 17 +++++++++++++++--
> >  src/pacman/util.c    |  5 -----
> >  2 files changed, 15 insertions(+), 7 deletions(-)  
> 
> Refusing to run when a lock file is leftover from a previous
> invocation is intentional.  It serves as an indicator that the user
> needs to verify the integrity of their system.  Also, see
> https://lists.archlinux.org/pipermail/pacman-dev/2013-August/017733.html
> for previous discussion regarding flock().

Thanks for the heads-up. I did not know this thread. Looks like I should
invest more time searching for this kind of information before reinventing
the wheel.

Perhaps anybody should add a comment to lib/libalpm/handle.c why we are *not*
adding flock(). ;)

BTW, it is interesting what systems actually do run pacman...
Christian Hesse June 6, 2017, 10:19 p.m. UTC | #4
Allan McRae <allan@archlinux.org> on Wed, 2017/06/07 08:06:
> On 07/06/17 07:56, Christian Hesse wrote:
> > Allan McRae <allan@archlinux.org> on Wed, 2017/06/07 07:37:  
> >> On 07/06/17 07:26, Christian Hesse wrote:  
> >>> From: Christian Hesse <mail@eworm.de>
> >>>
> >>> We used to check for file existens, but that suffers from stale lock
> >>> files caused by unexpected events like crash, shutdown, etc.
> >>>
> >>> Instead use flock() to lock the file. It does not matter whether or
> >>> not the file exists but whether an exclusive lock can be obtained.
> >>>
> >>> Also remove the hint about removing the file from pacman.    
> >>
> >> This does not work with NFS filesystems.  
> > 
> > Uh, did not know NFS has issues here. However, this is from NFS FAQ [0]:
> >   
> >> The NFS client in 2.6.12 provides support for flock()/BSD locks on NFS
> >> files by emulating the BSD-style locks in terms of POSIX byte range
> >> locks. Other NFS clients that use the same emulation mechanism, or that
> >> use fcntl()/POSIX locks, will then see the same locks that the Linux NFS
> >> client sees.  
> > 
> > Given the fact that linux 2.6.12 has been released nearly twelve years
> > ago... We still do not want to use this?
> > 
> > Does any other locking mechanism has a chance to be accepted?
> > 
> > [0] http://nfs.sourceforge.net/#faq_d10
> >   
> 
> Bringing back on list.
> 
> We has issues with NFS/OSX last time this was run (need to look up last
> time this was discussed).  I think out support for OSX is dead, so we
> could ignore that...
> 
> I would not mind flock being used to lock individual files, but leaving
> an overall lock file during a transaction is useful.

Andrew provided a link. ;)

There are some valid arguments against flock()... So yes, let's keep things
as-is. :-p

Patch

diff --git a/lib/libalpm/handle.c b/lib/libalpm/handle.c
index b6b27881..bb5cc907 100644
--- a/lib/libalpm/handle.c
+++ b/lib/libalpm/handle.c
@@ -24,6 +24,7 @@ 
 #include <stdlib.h>
 #include <string.h>
 #include <limits.h>
+#include <sys/file.h>
 #include <sys/types.h>
 #include <syslog.h>
 #include <sys/stat.h>
@@ -120,10 +121,20 @@  int _alpm_handle_lock(alpm_handle_t *handle)
 	FREE(dir);
 
 	do {
-		handle->lockfd = open(handle->lockfile, O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0000);
+		handle->lockfd = open(handle->lockfile, O_WRONLY | O_CREAT | O_CLOEXEC, 0000);
 	} while(handle->lockfd == -1 && errno == EINTR);
 
-	return (handle->lockfd >= 0 ? 0 : -1);
+	if (handle->lockfd == -1)
+		return -1;
+
+	/* try to get an exclusive lock */
+	if (flock(handle->lockfd, LOCK_EX | LOCK_NB) == -1) {
+		close(handle->lockfd);
+		handle->lockfd = -1;
+		return -1;
+	}
+
+	return 0;
 }
 
 /** Remove the database lock file
@@ -137,6 +148,8 @@  int SYMEXPORT alpm_unlock(alpm_handle_t *handle)
 	ASSERT(handle->lockfile != NULL, return 0);
 	ASSERT(handle->lockfd >= 0, return 0);
 
+	flock(handle->lockfd, LOCK_UN);
+
 	close(handle->lockfd);
 	handle->lockfd = -1;
 
diff --git a/src/pacman/util.c b/src/pacman/util.c
index ae8a74d3..c3f9d3ba 100644
--- a/src/pacman/util.c
+++ b/src/pacman/util.c
@@ -81,13 +81,8 @@  void trans_init_error(void)
 	pm_printf(ALPM_LOG_ERROR, _("failed to init transaction (%s)\n"),
 			alpm_strerror(err));
 	if(err == ALPM_ERR_HANDLE_LOCK) {
-		const char *lockfile = alpm_option_get_lockfile(config->handle);
 		pm_printf(ALPM_LOG_ERROR, _("could not lock database: %s\n"),
 					strerror(errno));
-		if(access(lockfile, F_OK) == 0) {
-			fprintf(stderr, _("  if you're sure a package manager is not already\n"
-						"  running, you can remove %s\n"), lockfile);
-		}
 	}
 }