22 Commits

Author SHA1 Message Date
Dmitry Klochkov 15b9496135 abuild-fetch: try to work around an ESTALE error which occurs on NFS
The error is caused by the following race condition:

  A                               B
                                |
  lockfd = open(lockfile, ...)  |
                                | unlink(lockfile)
  lockf(lockfd, F_LOCK, 0)      |

According to [1], to recover from an ESTALE error, an application must
close the file or directory where the error occurred, and reopen it so
the NFS client can resolve the pathname again and retrieve the new file
handle.

[1] https://nfs.sourceforge.net/#faq_a10
2024-10-07 19:50:19 +00:00
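
The recovery rule quoted in commit 15b9496135 (close the stale file, then reopen it by pathname) can be sketched as below. This is only an illustration, not the code from the commit: the helper name `acquire_lock_retry` and the `max_tries` bound are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700 /* exposes lockf(3) under strict compiler modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from abuild-fetch): open `lockfile` and take an
 * exclusive lockf(3) lock. If the NFS file handle turns out to be stale,
 * close the descriptor and reopen the path so the client resolves it again.
 * Returns the locked descriptor, or -1 with errno set. */
int acquire_lock_retry(const char *lockfile, int max_tries)
{
    assert(lockfile != NULL);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(lockfile, O_WRONLY | O_CREAT, 0660);
        if (fd < 0) {
            if (errno == ESTALE)
                continue;       /* stale handle on open: just try again */
            return -1;          /* any other error is fatal */
        }
        if (lockf(fd, F_LOCK, 0) == 0)
            return fd;          /* lock held, caller owns fd */

        int saved = errno;
        close(fd);              /* drop the (possibly stale) handle */
        if (saved != ESTALE) {
            errno = saved;
            return -1;
        }
        /* ESTALE: loop around and reopen by pathname */
    }
    errno = ESTALE;
    return -1;
}
```

Bounding the number of attempts is a design choice of this sketch; it keeps a persistently broken mount from turning into an endless loop.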
Natanael Copa 40ecb4b07c abuild-fetch: try harder to yield
Try a bit harder to let other processes acquire the lock.

This will hopefully reduce flakiness of the test suite when the builder is
under load.
2023-04-18 11:22:54 +02:00
Samanta Navarro dc99ce423a abuild: fix typos
Typos found with codespell
2021-09-21 09:15:34 +00:00
Natanael Copa 3da770ce35 abuild-fetch: simplify and fix locking
Simplify locking by using lockf(3). It is POSIX compatible and should
work over NFS.

Fix download race condition when:
1) host A creates the lockfile and acquires the lock to fetch from the
   distfiles mirror
2) host B opens the lockfile and waits for the lock
3) host A gets a 404 from distfiles, releases the lock and deletes the
   lockfile, which host B still has an open file handle for
4) host B gets the lock on the deleted file and downloads the file
5) host A retries the download and creates a new lockfile, but is not
   blocked by host B, even though it should be

Solve this by releasing the lock, giving the other processes a chance
to acquire it (using sleep(0)), and then deleting the lockfile only if:
a) the download was successful (no 404) or b) no one else holds a lock.

This reverts commit 281720ec39 (abuild-fetch: aquire a second lock
using flock(2))

fixes #10026
2021-05-06 13:03:14 +02:00
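
The release sequence described in commit 3da770ce35 can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual abuild-fetch code: the names `lock_open` and `lock_release` and the `download_ok` flag are hypothetical, and checking "no one else has a lock" with `F_TLOCK` is one possible way to do it.

```c
#define _XOPEN_SOURCE 700 /* exposes lockf(3) under strict compiler modes */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create/open the lockfile and block until locked. */
int lock_open(const char *lockfile)
{
    assert(lockfile != NULL);
    int fd = open(lockfile, O_WRONLY | O_CREAT, 0660);
    if (fd < 0)
        return -1;
    if (lockf(fd, F_LOCK, 0) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: release the lock, yield, and delete the lockfile
 * only if the download succeeded or no other process holds the lock.
 * Returns 1 if the lockfile was removed, 0 otherwise. */
int lock_release(int fd, const char *lockfile, int download_ok)
{
    int removed = 0;

    lockf(fd, F_ULOCK, 0);  /* let a waiting process take the lock */
    sleep(0);               /* yield, as the commit message describes */

    /* F_TLOCK fails immediately if another process now holds the lock. */
    if (download_ok || lockf(fd, F_TLOCK, 0) == 0) {
        if (unlink(lockfile) == 0)
            removed = 1;
    }
    close(fd);              /* closing also drops any lock we re-took */
    return removed;
}
```

Because POSIX record locks are owned per process, the `F_TLOCK` probe only detects holders in other processes, which is the situation the commit message is concerned with.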
Natanael Copa 281720ec39 abuild-fetch: aquire a second lock using flock(2)
It seems that POSIX record locks do not work across namespaces. Take a
second lock using flock(2).

see https://gitlab.alpinelinux.org/alpine/abuild/-/issues/10026
2021-04-20 17:05:40 +02:00
Natanael Copa 1772495d29 abuild-fetch: refactor move locking logic to a func
Make the code more readable by moving the locking/unlocking into its own
functions.
2021-04-20 16:58:02 +02:00
Natanael Copa 42a45c9cbc abuild-fetch: mention -k toption for insecure in usage 2021-04-20 12:58:32 +02:00
Natanael Copa 2be7002cda abuild-fetch: retry download if byte range is unsupported
fixes #10004
2020-07-08 10:10:26 +02:00
Natanael Copa 606174552e abuild-fetch: adjust maxlength of outfile
so we have buffer space for the ".part" suffix.
2020-07-06 10:59:56 +00:00
tcely c9d6159637 abuild-fetch: use local insecure variable 2019-07-17 12:02:13 +00:00
tcely 59c1c4a97a abuild-fetch: when http:// was used, ignore https:// problems 2019-07-17 12:02:13 +00:00
tcely 7bd32679b3 abuild-fetch: add -k (insecure as in curl) option 2019-07-17 12:02:13 +00:00
tcely 77746a0c3d abuild-fetch: enable curl certificate verification 2019-04-29 18:31:58 +00:00
Natanael Copa c6609b4739 move logic of curl's http range error to abuild-fetch
Move the logic of deleting partial downloads to abuild-fetch, which
knows if it is curl or wget that was executed.
2018-10-03 09:23:16 +00:00
Oliver Smith 07d9f3bf6b Fix: incomplete partfile gets renamed to distfile
Abuild-fetch uses curl (fallback to wget) to download files. They are
saved with a ".part" extension first, so they can be resumed if
necessary. When the download is through, the ".part" extension gets
removed. However, when the server does not support resume of downloads
(e.g. GitHub's on the fly generated tarballs), then the ".part"
extension got removed anyway. Abuild aborts in that case. But when
running a third time, the distfile exists and it is assumed that this
is the full download.

Changes:
* abuild-fetch:
  * Only remove the ".part" extension when curl/wget exits with 0
  * Pass the exit code from curl/wget as exit code of abuild-fetch
  * Wherever abuild-fetch would return an exit code on its own, the
    codes have been changed to be > 200 (so they don't collide with
    curl's exit codes, of which there are 92 as of now)
  * Remove undocumented feature of downloading multiple source URLs at
    a time. This doesn't match with the usage description, was not used
    in abuild at all and it would have made it impossible to pass the
    exit code.
* abuild:
  * After downloading, when curl is installed and abuild-fetch has
    33 as exit code (curl's HTTP range error), then delete the partfile
    and try the download again.
2018-10-03 08:33:52 +00:00
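
The exit-code rules listed in commit 07d9f3bf6b can be illustrated with a small sketch. It is not the code from the commit: the function names and the value 201 are hypothetical (the message only says abuild-fetch's own codes are above 200), while 33 is the curl HTTP range error named in the message.

```c
#include <assert.h>
#include <stdio.h>

enum {
    FETCHER_RANGE_ERROR = 33,  /* curl's HTTP range error, per the commit */
    ERR_RENAME = 201           /* hypothetical own code, chosen to be > 200 */
};

/* Hypothetical helper: drop the ".part" suffix only if the fetcher exited
 * with 0; otherwise pass the fetcher's exit code through unchanged. */
int finish_download(const char *partfile, const char *outfile,
                    int fetcher_status)
{
    assert(partfile != NULL && outfile != NULL);
    if (fetcher_status != 0)
        return fetcher_status;  /* keep the partfile for a later resume */
    if (rename(partfile, outfile) != 0)
        return ERR_RENAME;
    return 0;
}

/* Hypothetical caller-side helper: on a range error the server cannot
 * resume, so delete the partfile and report that a fresh download is
 * needed. Returns 1 if the caller should retry from scratch. */
int discard_partfile_on_range_error(const char *partfile, int status)
{
    assert(partfile != NULL);
    if (status != FETCHER_RANGE_ERROR)
        return 0;
    remove(partfile);
    return 1;
}
```

A caller would run the fetcher, pass its status to `finish_download`, and retry once if `discard_partfile_on_range_error` returns 1.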
Jonathan Neuschäfer 33183dadf5 Fix a few typos 2018-04-11 14:09:32 +00:00
tmpfile f9132fad76 abuild-fetch.c: remove saveas- syntax 2017-06-21 18:14:18 +00:00
Natanael Copa 9de1cfbf03 abuild-fetch: fix -Wformat-security warnings 2016-05-20 10:22:36 +02:00
Natanael Copa 5dfc67bf33 abuild-fetch: retry to create lock on ESTALE 2016-05-16 13:15:42 +00:00
Natanael Copa 575cece65e abuild-fetch: use _exit after execvp 2016-03-10 13:48:54 +00:00
Andrew Wilcox cd3eabdf4d abuild-fetch: add missing header 2015-10-08 08:30:32 +00:00
Natanael Copa 92186b70ca abuild: fix fetch lock file on nfs
On an NFS mount, flock(2) is converted on the server side to a POSIX lock
(fcntl(F_SETLK)). This means that abuild instances running on the NFS
server and on a client create different locks, and both will try to
download the same file at the same time.

We fix this by creating a small abuild-fetch application that will
create a POSIX lock which works with NFS.
2015-08-26 16:44:23 +02:00
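
For readers unfamiliar with the POSIX lock that commit 92186b70ca refers to, a whole-file record lock taken with fcntl(2) looks roughly like this. It is a generic sketch, not the original abuild-fetch code; the helper name `posix_lock` is an assumption of the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: apply a POSIX record lock covering the whole file.
 * `type` is F_WRLCK to take an exclusive lock or F_UNLCK to release it.
 * F_SETLKW blocks until the lock can be granted. Returns 0 on success. */
int posix_lock(int fd, short type)
{
    assert(fd >= 0);
    struct flock fl = {
        .l_type = type,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,     /* a length of 0 means "to end of file" */
    };
    return fcntl(fd, F_SETLKW, &fl);
}
```

The descriptor must be open for writing to take an `F_WRLCK` lock, so a caller would typically open the lockfile with `O_RDWR | O_CREAT` before calling the helper.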