Commit graph

275 commits

Author SHA1 Message Date
Allan McRae
c9acfc2b50 Fix error when downloading signature file for an existing package file
If a package was already downloaded but its signature file was not,
pacman would download the signature then error out despite all files
being present.

Also fixes a similar error when some, but not all, package databases
were updated with -Sy.

Fixes #156

Signed-off-by: Allan McRae <allan@archlinux.org>
2024-06-19 17:38:07 +10:00
Remi Gacogne
eacadbcc41
Restrict filesystem access to the download process whenever possible
Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
2024-06-14 09:30:20 +02:00
Remi Gacogne
24304c6df0 Fix up-to-date repo databases being redownloaded when sandboxed
Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-06-10 19:48:20 +10:00
Remi Gacogne
cfa68f7b26 Restore partially downloaded files to the temporary directory
This allows downloads to be continued.

Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Remi Gacogne
e1a7b83e8e Download to a temporary directory owned by the Download user
Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Remi Gacogne
5e9bff6216 Stop trusting the Content-Disposition HTTP header 2024-04-01 20:52:55 +00:00
Allan McRae
26b7b35307 Remove random_partfile from payload struct
It is not used any more due to filling the payload structure earlier.

Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Allan McRae
04d04381bc libalpm: fill in more payload information before passing to downloader
Filling in more of the payload fields before passing to the downloader ensures
that the these fields do not get lost during sandboxed operations.

It also fixes the use of -U with XferCommand, but testsuite still fails due to
"404" page being downloaded for the signature. Given we can not identify this
as being a non-signature download with the XferCommand, we can just turn off
signature checking in this test.

Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Remi Gacogne
cf359b0da4 Add support for DownloadUser with XferCommand
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Remi Gacogne
93a796aa27 Add sandboxed download for the internal downloader
If the SandboxUser configure option is set, the internal downloader
will fork of a child process and drop to the specified user to download
the files.

Signed-off-by: Remi Gacogne <rgacogne@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-04-01 20:52:55 +00:00
Demi Obenour
eb5bf69138 Fetch signature and database from the same URL
Previously, the for loops on lines 1035 and 1037 would advance to the
next element in the server list, even if downloading the URL succeeded.
If there are no more servers in the list, `s` would be NULL, causing
a NULL pointer dereference on line 1046.  If there were servers left
in the list, the signature would be downloaded from a wrong URL.

1. Fetching of database signatures is enabled.
2. There is only one enabled remote repository URL, or fetching from
   all but the last one fails and fetching from the last one succeeds.
3. An XferCommand is used.

Qubes OS Arch templates satisfy all of these conditions and trigger the bug.
2024-03-19 11:44:38 +10:00
Allan McRae
f343db5b8e Do not segfault with badly formed URL
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-02-28 07:38:56 +10:00
Allan McRae
d55b47e551 Update copyright years
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-02-24 18:40:44 +10:00
Masato TOYOSHIMA
2180e4d127 libalpm: download signatures with the external downloader
Ensure relevant signature files are downloaded when using the fetch
callback.

Signed-off-by: Allan McRae <allan@archlinux.org>
2024-02-16 19:27:09 +10:00
morganamilo
bf76b5e89f libalpm: correctly log curl_download_internal return value
Signed-off-by: Allan McRae <allan@archlinux.org>
2024-02-04 10:23:34 +10:00
Andrew Gregory
3aa1975c1d alpm: add cache server support
Cache servers differ from regular servers in that they do not produce
warnings and are not removed from the server pool for "soft errors"
(i.e. the server was reachable, but the download failed) and they are
not used for databases.  If a host is used for both a cache server and a
regular server, it may still be removed from the server pool for soft
errors that occur when used as cache server and removal from the server
pool for soft errors will not affect future attempted use as a cache
server.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2023-12-02 04:56:25 +00:00
Andrew Gregory
56626816b6 dload: differentiate between hard and soft errors
Set error count to -1 to indicate a hard error to allow them to be
treated differently.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2023-12-02 04:56:25 +00:00
Allan McRae
471a030466 Avoid NULL deference in curl_check_finished_download
We have not set handle in the function at this stage, so we can not
assign an error to it.  Pass the handle to the function to avoid
waiting until the payload is retrieved.

Signed-off-by: Allan McRae <allan@archlinux.org>
2022-12-13 10:00:13 +10:00
Chris Down
015eb31c3a dload: Remove unused ABORT_SIGINT
The last user of ABORT_SIGINT was removed in commit 84723cab5d
("Cleanup the old sequential download code"), and this isn't exported as
part of the public API.

Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Allan McRae <allan@archlinux.org>
2022-07-21 20:00:44 +10:00
Allan McRae
40583ebe89 Avoid information leakage with badly formed download header
Parsing of Content-Disposition relies on well formed headers.
A malformed header such as:

Content-Disposition="";

will result in a strnduppayload->content_disp_name, -1, ptr),
which will copy memory until it hits a \0.

Prevent this by only copying the value if it exists.

Fixes FS#73704.

Signed-off-by: Allan McRae <allan@archlinux.org>
2022-03-06 21:49:56 +10:00
Allan McRae
90df85e9cf Update copyright years
./build-aux/update-copyright 2021 2022

Signed-off-by: Allan McRae <allan@archlinux.org>
2022-01-02 13:34:52 +10:00
Allan McRae
5352367022 Prevent translation of curl
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-11-20 12:39:42 -08:00
morganamilo
b0a2fd75b2 Update mailing list url
change pacman-dev@archlinux.org to pacmandev@lists.archlinux.org

Most of this is copyright notices but this also fixes FS#72129 by
updating the address in docs/index.asciidoc.
2021-11-20 12:38:25 -08:00
Vladimir Panteleev
b242f5f24c libalpm: Log URLs when retrying
Allow finding which mirror was used to fetch a file.

This makes it a bit easier to debug situations in which mirrors serve
bad files with HTTP 200.

Signed-off-by: Vladimir Panteleev <archlinux@cy.md>
2021-11-20 12:36:29 -08:00
Charlie Sale
efb714b31c Order downloads by descending max_size
When downloading in parallel, sort by package size so that the larger
packages are queued first to fully leverage parallelism.
Addresses FS#70172

Signed-off-by: Charlie Sale <softwaresale01@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-09-04 10:34:00 +10:00
morganamilo
2ec6de96a6 only use effective url for urls containing .db or .pkg
Github and other sites redirect their downloads to a cdn. So the
download http://foo.org/myrepo.db may redirect to something like
https://cdn.foo.org/83749327439.

This then causes pacman to try and download the sig as
https://cdn.foo.org/83749327439.sig which is incorrect. In this case
pacman should append .sig to the original url.

However urls like https://archlinux.org/packages/community/x86_64/0ad/download/
Redirect to the mirror, so .sig has to appended after the redirects and
not before.

So we decide if we should append .sig on the original or effective url
based on if the effective url (minus the query part) has .db or .pkg in it.

Fixes FS#71148

---

v2: move variable decleration to start of block
v3: use dbext instead of db
2021-09-04 10:34:00 +10:00
morganamilo
c0026caab0 libalpm: Give -U downloads a random .part name if needed
archweb's download links all ended in /download. This cause all the temp
files to be named download.part. With parallel downloads this results in
multiple downloads to go to the same temp file and breaks the transaction.

Assign random temporary filenames to downloads from URLs that are either
missing a filename, or if the filename does not contain at least three
hyphens (as a well formed package filename does).

While this approach to determining when to use a temporary filename is
not 100% foolproof, it does keep nice looking download progress bar names
when a proper package filename is given. The only downside of not using
temporary files when provided with a filename  with three or more hyphens
is URLs created specifically to bypass temporary filename usage can not
be downloaded in parallel. We probably do not want to download packages
from such URLs anyway.

Fixes FS#71464

Modified-by: Allan McRae (do not use temporary files for realish URLs)
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-09-04 10:33:51 +10:00
morganamilo
0147de169a libalpm: always name signatures after original file
If the original download redirects to to a different url then alpm would
try to name the sig file after the url instead of <original_file>.sig.
Instead force this naming scheme regardless of url.

Fixes FS#71274

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-24 23:51:11 +10:00
morganamilo
238109760d libalpm: call retry events for sig downlods
Around the same time retry events were added, there was a patch to pass
sig download events to the frontend. The retry code was not updated to
account for this.

Signed-off-by: morganamilo <morganamilo@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-07 14:14:19 +10:00
Allan McRae
3401f9e142 libalpm: prevent download error pages ending up in package files
Some servers respond with error pages (e.g. 404.html) when a package is
not present. These were getting written to packages before moving onto
the next server. Reset the download progress on 400+ error conditions
to avoid this.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-03 09:30:10 +10:00
morganamilo
8bf17b29a2 libalpm: fix download rates becoming negative
When a download fails on one mirror a new download is started on the
next mirror. This causes the ammount downloaded to reset, confusing the
rate math and making it display a negative rate.

This is further complicated by the fact that a download may be resumed
from where it is or started over.

To account for this we alert the frontend that the download was
restarted. Pacman then starts the progress bar over.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-05-09 23:28:04 +10:00
Andrew Gregory
72238aa046 call download progress callback for signatures
pacman may not care about them, but other front-ends do.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
eb1a63a516 alpm_db_update: indicate if dbs were up to date
Restore the prior indicator whether or not databases were up to date.
0 is used to indicate if *any* db was actually updated as callers are
more likely to care about that than if *all* dbs were updated.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
0ff94ae85d fix downloading multiple urls with XferCommand
An extra break causes _alpm_download to break out of the payload loop as
soon as it sees a successful url download with XferCommand.

Fixes: FS#70608 - -U fails to download all files with XferCommand

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
e7fa35baa2 add front-end provided context to callbacks
Our callbacks require front-ends to maintain state in order to provide
reasonable output.  The new download callback in particular requires
much more complex state information to be saved.  Without the ability to
provide context, state must be saved globally, which may not be possible
for all front-ends.  Scripting language bindings in particular have no
way to register per-handle callbacks without some form of context.

Implements: FS#12721

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
9060058393 include retries and signatures in download count
Fixes: FS#69881

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-04-07 22:34:29 +10:00
Andrew Gregory
4bf7aa119d skip servers with too many errors
Keep track of errors from servers so that bad ones can be skipped once
a threshold is reached.  Key the error tracking off the hostname because
hosts may serve multiple repos under different url's and errors are
likely to be host-wide.

Implements: FS#29293.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-04-07 22:33:52 +10:00
Anatol Pomozov
1e60a5f006 Remove "total download" callback in favor of generic event callback
Total download callback called right before packages start downloaded.
But we already have an event for such event (ALPM_EVENT_PKG_RETRIEVE_START)
and it is naturally to use the event to pass information about expected
download size.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-03-25 11:39:03 +10:00
Allan McRae
17f9911ffc Update copyright year
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-03-01 12:22:20 +10:00
morganamilo
f5b373788f libalpm: don't use curl's deprecated functions
This bumps the minimun curl version from 7.32.0 to 7.55.0.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-01-09 00:12:19 +10:00
morganamilo
7cc8e0181f libalpm: remove useless if
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-01-09 00:11:44 +10:00
morganamilo
4b8c274f7f libalpm: don't call dlcb when not set
Fixes FS#68728:

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-11-26 16:31:27 +10:00
Anatol Pomozov
14c0e53eed Check that destfile_name exists before using it
In some cases (when trust_remote_name is used for a URL without a filename and
no Content-Disposition is provided by the server) destfile_name will be
NULL. In this case payload data will be stored in tempfile_name and no
destfile_name is set.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:43:10 +10:00
Anatol Pomozov
1fd95939db Do not free payload fields in the middle of this structure use
At the end of payload use it calls _alpm_dload_payload_reset()
that will free() these and other fields anyway.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:41:45 +10:00
Anatol Pomozov
a8bdc2e10a Build signature remote name based on the main payload name
The main payload final name might be affected by url redirects or
Content-Disposition HTTP header value.

We want to make sure that accompanion *.sig filename always matches the
package filename. So ignore finalname/Content-Disposition for the *.sig file.

It also helps to fix a corner case when the download URL does not contain
a filename and server provides Content-Disposition for the main payload
but not for the signature payload.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:39:37 +10:00
Anatol Pomozov
f3dfba73d2 FS#33992: force download *.sig file if it does not exist in the cache
In case if *.pkg exists but *.sig file does not we still have to pass
the pkg to multi_download API.

To avoid redownloading *.pkg file we use CURLOPT_TIMECONDITION curl option.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07 21:38:00 +10:00
Anatol Pomozov
f078c2d3bc Move signature payload creation to download engine
Until now callee of ALPM download functionality has been in charge of
payload creation both for the main file (e.g. *.pkg) and for the accompanied
*.sig file. One advantage of such solution is that all payloads are
independent and can be fetched in parallel thus exploiting the maximum
level of download parallelism.

To build *.sig file url we've been using a simple string concatenation:
$requested_url + ".sig". Unfortunately there are cases when it does not
work. For example an archlinux.org "Download From Mirror" link looks like
this https://www.archlinux.org/packages/core/x86_64/bash/download/ and
it gets redirected to some mirror. But if we append ".sig" to the end of
the link url and try to download it then archlinux.org returns 404 error.

To overcome this issue we need to follow redirects for the main payload
first, find the final url and only then append '.sig' suffix.
This implies 2 things:
 - the signature payload initialization need to be moved to dload.c
 as it is the place where we have access to the resolved url
 - *.sig is downloaded serially with the main payload and this reduces
 level of parallelism

Move *.sig payload creation to dload.c. Once the main payload is fetched
successfully we check if the callee asked to download the accompanied
signature. If yes - create a new payload and add it to mcurl.

*.sig payload does not use server list of the main payload and thus does
not support mirror failover. *.sig file comes from the same server as
the main payload.

Refactor event loop in curl_multi_download_internal() a bit. Instead of
relying on curl_multi_check_finished_download() to return number of new
payloads we simply rerun the loop iteration one more time to check if
there are any active downloads left.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07 21:35:35 +10:00
Anatol Pomozov
84723cab5d Cleanup the old sequential download code
All users of _alpm_download() have been refactored to the new API.
It is time to remove the old _alpm_download() functionality now.

This change also removes obsolete SIGPIPE signal handler functionality
(this is a leftover from libfetch days).

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-06-26 15:59:16 +10:00
Anatol Pomozov
16d98d6577 Convert '-U pkg1 pkg2' codepath to parallel download
Installing remote packages using its URL is an interesting case for ALPM
API. Unlike package sync ('pacman -S pkg1 pkg2') '-U' does not deal with
server mirror list. Thus _alpm_multi_download() should be able to
handle file download for payloads that either have 'fileurl' field
or pair of fields ('servers' and 'filepath') set.

Signature for alpm_fetch_pkgurl() has changed and it accepts an
output list that is populated with filepaths to fetched packages.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-06-26 15:59:08 +10:00
Anatol Pomozov
0346e0eef2 Convert download packages logic to multiplexed API
Create a list of dload_payloads and pass it to the new _alpm_multi_*
interface.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:39 +10:00