Commit graph

248 commits

Author SHA1 Message Date
morganamilo
0147de169a libalpm: always name signatures after original file
If the original download redirects to to a different url then alpm would
try to name the sig file after the url instead of <original_file>.sig.
Instead force this naming scheme regardless of url.

Fixes FS#71274

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-24 23:51:11 +10:00
morganamilo
238109760d libalpm: call retry events for sig downlods
Around the same time retry events were added, there was a patch to pass
sig download events to the frontend. The retry code was not updated to
account for this.

Signed-off-by: morganamilo <morganamilo@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-07 14:14:19 +10:00
Allan McRae
3401f9e142 libalpm: prevent download error pages ending up in package files
Some servers respond with error pages (e.g. 404.html) when a package is
not present. These were getting written to packages before moving onto
the next server. Reset the download progress on 400+ error conditions
to avoid this.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-06-03 09:30:10 +10:00
morganamilo
8bf17b29a2 libalpm: fix download rates becoming negative
When a download fails on one mirror a new download is started on the
next mirror. This causes the ammount downloaded to reset, confusing the
rate math and making it display a negative rate.

This is further complicated by the fact that a download may be resumed
from where it is or started over.

To account for this we alert the frontend that the download was
restarted. Pacman then starts the progress bar over.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-05-09 23:28:04 +10:00
Andrew Gregory
72238aa046 call download progress callback for signatures
pacman may not care about them, but other front-ends do.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
eb1a63a516 alpm_db_update: indicate if dbs were up to date
Restore the prior indicator whether or not databases were up to date.
0 is used to indicate if *any* db was actually updated as callers are
more likely to care about that than if *all* dbs were updated.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
0ff94ae85d fix downloading multiple urls with XferCommand
An extra break causes _alpm_download to break out of the payload loop as
soon as it sees a successful url download with XferCommand.

Fixes: FS#70608 - -U fails to download all files with XferCommand

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
e7fa35baa2 add front-end provided context to callbacks
Our callbacks require front-ends to maintain state in order to provide
reasonable output.  The new download callback in particular requires
much more complex state information to be saved.  Without the ability to
provide context, state must be saved globally, which may not be possible
for all front-ends.  Scripting language bindings in particular have no
way to register per-handle callbacks without some form of context.

Implements: FS#12721

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
2021-05-01 12:08:14 +10:00
Andrew Gregory
9060058393 include retries and signatures in download count
Fixes: FS#69881

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-04-07 22:34:29 +10:00
Andrew Gregory
4bf7aa119d skip servers with too many errors
Keep track of errors from servers so that bad ones can be skipped once
a threshold is reached.  Key the error tracking off the hostname because
hosts may serve multiple repos under different url's and errors are
likely to be host-wide.

Implements: FS#29293.

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-04-07 22:33:52 +10:00
Anatol Pomozov
1e60a5f006 Remove "total download" callback in favor of generic event callback
Total download callback called right before packages start downloaded.
But we already have an event for such event (ALPM_EVENT_PKG_RETRIEVE_START)
and it is naturally to use the event to pass information about expected
download size.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-03-25 11:39:03 +10:00
Allan McRae
17f9911ffc Update copyright year
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-03-01 12:22:20 +10:00
morganamilo
f5b373788f libalpm: don't use curl's deprecated functions
This bumps the minimun curl version from 7.32.0 to 7.55.0.

Signed-off-by: Allan McRae <allan@archlinux.org>
2021-01-09 00:12:19 +10:00
morganamilo
7cc8e0181f libalpm: remove useless if
Signed-off-by: Allan McRae <allan@archlinux.org>
2021-01-09 00:11:44 +10:00
morganamilo
4b8c274f7f libalpm: don't call dlcb when not set
Fixes FS#68728:

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-11-26 16:31:27 +10:00
Anatol Pomozov
14c0e53eed Check that destfile_name exists before using it
In some cases (when trust_remote_name is used for a URL without a filename and
no Content-Disposition is provided by the server) destfile_name will be
NULL. In this case payload data will be stored in tempfile_name and no
destfile_name is set.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:43:10 +10:00
Anatol Pomozov
1fd95939db Do not free payload fields in the middle of this structure use
At the end of payload use it calls _alpm_dload_payload_reset()
that will free() these and other fields anyway.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:41:45 +10:00
Anatol Pomozov
a8bdc2e10a Build signature remote name based on the main payload name
The main payload final name might be affected by url redirects or
Content-Disposition HTTP header value.

We want to make sure that accompanion *.sig filename always matches the
package filename. So ignore finalname/Content-Disposition for the *.sig file.

It also helps to fix a corner case when the download URL does not contain
a filename and server provides Content-Disposition for the main payload
but not for the signature payload.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-14 23:39:37 +10:00
Anatol Pomozov
f3dfba73d2 FS#33992: force download *.sig file if it does not exist in the cache
In case if *.pkg exists but *.sig file does not we still have to pass
the pkg to multi_download API.

To avoid redownloading *.pkg file we use CURLOPT_TIMECONDITION curl option.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07 21:38:00 +10:00
Anatol Pomozov
f078c2d3bc Move signature payload creation to download engine
Until now callee of ALPM download functionality has been in charge of
payload creation both for the main file (e.g. *.pkg) and for the accompanied
*.sig file. One advantage of such solution is that all payloads are
independent and can be fetched in parallel thus exploiting the maximum
level of download parallelism.

To build *.sig file url we've been using a simple string concatenation:
$requested_url + ".sig". Unfortunately there are cases when it does not
work. For example an archlinux.org "Download From Mirror" link looks like
this https://www.archlinux.org/packages/core/x86_64/bash/download/ and
it gets redirected to some mirror. But if we append ".sig" to the end of
the link url and try to download it then archlinux.org returns 404 error.

To overcome this issue we need to follow redirects for the main payload
first, find the final url and only then append '.sig' suffix.
This implies 2 things:
 - the signature payload initialization need to be moved to dload.c
 as it is the place where we have access to the resolved url
 - *.sig is downloaded serially with the main payload and this reduces
 level of parallelism

Move *.sig payload creation to dload.c. Once the main payload is fetched
successfully we check if the callee asked to download the accompanied
signature. If yes - create a new payload and add it to mcurl.

*.sig payload does not use server list of the main payload and thus does
not support mirror failover. *.sig file comes from the same server as
the main payload.

Refactor event loop in curl_multi_download_internal() a bit. Instead of
relying on curl_multi_check_finished_download() to return number of new
payloads we simply rerun the loop iteration one more time to check if
there are any active downloads left.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-07-07 21:35:35 +10:00
Anatol Pomozov
84723cab5d Cleanup the old sequential download code
All users of _alpm_download() have been refactored to the new API.
It is time to remove the old _alpm_download() functionality now.

This change also removes obsolete SIGPIPE signal handler functionality
(this is a leftover from libfetch days).

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-06-26 15:59:16 +10:00
Anatol Pomozov
16d98d6577 Convert '-U pkg1 pkg2' codepath to parallel download
Installing remote packages using its URL is an interesting case for ALPM
API. Unlike package sync ('pacman -S pkg1 pkg2') '-U' does not deal with
server mirror list. Thus _alpm_multi_download() should be able to
handle file download for payloads that either have 'fileurl' field
or pair of fields ('servers' and 'filepath') set.

Signature for alpm_fetch_pkgurl() has changed and it accepts an
output list that is populated with filepaths to fetched packages.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
2020-06-26 15:59:08 +10:00
Anatol Pomozov
0346e0eef2 Convert download packages logic to multiplexed API
Create a list of dload_payloads and pass it to the new _alpm_multi_*
interface.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:39 +10:00
Anatol Pomozov
b96e0df4dc Implement multibar UI
Multiplexed download requires ability to draw UI for multiple active progress
bars. To implement it we use ANSI codes to move cursor up/down and then
redraw the required progress bar.
`pacman_multibar_ui.active_downloads` field represents the list of active
downloads that correspond to progress bars.
`struct pacman_progress_bar` is a data structure for a progress bar.

In some cases (e.g. database downloads) we want to keep progress bars in order.
In some other cases (package downloads) we want to move completed items to the
top of the screen. Function `multibar_move_completed_up` allows to configure
such behavior.

Per discussion in the maillist we do not want to show download progress for
signature files.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:39 +10:00
Anatol Pomozov
c78eb48d91 Extend download callback interface with start/complete events
With the previous download interface the callback uses the first progress
event as 'download has started' signal. Unfortunately it does not work with
up-to-date files that never receive 'download progress' events.
Up-to-date database messages are currently handled in sync_syncdbs()
after the sequential download is completed and a result from ALPM is
received. But this is not going to work with multiplexed download
interface that returns the result only after all files are completed.

Another problem with 'first progress event is the beginning of the
download' is that such events time are unpredictable. Thus the UI progress
bar order might differ from what has been passed by client to
alpm_dbs_update() function. We actually want to keep the dbs progress bars
in a strict order.

To help to solve the given problems extend the download callback to
allow 2 more events - download started and completed. 'Download started'
events appear in the same order as in the list given by a client.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Anatol Pomozov
6a331af27f Implement multiplexed download using mCURL
curl_multi_download_internal() is the main loop that creates up to
'ParallelDownloads' easy curl handles, adds them to mcurl and then
performs curl execution. This is when the paralled downloads happens.
Once any of the downloads complete the function checks its result.
In case if the download fails it initiates retry with the next server
from payload->servers list. At the download completion all the payload
resources are cleaned up.

curl_multi_check_finished_download() is essentially refactored version of
curl_download_internal() adopted for multi_curl. Once mcurl porting is
complete curl_download_internal() will be removed.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Anatol Pomozov
1d42a8f954 Implement _alpm_multi_download
It is an equivalent of _alpm_download but accepts a list of payloads.

curl_multi_download_internal() is a stub at this moment and will be
implemented in the later commits of this patch series.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Anatol Pomozov
fa68c33fa8 Inline dload_payload->curlerr field into a local variable
dload_payload->curlerr is a field that is used inside
curl_download_internal() function only. It can be converted to a local
variable.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Anatol Pomozov
dc98d0ea09 Add multi_curl handle to ALPM global context
To be able to run multiple download in parallel efficiently we need to
use curl_multi interface [1]. It introduces a set of APIs over new type
of handler 'CURLM'.

Create CURLM object at the application start and set it to global ALPM
context.

The 'single-download' CURL handle moves to payload struct. A new CURL
handle is created for each payload with intention to be processed by CURLM.

Note that curl_download_internal() is not ported to CURLM interface due
to the fact that the function will go away soon.

[1] https://curl.haxx.se/libcurl/c/libcurl-multi.html

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Anatol Pomozov
a8a1a1bb3e Introduce alpm_dbs_update() function for parallel db updates
This is an equivalent of alpm_db_update but for multiplexed (parallel)
download. The difference is that this function accepts list of
databases to update. And then ALPM internals download it in parallel if
possible.

Add a stub for _alpm_multi_download the function that will do parallel
payloads downloads in the future.

Introduce dload_payload->filepath field that contains url path to the
file we download. It is like fileurl field but does not contain
protocol/server part. The rationale for having this field is that with
the curl multidownload the server retry logic is going to move to a curl
callback. And the callback needs to be able to reconstruct the 'next'
fileurl. One will be able to do it by getting the next server url from
'servers' list and then concat with filepath. Once the 'parallel download'
refactoring is over 'fileurl' field will go away.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-05-09 11:58:21 +10:00
Allan McRae
6ba250e400 Use GOTO_ERR throughout
The GOTO_ERR define was added in commit 80ae8014 for use in future commits.
There are plenty of places in the code base it can be used, so convert them.

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-04-13 23:44:46 +10:00
Allan McRae
e76ec94083 build-aux/update-copyright 2019 2020
Signed-off-by: Allan McRae <allan@archlinux.org>
2020-02-10 10:46:03 +10:00
morganamilo
d0c487d4dc Docs docs docs
libalpm: move docs from .c files into alpm.h And fix/expand some
along the way.

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-01-28 10:46:27 +10:00
Allan McRae
e54617c7d5 Fix "pacman -U <url>" operations
Commit e6a6d307 detected complete part files by comparing a payload's
max_size to initial_size.  However, these values are also equal when we
use pacman -U on a URL as max_size is set to 0 in that case.  Add a further
condition to avoid that.

Signed-off-by: Allan McRae <allan@archlinux.org>
2020-01-27 17:53:50 +10:00
Dave Reisner
9883015be2 Use c99 struct initialization to avoid memset calls
This is guaranteed less error prone than calling memset and hoping the
human gets the argument order correct.
2020-01-07 11:40:32 +10:00
Allan McRae
e6a6d30793 Handle .part files that are the size of the correct package
In rare cases, likely due to a well timed Ctrl+C, but possibly due to a
broken mirror, a ".part" file may have size at least that of the correct
package size.

When encountering this issue, currently pacman fails in different ways
depending on where the package falls in the list to download.  If last,
"wrong or NULL argument passed" error is reported, or a "invalid or
corrupt package" issue if not.

Capture these .part files, and remove the extension. This lets pacman
either use the package if valid, or offer to remove it if it fails checksum
or signature verification.

Signed-off-by: Allan McRae <allan@archlinux.org>
2019-11-15 23:29:20 +10:00
Allan McRae
f37a3752b3 Update copyright years
make update-copyright OLD=2018 NEW=2019

Signed-off-by: Allan McRae <allan@archlinux.org>
2019-10-23 22:06:54 +10:00
Dave Reisner
0c4a8ae24b dload: never return NULL from get_filename
Downloads with a Content-Disposition header will typically not include
slashes. When they do, we should most certainly only take the basename,
but when they don't, we should treat the header value as the filename.

Crash introduced in d197d8ab82 when we started using get_filename
in order to rightfully avoid an arbitrary file overwrite vulnerability.

Signed-off-by: Allan McRae <allan@archlinux.org>
2019-10-07 10:55:49 +10:00
morganamilo
0e67ee55bd Correctly report a download failiure for 404s
Currently when caling alpm_trans_commit, if fetching a package restults
in a 404 (or other non 400 response code), the function returns -1 but
errno is never set.

This patch sets errno to ALPM_ERR_RETRIEVE.

Signed-off-by: Allan McRae <allan@archlinux.org>
2019-06-28 13:39:48 +10:00
Andrew Gregory
d197d8ab82 Sanitize file name received from Content-Disposition header
When installing a remote package with "pacman -U <url>", pacman renames
the downloaded package file to match the name given in the
Content-Disposition header. However, pacman does not sanitize this name,
which may contain slashes, before calling rename(). A malicious server (or
a network MitM if downloading over HTTP) can send a content-disposition
header to make pacman place the file anywhere in the filesystem,
potentially leading to arbitrary root code execution. Notably, this
bypasses pacman's package signature checking.

For example, a malicious package-hosting server (or a network
man-in-the-middle, if downloading over HTTP) could serve the following
header:

Content-Disposition: filename=../../../../../../usr/share/libalpm/hooks/evil.hook

and pacman would move the downloaded file to
/usr/share/libalpm/hooks/evil.hook. This invocation of "pacman -U" would
later fail, unable to find the downloaded package in the cache directory,
but the hook file would remain in place. The commands in the malicious
hook would then be run (as root) the next time any package is installed.

Discovered-by: Adam Suhl <asuhl@mit.edu>
Signed-off-by: Allan McRae <allan@archlinux.org>
2019-03-01 11:23:20 +10:00
Mark Ulrich
db102c67ef libalpm: prevent 301 redirect loop from hanging the process
If a mirror responds with a 301 redirect to itself, it will create an
infinite redirect loop. This will cause pacman to hang, unresponsive to
even a SIGINT. The result is pacman being unable to sync or
download any package from a particular repo if its current mirror
is stuck in a redirect loop. Setting libcurl's MAXREDIRS option
effectively prevents a redirect loop from hanging the process.

Signed-off-by: Mark Ulrich <mark.ulrich.86@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2019-02-07 09:51:01 +10:00
Olivier Brunel
ffde85aadf alpm: Fix SIGINT handling re: aborting download
Upon receiving SIGINT a flag is set to abort the (curl) download.
However, since it was never reset/initialized, if a front-end doesn't
actually exit on SIGINT, and later tries any operation that needs to
perform a new download, said download would always get aborted right
away due to the flag not having been reset.
2018-10-17 17:28:32 -07:00
Olivier Brunel
d96d0ffe7c alpm: Do not raise SIGINT when filesize goes over limit
Variable dload_interrupted is used both to abort a download because
SIGINT was caught, and when a file limit is reached. But raising SIGINT
is only meant to happen in the first case.

Signed-off-by: Olivier Brunel <jjk@jjacky.com>
2018-10-17 17:28:32 -07:00
Michael Straube
9e960d9d5a libalpm/dload.c: add case for CURLE_COULDNT_RESOLVE_HOST
Add a case for curl error 'Could not resolve host'.
An attempt to fix FS#48285.

Signed-off-by: Michael Straube <straubem@gmx.de>
Signed-off-by: Allan McRae <allan@archlinux.org>
2018-08-10 12:37:19 +10:00
Michael Straube
72263e22bd libalpm/dload.c: fix filename in license header
The filename in the license header did not match the actual filename
as in the other files. Hopefully this is not too nit-picky.

Signed-off-by: Michael Straube <straubem@gmx.de>
Signed-off-by: Allan McRae <allan@archlinux.org>
2018-06-18 16:28:04 +10:00
Eli Schwartz
860e4c4943 Remove all modelines from the project
Many of these are pointless (e.g. there is no need to explicitly turn on
spellchecking and language dictionaries for the manpages by default).

The only useful modelines are the ones enforcing the project coding
standards for indentation style (and "maybe" filetype/syntax, but
everything except the asciidoc manpages and makepkg.conf is already
autodetected), and indent style can be applied more easily with
.editorconfig

Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Allan McRae <allan@archlinux.org>
2018-05-14 09:59:15 +10:00
Allan McRae
b6bb8cb7dc Update coyrights for 2018
make update-copyright OLD=2017 NEW=201

Signed-off-by: Allan McRae <allan@archlinux.org>
2018-03-14 13:31:31 +10:00
Andrew Gregory
59bb21fce3 dload: ensure callback is always initialized once
Frontends rely on an initialization call for setup between downloads.
Checking for intialization after checking for a completed download can
skip initialization in cases where files are small enough to be
downloaded all at once (FS#56408).  Relying on previous download size
can result in multiple initializations if there are multiple
non-transfer events prior to the download starting (fS#56468).

Introduce a new cb_initialized variable to the payload struct and use it
to ensure that the callback is initialized exactly once prior to any
actual events.

Fixes FS#56408, FS#56468

Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
2018-01-06 12:59:32 +10:00
Christian Hesse
c635f185ba Introduce a 'disable-download-timeout' option
Add command line option ('--disable-download-timeout') and config file
option ('DisableDownloadTimeout') to disable defaults for low speed
limit and timeout on downloads. Use this if you have issues downloading
files with proxy and/or security gateway.

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-13 12:52:50 +10:00
Allan McRae
64bd242863 alpm_fetch_pkgurl: fix memory leak
Signed-off-by: Allan McRae <allan@archlinux.org>
2017-01-04 15:32:14 +10:00