Libzypp/Failover

Status quo

In a Google Summer of Code 2008 project, Gerard Farràs worked on an implementation of the proposal below. His implementation uses Metalinks as the mirror list. (We decided to go for Metalinks as the type of mirror list to use, rather than the text-only mirror lists. This allowed us to use an existing and powerful Metalink client, which already has all the features required to get the job done.)

Status on openSUSE Factory

The new download behaviour, with aria2c as the default transfer backend, became the default in Factory on the 23rd of February 2009.

The new behaviour can be deactivated by exporting ZYPP_ARIA2C=0 in the environment, in case something breaks. The package aria2 is now pulled in automatically by package dependency.

This new stuff can also be tested on openSUSE 11.1 and 11.0 as described below.

When you run zypper or YaST with the new libzypp, it'll use aria2c to do the downloads, which uses Metalinks provided by http://download.opensuse.org and thus is very robust against failures.

/etc/zypp/zypp.conf has a few config options to tune the download behaviour (number of used connections and such).

If you run into problems with that, please report a bug. See below for known bugs. Thanks!

Testing on openSUSE 11.1

openSUSE 11.1 supported the first prototype of this download method, and it needed to be activated by setting the environment var ZYPP_ARIA2C=1. The package "aria2" needed to be installed.

It is however suggested to update libzypp from here: http://download.opensuse.org/repositories/zypp:/Head/openSUSE_11.1/zypp:Head.repo and aria2 from here: http://download.opensuse.org/repositories/network:/utilities/openSUSE_11.1/network:utilities.repo in order to get the newest versions.

Testing on openSUSE 11.0

On 11.0, you can test by updating libzypp, zypper, satsolver-tools and aria2 from this buildservice repository:

http://download.opensuse.org/repositories/home:/poeml:/zypp_11.0/openSUSE_11.0/home:poeml:zypp_11.0.repo

That repository includes the aria2 package right away.

Known bugs

- Bug 478805 - an IPv6 problem which occurs when nscd is not running, and a proxy is defined on "localhost". Workaround: start nscd, or use 127.0.0.1 instead of localhost in the proxy URL.



The Concept

This is a concept for libzypp doing failover when downloading packages from download.opensuse.org.

Mirrors can die at any time, without warning. That's just natural with a heterogeneous farm of mirrors, spread around the world, and operated by 100 or more different organizations. download.opensuse.org, the central contact point for downloads, is not a world-spanning framework in itself, like Google, but is operated by a single organization. It can die, too.

But the download redirector that we use on download.opensuse.org offers some great features that can be complemented by a download client. One of the most important download clients is libzypp, the library that YaST and zypper are based on. One of the features is to provide clients with a dynamic mirror list, instead of picking a single mirror, redirecting the client to it, and letting the download succeed or fail. With such a mirror list, the client can do powerful things like failover (when a mirror has died, is not reachable, times out, ...), select the fastest mirror, or other things.

Here is a concept of how this can be implemented.

Dynamic mirror lists

The following three types of mirror lists are available:

http://download.opensuse.org/distribution/10.3/repo/oss/suse/repodata/primary.xml.gz?mirrorlist

This is mostly for human beings and diagnostic purposes. It shows all mirrors that have the requested file. Note that only mirrors are shown that are known to have the requested file -- not potential mirrors that *could* have it.

http://download.opensuse.org/distribution/10.3/repo/oss/suse/repodata/primary.xml.gz.metalink

This is a more powerful format, which is known by "Metalink" clients. See [1] for information about Metalinks. If you look at the metalink file that download.opensuse.org generates, you'll see that it has a lot of useful information. Below, we'll explore the possibilities that arise from this.

Metalink files use XML and look like this:

<?xml version="1.0" encoding="UTF-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/" 
 origin="http://download.opensuse.org/distribution/11.0/repo/oss/suse/noarch/gbrainy-0.61-31.1.noarch.rpm.metalink" 
 generator="mod_zrkadlo Download Redirector - http://mirrorbrain.org/" type="dynamic" 
 pubdate="Tue, 22 Jul 2008 07:00:50 GMT"  
 refreshdate="Tue, 22 Jul 2008 07:00:50 GMT">
<publisher>
   <name>openSUSE</name>
   <url>http://download.opensuse.org</url>
 </publisher>
 <files>
   <file name="gbrainy-0.61-31.1.noarch.rpm">
     <size>129095</size>
     <verification>
       <hash type="md5">22e2b1a4054ede89c9d5b16ce5492d1e</hash>
       <hash type="sha1">f3771baa660594a90737fe336980ef73bf813858</hash>
       <pieces length="262144" type="sha1">
         <hash piece="0">f3771baa660594a90737fe336980ef73bf813858</hash>
       </pieces>
      </verification>
     <resources>
     <url type="http" location="es" preference="100">http://suse.bifi.unizar.es/opensuse/distribution/11.0/repo/oss/suse/noarch/gbrainy-0.61-31.1.noarch.rpm</url>
     <url type="http" location="se" preference="99">http://mirrors.se.eu.kernel.org/opensuse/distribution/11.0/repo/oss/suse/noarch/gbrainy-0.61-31.1.noarch.rpm</url>
     ... and so on... 
      

Clients that make use of this format are, for instance:

  • aria2c: A powerful command line implementation.
  • DownThemAll: A (platform-independent) Firefox extension.
  • libmetalink: A library written in C language.
  • And there are several others.
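
To illustrate what a client can pull out of such a Metalink file, here is a minimal Python sketch. It is not how libzypp or aria2c do it; the function name parse_metalink and the shortened SAMPLE document are made up for this example, and only the elements visible in the sample above are read.

```python
import xml.etree.ElementTree as ET

# XML namespace declared by the Metalink 3.0 sample above.
NS = {"ml": "http://www.metalinker.org/"}

# Shortened but complete stand-in for the (truncated) sample above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/">
 <files>
   <file name="gbrainy-0.61-31.1.noarch.rpm">
     <size>129095</size>
     <verification>
       <hash type="sha1">f3771baa660594a90737fe336980ef73bf813858</hash>
     </verification>
     <resources>
       <url type="http" location="se" preference="99">http://mirrors.se.eu.kernel.org/opensuse/distribution/11.0/repo/oss/suse/noarch/gbrainy-0.61-31.1.noarch.rpm</url>
       <url type="http" location="es" preference="100">http://suse.bifi.unizar.es/opensuse/distribution/11.0/repo/oss/suse/noarch/gbrainy-0.61-31.1.noarch.rpm</url>
     </resources>
   </file>
 </files>
</metalink>
"""


def parse_metalink(xml_text):
    """Return one dict per <file>: name, size, whole-file hashes, ordered URLs."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    files = []
    for entry in root.findall("ml:files/ml:file", NS):
        # Only the whole-file hashes; piece hashes live under <pieces>.
        hashes = {
            h.get("type"): (h.text or "").strip()
            for h in entry.findall("ml:verification/ml:hash", NS)
        }
        urls = [
            (int(u.get("preference", "0")), (u.text or "").strip())
            for u in entry.findall("ml:resources/ml:url", NS)
        ]
        # Highest preference first; sort() is stable, so ties keep file order.
        urls.sort(key=lambda pair: pair[0], reverse=True)
        size_text = entry.findtext("ml:size", default=None, namespaces=NS)
        files.append({
            "name": entry.get("name"),
            "size": int(size_text) if size_text else None,
            "hashes": hashes,
            "urls": [url for _, url in urls],
        })
    return files


if __name__ == "__main__":
    for item in parse_metalink(SAMPLE):
        print(item["name"], item["size"], item["urls"][0])
```

The URL list comes back ordered by the preference attribute, so a client can simply walk it from the front.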


The third type is a terse text-only format. (It could also be sent as XML, a machine-parsable text file, JSON, or whatever. But a text file is likely the fastest and easiest to parse. It allows the client to simply ignore everything other than the first line -- if nothing goes wrong.)

It looks like this:

# mirrorlist-txt version=1.0
# url baseurl_len mirrorid region:country power
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt 49 22 EU:de 100
http://ftp5.gwdg.de/pub/opensuse/distribution/10.3/repo/oss/GPLv3.txt 33 44 EU:de 100
http://ftp.uni-ulm.de/mirrors/opensuse/distribution/10.3/repo/oss/GPLv3.txt 39 42 EU:de 100
...


The mirror lists are sorted by suitability for that particular client country. The most appropriate mirror (exactly the one that the redirector would otherwise choose to redirect the client to) is right at the top.
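
Parsing the text format takes only a few lines. The following Python sketch assumes the five whitespace-separated fields named in the header comment of the example above; the Mirror type and the function name parse_mirrorlist are invented for illustration.

```python
from typing import NamedTuple


class Mirror(NamedTuple):
    url: str        # ready-made URL of the requested file on this mirror
    base_url: str   # first baseurl_len characters of the URL
    mirror_id: int
    region: str
    country: str
    power: int


def parse_mirrorlist(text):
    """Parse a mirrorlist-txt body into Mirror records, keeping server order."""
    mirrors = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# ..." header lines
        fields = line.split()
        if len(fields) != 5:
            continue  # not a mirror entry in the assumed layout
        url, baseurl_len, mirror_id, location, power = fields
        region, _, country = location.partition(":")
        mirrors.append(Mirror(url, url[:int(baseurl_len)], int(mirror_id),
                              region, country, int(power)))
    return mirrors


EXAMPLE = """\
# mirrorlist-txt version=1.0
# url baseurl_len mirrorid region:country power
http://ftp.hosteurope.de/mirror/ftp.opensuse.org/distribution/10.3/repo/oss/GPLv3.txt 49 22 EU:de 100
http://ftp5.gwdg.de/pub/opensuse/distribution/10.3/repo/oss/GPLv3.txt 33 44 EU:de 100
"""

if __name__ == "__main__":
    for mirror in parse_mirrorlist(EXAMPLE):
        print(mirror.mirror_id, mirror.base_url)
```

Because the server already sorts the list, the parser keeps the order untouched; the first record is the mirror the redirector would have chosen.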

To try this out and get a feel for it, try a request like this:

curl -H "Accept: application/mirrorlist-txt" http://download.opensuse.org/distribution/10.3/repo/oss/suse/repodata/primary.xml.gz

This example introduces another goodie: transparent negotiation of the mirror list. An HTTP header sent along indicates the format(s) which the client could accept. See below for more about it.

How it works (Step 1)

The client will get this list by requesting the file like it normally would (http://host/path/to/file), but with the following request header added:

Accept: application/mirrorlist-txt

Thereby the client indicates its wish (and ability) to get a mirror list for the requested file.

If it makes sense for that file, the server will return the zypp mirrorlist. If so, it'll come with MIME type application/mirrorlist-txt so that it is distinguishable from other responses. If not, the server returns the requested file.
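
As a rough sketch of the client side of this negotiation, the Python below builds such a request and tells a mirror list apart from a regular file by its MIME type. The helper names build_request and is_mirrorlist are hypothetical, and the handling of a possible "; charset=..." suffix is an assumption, since the proposal does not say whether the server sends one.

```python
import urllib.request

MIRRORLIST_TYPE = "application/mirrorlist-txt"


def build_request(url):
    """Request the file as usual, but announce that a mirror list is acceptable."""
    return urllib.request.Request(url, headers={"Accept": MIRRORLIST_TYPE})


def is_mirrorlist(content_type):
    """True if a Content-Type header value denotes a mirror list.

    Parameters after ';' (e.g. a charset) are ignored; that the server
    might send any is an assumption of this sketch.
    """
    media_type = (content_type or "").split(";")[0].strip().lower()
    return media_type == MIRRORLIST_TYPE


if __name__ == "__main__":
    req = build_request(
        "http://download.opensuse.org/distribution/10.3/repo/oss/suse/repodata/primary.xml.gz"
    )
    # Nothing is sent here; urllib.request.urlopen(req) would perform the request.
    print(req.get_header("Accept"))
```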

The client indicates its ability with every request, and it will get a list each time, if the redirector decides so. This is important because

  • the redirector offers file-level granularity
  • we need that fine granularity
  • all requests can be the same, without "special" requests (like, say, append query string to URL -> get list -> pick mirror -> fetch file). Thus the client stays 100% compatible and HTTP compliant and works with different servers.

The client needs to be able to handle three possible cases:

  • 200 OK, Content-Type != application/mirrorlist-txt: receive the file. (Like today.)
  • 200 OK, Content-Type == application/mirrorlist-txt: follow first URL.
    • in case of failure, follow the second URL from the mirror list.
    • in case of failure, try next. And so on.
  • 302 Found: follow the Location header (standard redirect -- like happening today.)

At present, when a mirror doesn't work (times out, unexpectedly doesn't have a file, or returns something broken), the client will simply try again or give up. With this proposal, it'll transparently try other mirrors and will likely succeed.
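
The three cases and the failover loop could look roughly like the Python sketch below. It is only an illustration of the idea, not libzypp code: the function names are invented, the actual transfer is passed in as a fetch callable (assumed to raise OSError on any failure), and header names are looked up exactly as spelled.

```python
MIRRORLIST_TYPE = "application/mirrorlist-txt"


def candidate_urls(mirrorlist_text):
    """Yield the URL (first field) of every non-comment line, in server order."""
    for line in mirrorlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            yield line.split()[0]


def resolve(status, headers, body, fetch):
    """Turn the redirector's reply into file content.

    status/headers/body describe the reply to the initial request;
    fetch(url) must return the file content as bytes or raise OSError.
    """
    if status == 302:
        # Standard redirect, as today: follow the Location header.
        return fetch(headers["Location"])
    if status != 200:
        raise OSError("unexpected status %d" % status)
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type != MIRRORLIST_TYPE:
        # The server sent the file itself.
        return body
    last_error = None
    for url in candidate_urls(body.decode("utf-8")):
        try:
            return fetch(url)
        except OSError as exc:
            last_error = exc  # this mirror failed, try the next one
    raise OSError("all listed mirrors failed") from last_error
```

Passing fetch in as a parameter keeps the sketch independent of any particular HTTP library and makes the failover order easy to test with a fake.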

How it works (Step 2)

So far, we assumed a healthy and reachable download.opensuse.org. But dealing with a non-reachable download.opensuse.org (or one returning garbage) is also possible. For that, the client should:

  • do all the above, and in addition save every base URL which it used into a little cache file. For "hard times" :-) That's why the mirror list doesn't only contain URLs, but also a number which indicates the length of the URL pointing to the "repository base".
  • in case of failing to get a reply from the redirector (timeout, garbage), the client uses one of the cached base URLs and tries to get the file from there.

Thus, this enables the client to fall back to autonomously trying mirrors, and finding what it is looking for. This would be great, wouldn't it? Right now, the client gives up when download.opensuse.org is not available -- even though there are mirrors in abundance that it could fall back to.
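
A small cache of base URLs could be sketched in Python as follows. The class name BaseUrlCache, the JSON file format and the "most recently used first" ordering are all assumptions made for this example; the proposal only asks for "a little cache file".

```python
import json
import os


class BaseUrlCache:
    """Remember mirror base URLs that were used, most recent first."""

    def __init__(self, path):
        self.path = path
        self.bases = []
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.bases = json.load(fh)

    def remember(self, url, baseurl_len):
        """Store the base part of a mirror URL taken from a mirror list line."""
        base = url[:baseurl_len]
        if base in self.bases:
            self.bases.remove(base)
        self.bases.insert(0, base)
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.bases, fh)

    def fallback_urls(self, relative_path):
        """Candidate URLs to try when the redirector itself is unreachable."""
        rel = relative_path.lstrip("/")
        return [base.rstrip("/") + "/" + rel for base in self.bases]
```

With the cache filled from earlier mirror lists, fallback_urls("distribution/10.3/repo/oss/suse/repodata/primary.xml.gz") yields one URL per remembered mirror, which the client can then try in turn.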


How it works (Step 3 - nice to have)

In addition, the mirror lists (and the client-side little cache of base URLs and mirror ids) make way for some other interesting features. Like:

  • optionally run a speed test and choose the fastest mirror
  • the client could download from multiple mirrors simultaneously

Moreover, the client could implement its own scheme of mirror selection. This could be, for instance, "always use mirror XY if possible". It is important for this to be configurable on the client side, because there is no reliable way for the server to find out which mirror a client prefers -- not even by network prefix. Of course, the client could send its preferences along with the request to the server -- but it is much easier to keep them on the client side, as there is no need to burden the server with this information.

I would not underestimate the desire of users to use a fixed, certain, preferred mirror, especially a local one!

Thus, the client could

  • prefer a certain mirror, every time it is among the listed mirrors
  • blacklist a certain mirror, so it is never tried
  • use the geographical coordinates of the mirrors (which could be included in the list) to implement a more sophisticated scheme which we can't do on the server side. (There is no scalable and reliable way to know about the client's location from the server side.) The client can know about its location, after having been configured with that data.
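
How such client-side preferences might be applied to a received mirror list is sketched below in Python. Everything here is hypothetical: the mirror records are plain dicts with a "host" key, the optional "lat"/"lon" keys stand in for coordinates that the list format does not carry today, and the distance is a simple great-circle estimate.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def order_mirrors(mirrors, preferred=None, blacklist=(), client_pos=None):
    """Reorder a server-sorted mirror list according to local preferences.

    mirrors: list of dicts with "host" and optionally "lat"/"lon".
    client_pos: optional (lat, lon) tuple configured on the client.
    """
    usable = [m for m in mirrors if m["host"] not in blacklist]
    if client_pos is not None:
        def distance(mirror):
            if "lat" not in mirror or "lon" not in mirror:
                return math.inf  # unknown location: keep behind located mirrors
            return great_circle_km(client_pos[0], client_pos[1],
                                   mirror["lat"], mirror["lon"])
        usable.sort(key=distance)
    # Stable sort: the preferred mirror moves to the front, the rest keep order.
    usable.sort(key=lambda m: m["host"] != preferred)
    return usable
```

Without any preferences configured, the function returns the list in the order the server sent it, so the server's choice remains the default.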

An additional idea for remote users with Internet connectivity (think Indonesia, New Zealand, ...): avoid the additional round-trip which occurs for each request, by optionally checking download.opensuse.org only once per refresh, and using a mirror directly after that. This would use the "mirrorlist request" only once, on a crucial file in the repository, in order to locate mirrors which host this repository.

Performance considerations

For the server, serving mirror lists doesn't make a difference; generating the list is virtually as cheap as returning a redirect. The database query that we use is exactly the same.

On the client side (which is waiting for network I/O) it won't matter if it needs to parse the list in addition. After all, it is very simple to just pick a mirror out of the list, which is all that the client needs to do, at a minimum.

In fact, it requires nothing more than reading the first line from the mirror list, and extracting the URL from it -- as long as nothing goes wrong. The mirror list is already sorted adequately. And the URL is the ready-made URL to be used.

Just like the client reads the server reply headers now, sees the Location: header, grabs the URL from that and follows it, it would read the mirror list, grab the URL from the first line, and follow it.

That's really all there is to it.

In case of an error then, it can simply try the next one.

Thus, performance won't be an issue.


Example scenarios

Not exhaustive.

Mirror breakage:

  • ftp5.gwdg.de is not available for 5 minutes, because a router is rebooted. (Happened this afternoon). After 1-3 minutes the redirection to that mirror is disabled, since our probing notices it. But in the time window before disabling, clients are sent there, and get timeouts. No amount of hardware redundancy on the server will solve this "three minute window". Nor will retrying. The above proposal will make the client seamlessly try the next mirror, and succeed.
  • Repository inconsistency introduced by various means, best described as "something messed up and now the md5sum of a patch file doesn't match". Some mirrors might have the old, broken file. Some mirrors might already have the new, repaired file. (This scenario can happen only if a kind of file is modified which is normally not supposed to change without a version number increase, and thus is normally safe to redirect to mirrors for.) An example is https://bugzilla.novell.com/show_bug.cgi?id=352719. The above proposal can make the client try another mirror and (if possible) retrieve the good file from elsewhere.
  • A German university hosting a mirror installs a new firewall. Its connection tracking is broken, which causes random connection resets for about 5% of downloaded files. The client does see HTTP 200 OK replies, but ends up with partial files that don't match the repository metadata. In this particular case, retrying to download from the same mirror would eventually succeed (due to the randomness of the problem), but retrying the download from another mirror would have succeeded immediately.

    (I'm not making this up. This scenario happened twice. The first time, it took me 3 weeks to debug it because there were only very vague reports. The second time it took only 3 hours. This bug must have caused lots of grief in 3 weeks.)
  • A university hosting a mirror switches from Apache httpd to another web server, e.g. BigFoot httpd. That server has broken support for range requests (byte-range requests = parts of files that are requested by clients). The responses from the server have seemingly random corruption, caused by the introduction of '\r\n' fragments into the data stream, leading to corrupted downloads for the client. A download via Metalinks that contain hashes for verification allows for transparent correction of the problem by redownloading the corrupted parts from another mirror. (The name of the web server was not BigFoot in reality; the name was changed out of respect for that software, so as not to put a stigma on it.)

There will always be breakage which can't be anticipated. Trying another mirror will help in many cases.
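
For the corruption scenarios, the piece hashes that a Metalink carries make it possible to find out which parts of a download are bad. The Python sketch below shows the idea for SHA-1 piece hashes as in the sample Metalink; the function name corrupted_pieces is invented, and refetching the bad pieces from another mirror is left out.

```python
import hashlib


def corrupted_pieces(data, piece_length, piece_hashes):
    """Return the indexes of pieces whose SHA-1 does not match the Metalink.

    data: the downloaded bytes; piece_length: the <pieces length="...">
    value; piece_hashes: hex digests in piece order.
    """
    bad = []
    for index, expected in enumerate(piece_hashes):
        chunk = data[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(chunk).hexdigest() != expected.lower():
            bad.append(index)
    return bad


if __name__ == "__main__":
    good = b"a" * 10 + b"b" * 10
    hashes = [hashlib.sha1(good[:10]).hexdigest(),
              hashlib.sha1(good[10:]).hexdigest()]
    # Simulate stray '\r\n' bytes overwriting part of the second piece.
    damaged = good[:12] + b"\r\n" + good[14:]
    print(corrupted_pieces(damaged, 10, hashes))
```

Only the pieces reported as bad would need to be requested again, ideally from a different mirror than the one that delivered them.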

Server breakage:

  • When a download.o.o outage happens, clients can't check for security and other updates. With a cached list of previously used mirror base URLs, the client can use mirrors as fallback.
  • New installations are a different issue, because they don't have "previously used mirrors" yet. Either they retry later (to set up the security update repository, or other repositories), or we may be able to find another solution. Well, a hardcoded second redirector URL would obviously help, if we take it seriously. But it would obviously need a second redirector. The alternative could be to have the most current list of mirrors, at the time of release, included (hardcoded) in the installation program, to be used instead of the one downloaded from the redirector. Even if it is obsolete after some time, it is a better choice than no mirror list at all.

It is important to understand that the fallback is really only a fallback, because mirrors are never 100% exact replicas. For some repositories this matters more (buildservice), for some less (10.3 repo). Security updates are somewhere in between. In the end, only download.o.o delivers files with exact caching control headers, making sure that a client doesn't see stale content. Content from any mirror could have been cached for arbitrary time. And stale content doesn't only mean "client doesn't see the update yet". It can also mean "those files don't fit together" and the client will complain about a (seemingly) broken repository.

Moreover, for security reasons it might be more reasonable to preconfigure clients only with one or two "trusted" mirrors, and not with the full list. It might be preferable to delay updates on the client until the origin server is reachable again, rather than exposing the client to rogue mirrors. It is noteworthy that the origin server (download.opensuse.org) does not redirect for metadata and signatures to any mirror, so the clients always get those critical files from the origin site. If clients are using mirrors directly, the tradeoff is that they are less secure.