Scaling a download infrastructure with your success
When an organization offers files for download and its audience grows, the infrastructure may need to serve more requests than the organization alone can technically handle. One way to scale is to pay for the services of a content delivery network (CDN). This is often not affordable, especially for Open Source projects. Another way is to build up a mirror infrastructure. But choosing an appropriate mirror is often left to the user, and mirrors might be out of date, incomplete, or unreliable.
This talk shows how to build a poor man's CDN from plain mirror servers and a redirecting Apache HTTP server. Using download.openSUSE.org as an example, it shows how mirrored content can be transparently integrated into the web service, and how requests are redirected to geographically close mirrors using IP geolocation. The demonstrated infrastructure comprises tools that scan mirrors and keep track of the files present on them in a MySQL database, active monitoring of mirrors, and mod_zrkadlo, a redirecting Apache module built upon the DBD framework.
Along the way, the talk discusses different approaches to geographical redirection of bulk data, load sharing with weighted randomization, high availability and scaling issues, mirror stickiness, and client-side robustness. It also covers problems encountered in real life, their solutions, and open questions.
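
To make the idea of geographical redirection with weighted randomization concrete, here is a minimal Python sketch. It is not the mod_zrkadlo implementation: the Mirror record, the function names, and the example.org URLs are all hypothetical, the file list is held in memory rather than in a database, and the client's country is assumed to have been determined already by an IP geolocation lookup. Mirror weights are assumed to be positive.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Mirror:
    """Hypothetical mirror record; field names are assumptions."""
    base_url: str      # where the mirror serves the tree from
    country: str       # two-letter country code of the mirror
    weight: int        # relative capacity, assumed > 0
    files: frozenset   # paths known (from a scan) to be present


def choose_mirror(mirrors, path, client_country, rng=random):
    """Pick a mirror that has the file, preferring the client's country."""
    # Only mirrors known to carry the file are eligible.
    candidates = [m for m in mirrors if path in m.files]
    if not candidates:
        return None  # caller should deliver the file itself
    # Prefer mirrors in the client's country; otherwise use any candidate.
    nearby = [m for m in candidates if m.country == client_country]
    pool = nearby or candidates
    # Weighted randomization: higher weight means a larger share of requests.
    return rng.choices(pool, weights=[m.weight for m in pool], k=1)[0]


def redirect_url(mirrors, path, client_country, rng=random):
    """Return the URL to redirect to, or None if no mirror has the file."""
    path = path.lstrip("/")
    mirror = choose_mirror(mirrors, path, client_country, rng)
    if mirror is None:
        return None
    return mirror.base_url.rstrip("/") + "/" + path
```

In this sketch a mirror with weight 300 receives, on average, three times as many requests as a mirror with weight 100 in the same country, and a None result stands for the case where the origin server has to deliver the file itself.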