At work, I now have around 50 desktops running Ubuntu and around 40 servers (including customers' machines) also running Ubuntu. As you can imagine, a single security update of X represents a lot of bandwidth usage, not to mention the Hardy upgrade! We started to look at different solutions to optimize our precious bandwidth.
Some searching gave:
- local mirror: ouch… this is a bit too much for us 🙂
- squid usage: good, but you have to tweak your squid installation quite a lot to keep .deb files in the cache, and squid can expire .deb files even when they are still valid.
- apt-proxy/apt-cacher/apt-cacher-ng: they all look good but… you have to modify your client configuration. As I am lazy, I don't want to do that (and also because I have mobile users who should only use the cache when they are on the corporate network). Among the three, I chose apt-cacher, based purely on some reading on the web… the others may be just as good!
We selected the combination squid + apt-cacher + jesred. Let's have a look at each component:
- apt-cacher: caches .deb files and the Packages/Sources indexes. You can also import data from another source (for example from a CD-ROM).
- squid: THE proxy. We use it as a transparent proxy in our case.
- jesred: rewrites URLs inside squid and redirects Ubuntu archive accesses to apt-cacher.
The installation described below was done on Ubuntu 8.04. The machine is a Xen virtual machine (I'll talk about Xen another time ;-)). All the software comes from the Ubuntu repositories: squid is in main, the other packages are in universe (make sure universe is enabled). Installation and configuration are really easy!
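For reference, a universe line in /etc/apt/sources.list looks like this (Hardy shown; adjust the release name to yours):
deb http://archive.ubuntu.com/ubuntu hardy universe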
squid installation
# apt-get install squid
squid configuration
Edit /etc/squid/squid.conf and add to the ACL definitions:
acl mylan src 10.0.0.0/255.255.0.0
Allow traffic from your network:
http_access allow mylan
You can now test your squid. It should be operational.
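A quick way to check from a client on the LAN, assuming squid listens on its default port 3128 (proxyhost is a placeholder for your proxy machine):
$ http_proxy=http://proxyhost:3128/ wget -O /dev/null http://www.ubuntu.com/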
apt-cacher installation
# apt-get install apt-cacher
I just changed the admin_email value in /etc/apt-cacher/apt-cacher.conf.
As a quick test, set the http_proxy environment variable and try to use apt. Everything should go through the cache (check the logs).
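For example, assuming apt-cacher runs locally on its default port 3142:
# export http_proxy=http://localhost:3142/
# apt-get update
The requests should then show up in apt-cacher's logs (under /var/log/apt-cacher/ by default).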
jesred installation
# apt-get install jesred
jesred configuration
Edit /etc/jesred.acl to authorize your network (just add your LAN at the end of the file).
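The file takes one network per line (IP/netmask); matching the 10.0.0.0/16 LAN from the squid ACL above, the entry would be:
10.0.0.0/255.255.0.0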
Edit /etc/jesred.rules and add:
regex ^http://((.*)archive.ubuntu.com/ubuntu/(dists|pool)/.*)$    http://localhost:3142/\1
regex ^http://(security.ubuntu.com/ubuntu/(dists|pool)/.*)$    http://localhost:3142/\1
I have also added two abort rules (jesred passes matching URLs through unrewritten) so that update-manager keeps working:
abort .gpg
abort ReleaseAnnouncement
Last but not least, the glue between all the elements:
Edit /etc/squid/squid.conf and add:
redirect_program /usr/lib/squid/jesred
Finished! Your squid now redirects all requests for *archive.ubuntu.com and security.ubuntu.com to apt-cacher. Happy installations / upgrades!
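To verify the whole chain, fetch an archive URL through squid and watch apt-cacher's log (again, proxyhost stands for your proxy machine; the log path is apt-cacher's default):
$ http_proxy=http://proxyhost:3128/ wget -O /dev/null http://archive.ubuntu.com/ubuntu/dists/hardy/Release
$ tail /var/log/apt-cacher/access.log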
After implementing the above configuration, I seem to be getting a lot of corruption errors:
Wed Oct 22 11:53:18 2008|127.0.0.1| ALARM! /var/cache/apt-cacher/packages/archive.ubuntu.com_ubuntu_dists_intrepid_universe_binary-i386_Packages.bz2 file size mismatch (found 212992, expected 4541620). Renaming to /var/cache/apt-cacher/packages/archive.ubuntu.com_ubuntu_dists_intrepid_universe_binary-i386_Packages.bz2.corrupted.
It seems that one solution is to limit the number of redirector processes spawned by squid with the following configuration:
redirect_children 1
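So the redirector part of /etc/squid/squid.conf ends up as:
redirect_program /usr/lib/squid/jesred
redirect_children 1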
You don’t need jesred.
Something like the following in squid.conf does the trick just fine (substitute the hostname of your apt cache for “aptcacher” everywhere):
cache_peer aptcacher parent 3142 7 proxy-only no-query no-netdb-exchange connect-timeout=15
acl aptget browser -i apt-get apt-http apt-cacher apt-proxy
acl deburl urlpath_regex /(Packages|Sources|Release|Translations-.*)(\.(gpg|gz|bz2))?$ /pool/.*\.deb$ /(Sources|Packages)\.diff/ /dists/[^/]*/[^/]*/(binary-.*|source)/.
cache_peer_access aptcacher allow aptget
cache_peer_access aptcacher allow deburl
cache_peer_access aptcacher deny all
never_direct allow aptget
never_direct allow deburl
never_direct deny all
The urlpath_regex may be too broad and/or too narrow and/or unnecessary, based on your exact needs.
FWIW, my experience with apt-cacher-ng has been a lot better than with apt-cacher (which used a lot of resources in addition to being slow and unstable).
Hi Andras,
Thanks for your comment. I'm not a squid expert, but the rules you gave don't quite do the job: they catch all repositories (maybe we can check more than urlpath_regex). But it looks promising.
As for apt-cacher-ng, I'll give it a try. Lots of comments seem to indicate it is a good competitor. I have to admit that I did not have problems with apt-cacher on my Ubuntu Hardy, and as it is used by the whole company, I haven't touched it since.
The urlpath_regex acl is really only intended for non-apt clients, such as ordinary browsers, that browse repositories. You can of course remove the requirement for .debs to be under a pool/ directory to make the matching broader, for example.
APT and its ilk should be detected by the “aptget” browser acl and thus redirected to the apt proxy.
BTW, with the appropriate refresh_pattern (e.g. by matching Packages and .deb and giving them high expiry times) you may be able to dispense with a dedicated apt proxy entirely as squid will happily cache the package files itself.
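A sketch of what that could look like, with illustrative (untuned) numbers; these lines must come before squid's default refresh_pattern entries, since the first matching pattern wins:
# .deb files never change once published: cache for up to 90 days (129600 minutes)
refresh_pattern -i \.deb$ 129600 100% 129600
# index files change often: re-check them quickly
refresh_pattern -i (Packages|Sources)(\.(gz|bz2))?$ 0 20% 60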
Thanks Andras Korn. I have created a how-to based on the rules you gave, on my website: “Using apt-cacher-ng to handle deb files instead of squid” (http://portablejim.site-hosts.net/tips/95-squidandaptcacherng.html)
Thanks so much for the article. Great.