Tuesday, June 28, 2005

Portsnap and Squid

At BSDCan this year I listened to Kris Kennaway describe the FreeBSD package cluster (.pdf). He said he uses a caching Web proxy to optimize retrieval of source code when building packages. This makes an incredible amount of sense. Why download the same archive repeatedly from a remote site when you can download it once, and transparently let other clients retrieve the archive from the Web cache?

I decided I needed to emulate this sort of environment for several of my FreeBSD systems. I use Colin Percival's excellent portsnap to keep my FreeBSD ports tree up-to-date. If one of my systems retrieves the necessary updates through a Web cache, the other systems can get the same files from the Web cache. That saves Colin bandwidth and me time.

I set up Squid using the www/squid port. The only changes I made to the /usr/local/etc/squid/squid.conf file are listed below.

icp_port 0
acl our_networks src
http_access allow our_networks

The first line, the http_port directive, tells Squid to listen on the internal private interface of a dual-homed VPN concentrator / firewall / gateway (CFG). I didn't want the external interface of the CFG offering a Web proxy to the world. I set the standard Squid port, 3128 TCP. The second line shuts down the ICP service on port 3130 UDP, since I don't use it. Line three sets up an ACL and defines the networks I allow to talk to the Squid proxy. I tell Squid to allow the IP addresses of remote systems connecting via IPSec tunnel (addressed in space), and systems on the internal network provided by the VPN CFG (addressed in space). The last line enables the ACL.
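For anyone reconstructing these changes, here is a minimal sketch of the four directives described above. Every address in it is a hypothetical placeholder chosen for illustration, not one of the real addresses; substitute your own internal interface and client networks.

```
# squid.conf sketch; all addresses below are hypothetical placeholders

# Listen only on the internal interface, standard Squid port 3128 TCP
http_port 10.0.0.1:3128

# Disable the ICP service (3130 UDP)
icp_port 0

# Networks allowed to use the proxy: IPSec tunnel clients, then the internal network
acl our_networks src 172.16.0.0/24 10.0.0.0/24

# Put the ACL into effect
http_access allow our_networks
```

Squid evaluates http_access rules in order, so the allow line should sit above the final "http_access deny all" rule in squid.conf.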

Before Squid can operate, I tell it to build its cache directories by running 'squid -z' as root.

Finally I edit my /etc/rc.conf file with this entry:


With this line added, I can use the /usr/local/etc/rc.d/squid.sh shell script to 'start' or 'stop' or 'restart' Squid. Prior to this I did not know that I needed to modify /etc/rc.conf before the squid.sh script would work.

Once Squid was running and listening on port 3128 TCP, I had a system use the proxy through portsnap. First I set the environment variable that tells fetch to use a proxy.

setenv http_proxy
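setenv is csh/tcsh syntax. For sh-style shells, the equivalent would look something like the sketch below. The proxy host name is a made-up placeholder, and 3128 is the Squid port chosen earlier; fetch accepts either a URL or a host:port pair in this variable.

```sh
# proxy.example.internal is a hypothetical name for the box running Squid
http_proxy="http://proxy.example.internal:3128"
export http_proxy
```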

When I ran 'portsnap fetch', I could see the program working through Squid to get the files it needed. Here is an excerpt from the /usr/local/squid/logs/access.log file.

1119988542.131 142 TCP_MISS/200 668 GET http://portsnap.daemonology.net/t/
- DIRECT/ text/plain

This is a TCP_MISS because this request is new to Squid.

Later I ran portsnap on two other systems and saw different Squid results.

1119988702.146 95 TCP_MEM_HIT/200 677 GET http://portsnap.daemonology.net/t/
- NONE/- text/plain

1119989557.080 99 TCP_MEM_HIT/200 678 GET http://portsnap.daemonology.net/t/
- NONE/- text/plain

In both cases, Squid provided the requested files from its cache, and didn't need to request the files from daemonology.net.
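To see at a glance how often the cache is doing its job, the result codes can be tallied. This is a rough sketch, not part of the original setup: the function name tally_squid_results is my own, and it assumes Squid's native access.log format, where the fourth whitespace-separated field is the result code and HTTP status (for example TCP_MISS/200), after the timestamp, elapsed time, and client address.

```sh
#!/bin/sh
# tally_squid_results is a hypothetical helper name.
# Reads native-format Squid access.log lines on stdin and prints
# each result code (the part of field 4 before the slash) with its count.
tally_squid_results() {
    awk '{
             split($4, r, "/")      # "TCP_MISS/200" -> r[1] = "TCP_MISS"
             count[r[1]]++
         }
         END {
             for (k in count) print k, count[k]
         }' | sort
}
```

Run as 'tally_squid_results < /usr/local/squid/logs/access.log'. A growing share of TCP_HIT and TCP_MEM_HIT lines relative to TCP_MISS suggests the other systems are being served from the cache.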

I plan to use this system to let remote sensors perform all of their updates through a central location, my VPN CFG system. That one box will run the Squid proxy and retrieve all necessary files once.


Jim Vanderveen said...

Excellent post, Richard! I run a FreeBSD shop using portsnap. Using squid to cache all the updates is perfect.

I've never set up squid before, so the detailed instructions will be a real timesaver for me. Thanks!

Mark Shearar said...

Exactly what I was looking for. Thanks Richard!
