Tutorial: Transparent Proxying

Ordinarily, when using Squid on a network to cache web traffic, browsers must be configured to use the Squid system as a proxy. This type of configuration is known as traditional proxying. In many environments, this is simply not an acceptable method of implementation. Therefore Squid provides a method to operate transparently, which means users do not even need to be aware that a proxy is in place. Web traffic is redirected from port 80 to the port where Squid resides, and Squid acts like a standard web server for the browser.

Using Squid transparently is a two-part process: first, Squid must be configured to accept non-proxy requests, and second, web traffic must be redirected to the Squid port. The first part is performed in the Squid module, while the second can be performed in the excellent third-party IPChains module by Tim Niemueller, or from the command line using ipchains. This assumes you are using Linux; on other systems, consult the Transparent Caching/Proxying entry in the Squid FAQ.

Configuring Squid for Transparency

In order for Squid to operate as a transparent proxy, it must be configured to accept normal web requests rather than (or in addition to) proxy requests. This section covers that part of the process: the console configuration is explained in the text, and the Webmin configuration is shown in the figure below.
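To make the difference between a "normal web request" and a proxy request concrete, here is a small sketch. A browser configured for a proxy sends the full URL on the request line, while a browser that believes it is talking to a web server sends only the path and names the server in a Host header. The request_url function below is a hypothetical helper written for this illustration, not part of Squid; it only mimics the idea of rebuilding the URL from the Host header and assumes plain HTTP on the default port.

```shell
#!/bin/sh
# request_url: hypothetical helper, for illustration only.
# Reads one HTTP request header block on stdin and prints the URL it refers to.
request_url() {
    # Strip carriage returns so header lines can be compared cleanly.
    tr -d '\r' | awk '
        NR == 1                { target = $2 }   # second word of the request line
        tolower($1) == "host:" { host = $2 }     # server named by the browser
        $0 == ""               { exit }          # blank line ends the headers
        END {
            if (target ~ /^http:\/\//) {
                # Proxy-style request: the URL is already complete.
                print target
            } else if (host != "" && target != "") {
                # Web-server-style request: rebuild the URL from the Host header.
                print "http://" host target
            } else {
                exit 1
            }
        }
    '
}
```

For example, feeding it the two lines "GET /index.html HTTP/1.0" and "Host: www.example.com" yields http://www.example.com/index.html, the same URL a proxy-configured browser would have sent directly on the request line.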

As root, open the squid.conf file in your favorite text editor. Its location depends on your operating system and the method of installation. Usually it is found in either /usr/local/squid/etc, when installed from source, or /etc/squid, on Red Hat style systems. First, take note of the http_port option, which sets the port Squid listens on. By default, this is port 3128, but you may change it if necessary. Next, configure the following options as shown:
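If you are unsure which of these locations applies to your system, a short script can check both and report the configured port. This is only a sketch: find_squid_conf and squid_port are hypothetical helper names, the two directories are the ones mentioned above, and the port lookup assumes a simple "http_port NUMBER" line, falling back to the default of 3128 when no active line is found.

```shell
#!/bin/sh
# find_squid_conf: print the first squid.conf found in the given directories.
# With no arguments, check the two common locations named in the text.
find_squid_conf() {
    [ "$#" -gt 0 ] || set -- /usr/local/squid/etc /etc/squid
    for dir in "$@"; do
        if [ -f "$dir/squid.conf" ]; then
            printf '%s\n' "$dir/squid.conf"
            return 0
        fi
    done
    return 1
}

# squid_port: print the value of the first active http_port line,
# or the default 3128 if the option is absent or commented out.
squid_port() {
    port=$(awk '$1 == "http_port" { print $2; exit }' "$1")
    printf '%s\n' "${port:-3128}"
}

# Example use (left as a comment so sourcing this file has no side effects):
#   conf=$(find_squid_conf) && squid_port "$conf"
```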

httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

These options, as described in the Miscellaneous Options section of this document, configure Squid as follows. httpd_accel_host virtual causes Squid to act as an accelerator for any number of web servers, meaning that Squid will use the request header information to figure out what server the user wants to access, and that Squid will behave as a web server when dealing with the client. httpd_accel_port 80 configures Squid to send out requests to origin servers on port 80, even though it may be receiving requests on another port, 3128 for example. httpd_accel_with_proxy on allows you to continue using Squid as a traditional proxy as well as a transparent proxy. This isn't always necessary, but it does make testing a lot easier when you are trying to get transparency working, as discussed later in the troubleshooting section. Finally, httpd_accel_uses_host_header on tells Squid to determine which server to fetch content from based on the host name found in the request header. This option must be configured this way for transparency.
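After editing, it can be handy to confirm that all four directives are present with the values shown. The following is a minimal sketch of such a check; check_transparent_conf is a hypothetical helper name, and the script only looks for active lines whose first two words match, so it ignores commented-out entries and does not validate the rest of the file.

```shell
#!/bin/sh
# check_transparent_conf: report any of the four transparency directives
# that are missing (or set differently) in the given squid.conf.
# Returns 0 when all four are present, 1 otherwise.
check_transparent_conf() {
    conf=$1
    missing=0
    while read -r name value; do
        # Look for an uncommented line whose first two fields match.
        if ! awk -v n="$name" -v v="$value" \
                '$1 == n && $2 == v { found = 1 } END { exit !found }' "$conf"
        then
            printf 'missing: %s %s\n' "$name" "$value"
            missing=1
        fi
    done <<'EOF'
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
EOF
    return "$missing"
}

# Example use (path is one of the common locations; adjust for your system):
#   check_transparent_conf /etc/squid/squid.conf && echo "transparency options set"
```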

IPChains Configuration For Transparent Proxying

The IPChains portion of our transparent configuration is equally simple. The goal is to hijack all outgoing network traffic on the www port (that's port 80, to be numerical about it). IPChains, in its incredible power and flexibility, allows you to do this with a single command line or a single rule. Again, the configuration is shown and discussed for both the Webmin interface and the console.

# /sbin/ipchains -I input 1 -s 192.168.1.0/24 -d 0/0 80 -p tcp -i eth0 -j REDIRECT 3128

While a detailed description of the ipchains tool is beyond the scope of this section, a brief explanation of this configuration is in order. First, the -I input 1 portion of the command inserts the rule at the first position of the input chain. Next, the -s 192.168.1.0/24 portion defines whose requests will be acted upon; in this case, ipchains will work on all packets originating from the 192.168.1.0/24 network. The -d 0/0 80 portion then defines the destination as any address, on port 80. Then comes the choice of protocol to act upon; here we've chosen TCP with the -p tcp section. The interface to act upon is configured with -i eth0. Finally, the -j REDIRECT 3128 portion tells ipchains what to do with packets that match these criteria: redirect them to local port 3128, where Squid is listening.
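Because the network, interface, and Squid port will differ from site to site, it can help to assemble the rule from variables. The sketch below does that with a hypothetical redirect_rule helper; it only prints the command rather than running it, so you can inspect the result before executing it as root. The values passed in the example are the ones used above and should be replaced with your own.

```shell
#!/bin/sh
# redirect_rule: print the ipchains command that redirects port 80 traffic
# from a source network, arriving on an interface, to the local Squid port.
#   $1  source network, e.g. 192.168.1.0/24
#   $2  interface,      e.g. eth0
#   $3  Squid port,     e.g. 3128 (the http_port value in squid.conf)
redirect_rule() {
    src_net=$1
    iface=$2
    squid_port=$3
    printf '/sbin/ipchains -I input 1 -s %s -d 0/0 80 -p tcp -i %s -j REDIRECT %s\n' \
        "$src_net" "$iface" "$squid_port"
}

# Example use (prints the command only; nothing is changed on the system):
#   redirect_rule 192.168.1.0/24 eth0 3128
```

Printing the rule first is a deliberate choice here: a mistyped network or interface in a redirect rule can cut off web access for the whole network, so reviewing the command before running it is worthwhile.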