Enhancing your privacy further with Squid and Tor

In my last post I described how to use Squid and Privoxy to enhance your privacy while surfing the internet. We want to push this a little bit further by adding onion routing with Tor. If you’ve heard of Tor but don’t really know how it works, I suggest reading the Tor overview first.

Based on my last post we’ll build an even longer chain of proxies: a request from our browser is sent to Squid and handed to Privoxy, which in turn hands it to Tor. Then the request makes its way through the onion router network and finally reaches the web server. The performance of the onion router network may be bad and sometimes worse. Unless you really care about your anonymity (who doesn’t ;) ) I suggest not using Tor. It’s not one of those things you just do for fun because it doesn’t hurt. If you’re paranoid enough and aren’t put off by waiting thirty seconds, at best, for a single website to load, then you should read on.

Configuring Tor

If you’re using a Debian system and aren’t on the unstable branch, you have to put the following into /etc/apt/sources.list:

deb http://ftp.debian.org/debian unstable main contrib non-free

Depending on your release you have to put this line into /etc/apt/apt.conf to prevent future updates coming from the unstable branch:

APT::Default-Release "testing";

After that you can use aptitude to install Tor:

aptitude update
aptitude install tor socat

Finally we need to tell Squid to forward requests to Privoxy, i.e. use it as a parent proxy. If you haven’t already done so, add the following to squid.conf:

cache_peer localhost parent 8118 7 no-digest no-query
never_direct allow all

And Privoxy should forward to Tor. Put this into Privoxy’s config file:

forward-socks4a / 127.0.0.1:9050 .

That’s it. After starting Squid, Privoxy and Tor you’re ready to retrieve websites.

Torify everything

The problem with this setup is that it leaks DNS requests. If you’d like to get rid of this, I recommend reading the Torify Howto or the section Anonymizing various applications on Uwe Hermann’s blog.

Enhancing your privacy using Squid and Privoxy

If you would like to surf the internet anonymously I’ll show you how to use Squid and Privoxy for this purpose. First we’ll configure Squid to filter some HTTP header fields. After this, web servers will most likely think that we aren’t requesting content through a proxy but rather directly with our browser. We will see that we can’t manipulate all HTTP header fields without running into problems: Privoxy will help us here.

You can test your setup with ProxyJudge or SamAir; there are a lot of other tools which provide this functionality. While SamAir just checks some HTTP header fields, ProxyJudge will do a more comprehensive check. It will calculate your level of anonymity: it ranges from 1 to 5 where level 1 is excellent and 5 bad. If you’re already using a proxy, your level of anonymity might be bad: go check it right now so you can compare the results later.

Configuring Squid

If you don’t want to use Privoxy you can still set some options in your squid.conf which will get you to level 1 or 2 at ProxyJudge. Here they are:

via off
forwarded_for off

header_access From deny all
header_access Server deny all
header_access WWW-Authenticate deny all
header_access Link deny all
header_access Cache-Control deny all
header_access Proxy-Connection deny all
header_access X-Cache deny all
header_access X-Cache-Lookup deny all
header_access Via deny all
header_access Forwarded-For deny all
header_access X-Forwarded-For deny all
header_access Pragma deny all
header_access Keep-Alive deny all

These directives control some HTTP header fields which are set by Squid, or by another proxy if your Squid is part of a hierarchy of proxies. The Via and Forwarded-For fields are set to indicate that a request was forwarded by a proxy. This is something we don’t want, because it would leak the information that we’re using a proxy. For the same reason the header_access lines strip a number of other fields too.
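To make the effect of these directives a bit more tangible, here is a minimal Python sketch of what this kind of header filtering amounts to. It is only an illustration and not how Squid works internally; the function name strip_headers and the sample request are made up for this example.

```python
# Illustration only: mimics the effect of "header_access <Name> deny all".
# DENIED mirrors the header names from the squid.conf snippet above.
DENIED = {
    "from", "server", "www-authenticate", "link", "cache-control",
    "proxy-connection", "x-cache", "x-cache-lookup", "via",
    "forwarded-for", "x-forwarded-for", "pragma", "keep-alive",
}


def strip_headers(headers, denied=DENIED):
    """Return a copy of headers without the denied fields (case-insensitive)."""
    return {name: value for name, value in headers.items()
            if name.lower() not in denied}


# Hypothetical request as a proxy might otherwise pass it on.
request = {
    "Host": "example.org",
    "User-Agent": "Mozilla/5.0",
    "Via": "1.1 proxy.lan (squid)",
    "X-Forwarded-For": "172.16.0.23",
}

print(strip_headers(request))
```

Only Host and User-Agent survive in this example, which is roughly what a web server would see from a browser that isn’t behind a proxy.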

After you’ve done this you should have a rating of 1 or 2: you only get a 1 if you haven’t got reverse DNS enabled for your IP. More often than not this is something only your ISP can control. If you don’t want every web server to know your current IP you can set up Squid to use another proxy as parent, e.g. a proxy provided by your ISP. Be aware that this might result in a bad rating, because the parent proxy might set the mentioned HTTP header fields and obviously you can’t change that.

So far this setup is highly effective, but I still recommend enabling Privoxy.

Configuring Privoxy

The advanced filtering capabilities of Privoxy can be used to mangle all kinds of things: web page content, cookies and annoying internet junk like ads, pop-ups and banners. It is also possible to change some HTTP header fields. This is crucial: if we had added these lines:

header_access Referer deny all
header_access User-Agent deny all

to the squid.conf, some websites wouldn’t function correctly, because they require these fields. If these fields aren’t set, parts of a website might not be displayed or you might be denied access completely. This is where Privoxy comes into play: you can set these two fields to whatever you want or let Privoxy decide this dynamically at runtime, e.g. it fakes the referrer to point to the requested website instead of revealing the page you really came from.

Now install Privoxy and change the following in its config file:

#debug 1
forward  /     proxy.isp.com:8080
forward  :443  .

The first line needs to be commented out or Privoxy would write every request to its logfile. The second and third line say that every request should be passed to this parent proxy and every HTTPS connection should be established directly with the foreign web server. These forward lines are read from top to bottom: the last line that matches will be used. If you don’t want to use a parent proxy at all you could just write:

forward  /     .

which says that requests should be made directly with the web servers.
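If the “last match wins” rule sounds confusing, the following Python sketch may help. It models only the two pattern forms used above (/ and :443) and is a heavy simplification of Privoxy’s real pattern matching; the names RULES, matches and pick_forward are made up for this example.

```python
# Illustration only: a much simplified model of Privoxy's "last match wins"
# rule for forward lines. Real Privoxy URL patterns are far richer.
RULES = [
    ("/", "proxy.isp.com:8080"),  # every request goes to the parent proxy
    (":443", "."),                # "." means: connect directly
]


def matches(pattern, port):
    """Only the two pattern forms used above are modelled here."""
    if pattern == "/":
        return True
    if pattern.startswith(":"):
        return port == int(pattern[1:])
    return False


def pick_forward(port, rules=RULES):
    """Walk the rules top to bottom and keep the last one that matches."""
    target = None
    for pattern, forward_to in rules:
        if matches(pattern, port):
            target = forward_to
    return target


print(pick_forward(80))   # proxy.isp.com:8080
print(pick_forward(443))  # .
```

A request to port 443 matches both lines, but since the :443 line comes last it wins and the connection is made directly.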

Next we need to make some changes to the file named default.action. There is an action which matches all URLs and the following lines can be defined for it:

+hide-referrer{forge} \
+hide-user-agent{Mozilla/5.0}

While the first line defines that the referrer should be forged to match the current website, the second line sets the User-Agent field no matter what browser we’re using behind our proxy. You probably want to set the User-Agent to something different, e.g. if you’re using IE. I haven’t run into problems with these settings yet, though I’m using Firefox and Safari.
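To illustrate what forging the referrer means, here is a small Python sketch. It assumes that the forged value is simply the root of the site being requested; forge_referrer is a made-up name and this is not Privoxy’s actual code.

```python
from urllib.parse import urlsplit


def forge_referrer(requested_url):
    """Return the root of the requested site, to be sent as a fake Referer."""
    parts = urlsplit(requested_url)
    return f"{parts.scheme}://{parts.netloc}/"


print(forge_referrer("http://example.org/news/2007/article.html?id=7"))
# http://example.org/
```

Whatever page you really came from, the web server only sees a referrer pointing at itself.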

At this time Privoxy is ready to run and now all we need to do is to tell Squid to use Privoxy as a parent proxy:

cache_peer localhost parent 8118 7 no-digest no-query

We built a chain of proxies: first our request goes to Squid, which in turn hands it to Privoxy. You might ask why we bothered to set up Squid at all. Shouldn’t it be sufficient just to use Privoxy? This highly depends on the features you’d like to have: a sophisticated cache and the possibility of a transparent proxy are strong reasons for Squid. If you’re somewhere just with your notebook, e.g. in some office, you might want to opt for Privoxy without Squid, because there may already be a proxy and you just want to obfuscate the requests you make to that proxy.

Tuning Privoxy

In the default installation of Privoxy on Debian systems there are a lot of other filters enabled, which remove ads and the like. All this content filtering can slow things down and use a good deal of processing time, i.e. massive CPU usage. I recommend turning off the filtering in Privoxy and suggest using a Firefox plugin like Adblock Plus.

If you’d like to disable the filtering done by Privoxy, change the following in the config file: comment out all lines starting with

  • actionsfile except:
    actionsfile default
  • filterfile

In the file default.action:

  • There’s a block matching all URLs. Delete all filter lines.
  • Comment out everything below:
    +add-header{X-Actions-File-Version: 1.8}

That’s it. Now Privoxy will run a lot faster.

Tuning and hardening Squid

Tuning and hardening Squid will be the topic of this post, where tuning means making it a little bit faster and hardening means less vulnerable to malicious use. The default installation of Squid on a Debian box has a lot of features enabled which most likely aren’t used: we want to turn these off. Then there might be situations where you probably want to use Squid but don’t want it to function as a cache: we’ll investigate this too.

This post is geared towards my next post: it’s about a tiny router which hasn’t got a great disk or plenty of RAM. So I’m not going to discuss the pros and cons of various filesystems, using a RAID or having enough filedescriptors. If you’d like to read about that go here, here or here.

Tuning

Tuning Squid will speed things up a little bit. So without further ado let’s first take a look at the directives for the squid.conf:

pipeline_prefetch on
shutdown_lifetime 1 second

While pipeline_prefetch will boost the performance of pipelined requests to closer match that of a non-proxied environment, the second directive shutdown_lifetime saves you a lot of time waiting for Squid to shut down. The latter comes in very handy if you’re tweaking Squid and need to restart it a lot.

Even though Squid is meant as a cache there are reasons for running it without a cache, i.e. as a pure forwarding proxy: you might want to use it as a load balancer with some parent proxies or simply as a transparent proxy, or you don’t have particularly fast hardware. There are two methods to circumvent caching:

  1. Deny caching for all connections:
    acl all src 0.0.0.0/0.0.0.0
    no_cache deny all

    This way requests will not be satisfied from the cache and replies will not be cached either. Note that the first line might already be in your configuration.

  2. If you use a parent proxy you can specify the proxy-only option to prevent data retrieved from the remote cache from being stored locally. An example:
    cache_peer proxy.isp.com parent 8080 0 proxy-only

Finally you might want to turn off logging. On a Debian based system it’s sufficient to turn off cache_access_log and cache_store_log:

cache_access_log none
cache_store_log none

Hardening

When talking about hardening I think about turning off features that aren’t used and restricting access to the proxy. Features that aren’t used might be ICP and HTCP: they are used to communicate with other caches in a hierarchy. In most cases we don’t need this:

icp_port 0
htcp_port 0
icp_access deny all
htcp_access deny all

If you don’t wish to use SNMP we can disable this too. This is already the default for systems running Debian.

snmp_port 0
snmp_access deny all

Finally you definitely want to restrict access to your proxy: define an access control list (acl) and either allow or deny access with http_access. Let’s say your LAN consists of 172.16.0.0/24 and 172.16.1.0/24. Then you would put the following into squid.conf:

acl LAN src 172.16.0.0/24 172.16.1.0/24
http_access allow LAN

If somebody outside your network tries to access your proxy he’ll get an error message that he isn’t allowed to do so.
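In case the /24 notation isn’t familiar: the acl above simply asks whether the client’s address falls into one of the two networks. Here is a minimal Python sketch of that check; it only illustrates the idea, and is_allowed as well as the sample addresses are made up for this example.

```python
import ipaddress

# The two networks from the "acl LAN src ..." line above.
LAN = [
    ipaddress.ip_network("172.16.0.0/24"),
    ipaddress.ip_network("172.16.1.0/24"),
]


def is_allowed(client_ip, networks=LAN):
    """True if the client address lies inside one of the LAN networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


print(is_allowed("172.16.1.42"))  # True
print(is_allowed("192.0.2.10"))   # False
```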

Using a parent proxy with Squid

If you want Squid to be part of a hierarchy of proxies, or you just want Squid to fetch content not directly from a web server but rather indirectly from another proxy, then read on to see how to do that.

You can use the cache_peer directive to add parent proxies which Squid will ask for content. Furthermore you can control whether content will be fetched directly or indirectly with always_direct or never_direct respectively. For example

cache_peer proxy.some-isp.com parent 8080 0 no-query no-digest
never_direct allow all

would tell Squid to always fetch content from the parent proxy, which is located at proxy.some-isp.com:8080. If we didn’t use the second directive there might be certain circumstances where Squid would ask directly for content and ignore the parent proxy; this isn’t what we want.

There are a lot of options available which I don’t want to discuss here, because they are very well documented, but no-query and no-digest say that no ICP requests or cache digests should be sent to the parent proxy (read: nagging should be turned off ;) ).

Multiple parent proxies

If you would like to have more than one parent proxy you can add more cache_peer directives; one for each parent. Now you can define either weight or round-robin to control the way Squid will communicate with the proxies: while weight tells Squid to prefer one cache over another, round-robin tries to spread connections evenly among the defined caches.

First of all a simple example for two parent proxies:

cache_peer proxy.isp1.com parent 8080 0 no-query no-digest default
cache_peer proxy.isp2.com parent 8080 0 no-query no-digest

If you define more than one parent proxy you might want to set one as the default proxy, which is used as a last resort.

An example for weight:

cache_peer proxy.isp1.com parent 8080 0 no-query no-digest weight=1
cache_peer proxy.isp2.com parent 8080 0 no-query no-digest weight=2

In this example it is likely that the proxy from the second ISP will be favored over the first one.

And here an example for round-robin:

cache_peer proxy.isp1.com parent 8080 0 round-robin no-query
cache_peer proxy.isp2.com parent 8080 0 round-robin no-query
cache_peer proxy.isp3.com parent 8080 0 round-robin no-query

All connections to our proxy would be round-robined among these three caches. Because Squid treats all parents equally, it is currently not possible to define a weight here, e.g. to forward 50% of the requests to the first proxy and 25% to the second and third proxy respectively.
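The idea behind round-robin can be sketched in a few lines of Python. This is only an illustration of the principle: a real Squid also has to deal with parents that are down, which this sketch ignores, and the names PARENTS and assign are made up.

```python
from itertools import cycle

# The three parents from the example above.
PARENTS = [
    "proxy.isp1.com:8080",
    "proxy.isp2.com:8080",
    "proxy.isp3.com:8080",
]


def assign(request_count, parents=PARENTS):
    """Hand out parents in turn, starting over after the last one."""
    rotation = cycle(parents)
    return [next(rotation) for _ in range(request_count)]


for parent in assign(4):
    print(parent)
```

The fourth request ends up at the first parent again, so in the long run every parent gets a third of the requests.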

Transparent proxy with Squid

If you ever wanted to know how to set up a transparent proxy with Squid, because either you are just curious or you have more than one computer from which you want to surf the internet and you don’t want to set the proxy manually, then this might be something for you.

I will assume that we have two networks: 172.16.0.0/24 and 172.16.1.0/24. These networks are connected to our router via eth1 and eth2 respectively, while the router itself has the IPs 172.16.0.1 and 172.16.1.1.

Setup

If you’re running Debian you have a newer or older version of Squid installed, depending on the release you chose. At the time of this writing that might be 2.5.9 for Sarge aka stable or 2.6.5 for Etch aka testing. If you don’t already know the version, check your version with dpkg -l squid.

For Squid 2.5.xx put the following into /etc/squid/squid.conf:

http_port 172.16.0.1:3128
http_port 172.16.1.1:3128
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

and for Squid 2.6.xx:

http_port 172.16.0.1:3128 transparent
http_port 172.16.1.1:3128 transparent
always_direct allow all

The last directive was a workaround for early 2.6 versions of Squid, because of this bug. With the current version it seems that this isn’t a problem anymore, so just leave it out.

Firewall

We have to tweak the firewall, so that everybody who wants to surf the internet will go through our transparent proxy.

iptables -t nat -A PREROUTING -i eth1 -p tcp ! -d 172.16.0.0/24 \
         --dport 80 -j REDIRECT --to-port 3128
iptables -t nat -A PREROUTING -i eth2 -p tcp ! -d 172.16.1.0/24 \
         --dport 80 -j REDIRECT --to-port 3128
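If you’re not used to reading iptables rules, the following Python sketch spells out which packets the two rules above would catch. It is just a model of the matching conditions, not of how netfilter works, and the names RULES and redirect_port are made up for this example.

```python
import ipaddress

# One entry per rule: incoming interface and the network that is excluded
# as destination ("! -d ...").
RULES = [
    ("eth1", ipaddress.ip_network("172.16.0.0/24")),
    ("eth2", ipaddress.ip_network("172.16.1.0/24")),
]


def redirect_port(in_iface, protocol, dst_ip, dst_port):
    """Return 3128 if the packet would be redirected to Squid, else None."""
    dst = ipaddress.ip_address(dst_ip)
    for iface, excluded_net in RULES:
        if (in_iface == iface and protocol == "tcp"
                and dst not in excluded_net and dst_port == 80):
            return 3128
    return None


print(redirect_port("eth1", "tcp", "203.0.113.5", 80))   # 3128
print(redirect_port("eth1", "tcp", "172.16.0.1", 80))    # None
print(redirect_port("eth1", "tcp", "203.0.113.5", 443))  # None
```

So only HTTP traffic on port 80 that leaves the local network ends up at Squid; everything else, including HTTPS, passes through untouched.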

Access Control

Finally we want to allow the users in our network to connect to our proxy. In the squid.conf there is a line saying:

http_access allow localhost

Below that line you should put the following:

acl LAN src 172.16.0.0/24 172.16.1.0/24
http_access allow LAN

That’s it.