Caching with dynamic proxy classes

In my last post I used AspectJ to implement a cache that stored the returned result of methods with a special annotation (@Cachable). If you can’t use AspectJ you may want to use a dynamic proxy class: in this post I’ll present a solution for this.

You can download the Eclipse project as a tar or zip file or view the code online here.

Implementation

If you don’t know how dynamic proxy classes work, here’s a short overview. Let’s say you want to do some extra work whenever the methods foo and bar of the class Tee are called. Dynamic proxies can only stand in for interfaces, so you would extract these methods into an interface and let Tee implement it.

Next, you’d implement a factory that produces a proxy instance for Tee with a custom InvocationHandler. This handler would have a look at the method’s name and check whether it’s foo or bar: you can now implement any extra actions in this handler.
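To make the overview concrete, here is a minimal sketch using java.lang.reflect.Proxy. The names TeeOperations, LoggingHandler and newProxy are made up for this illustration and are not taken from the downloadable project; the "extra work" is simply recording the name of the called method.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ProxyOverview {
    // Hypothetical interface extracted from Tee; only its methods can be intercepted.
    interface TeeOperations {
        String foo();
        String bar();
    }

    static class Tee implements TeeOperations {
        public String foo() { return "foo"; }
        public String bar() { return "bar"; }
    }

    // Records calls to foo and bar, then delegates to the real object.
    static class LoggingHandler implements InvocationHandler {
        private final Object target;
        final List<String> calls = new ArrayList<>();

        LoggingHandler(Object target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            String name = method.getName();
            if (name.equals("foo") || name.equals("bar")) {
                calls.add(name); // the extra work goes here
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real method threw
            }
        }
    }

    // The factory: returns a proxy that routes every call through the handler.
    static TeeOperations newProxy(LoggingHandler handler) {
        return (TeeOperations) Proxy.newProxyInstance(
                TeeOperations.class.getClassLoader(),
                new Class<?>[] { TeeOperations.class },
                handler);
    }

    public static void main(String[] args) {
        LoggingHandler handler = new LoggingHandler(new Tee());
        TeeOperations tee = newProxy(handler);
        System.out.println(tee.foo() + " " + tee.bar() + " " + handler.calls);
    }
}
```

Callers only ever see the TeeOperations interface, so they don’t need to know whether they hold the real Tee or the proxy.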

You can also examine the annotations of the invoked method, and that’s what I did: if the method has the @Cachable annotation, we’ll use a cache. But how do we know whether we can safely return an object from the cache?

Constructing a unique method identifier

This is crucial since we don’t want to return the same cached result when the method is called with different parameters. So we have to add the parameter values to an identifier like so:

"package-name" + "class-name" + "method-name" + "param1-param2-[...]"

This way we’ll create a unique entry in the cache for different method calls.
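A rough sketch of how such an identifier could be built from the reflective Method object and the call arguments follows. The class CacheKeys, the method keyFor and the exact separators are assumptions for illustration. It also assumes that every parameter has a meaningful toString(), and that parameter values never contain the "-" separator themselves, otherwise two different calls could end up with the same key.

```java
import java.lang.reflect.Method;
import java.util.StringJoiner;

class CacheKeys {
    // Builds "<package>.<class>.<method>:<param1>-<param2>-..." for one call.
    static String keyFor(Method method, Object[] args) {
        StringJoiner params = new StringJoiner("-");
        if (args != null) { // args is null for methods without parameters
            for (Object arg : args) {
                params.add(String.valueOf(arg));
            }
        }
        // getName() on the declaring class already includes the package name.
        return method.getDeclaringClass().getName()
                + "." + method.getName()
                + ":" + params;
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method substring = String.class.getMethod("substring", int.class, int.class);
        System.out.println(keyFor(substring, new Object[] { 1, 2 }));
    }
}
```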

How to

All we have to do is add @Cachable to some methods:

public interface Foo {
  @Cachable
  public SomeObject foo(int param);
  @Cachable
  public AnotherObject bar(int param1, long param2);
}

Once we’ve done that, we can use the factory to produce a new proxy instance with our custom InvocationHandler. The handler consults the cache first, so repeated calls with the same arguments return faster.
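The following is a self-contained sketch of what such a factory and handler might look like; it is not the code from the downloadable project. The annotation is redeclared here so the example compiles on its own, and Squarer, SlowSquarer, CachingHandler and cached are hypothetical names. The cache is a plain in-memory map that never evicts anything, which is an assumption made to keep the example short.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CachingProxyDemo {
    // RUNTIME retention is required, otherwise reflection cannot see the annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Cachable {}

    // Hypothetical example interface with one cacheable method.
    interface Squarer {
        @Cachable
        long square(int value);
    }

    static class SlowSquarer implements Squarer {
        final AtomicInteger realCalls = new AtomicInteger(); // counts uncached calls

        public long square(int value) {
            realCalls.incrementAndGet();
            return (long) value * value;
        }
    }

    static class CachingHandler implements InvocationHandler {
        private final Object target;
        private final Map<String, Object> cache = new ConcurrentHashMap<>();

        CachingHandler(Object target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            // 'method' is the interface method, so annotations placed there are visible.
            if (!method.isAnnotationPresent(Cachable.class)) {
                return call(method, args);
            }
            String key = method.getDeclaringClass().getName() + "." + method.getName()
                    + ":" + Arrays.deepToString(args);
            Object result = cache.get(key);
            if (result == null) {
                result = call(method, args);
                if (result != null) { // null results are simply not cached in this sketch
                    cache.put(key, result);
                }
            }
            return result;
        }

        private Object call(Method method, Object[] args) throws Throwable {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the real method threw
            }
        }
    }

    // The factory: wraps any implementation of 'iface' in a caching proxy.
    static <T> T cached(Class<T> iface, T target) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new CachingHandler(target)));
    }

    public static void main(String[] args) {
        SlowSquarer real = new SlowSquarer();
        Squarer squarer = cached(Squarer.class, real);
        squarer.square(4);
        squarer.square(4); // served from the cache
        System.out.println("real calls: " + real.realCalls.get());
    }
}
```

Caching is only safe for methods whose result depends solely on their parameters; a method that reads changing state would keep returning stale values from the cache.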

Conclusion

In this post I presented a simple solution for a cache that may speed up method calls. Although I recommend using AspectJ for this kind of job, you can use dynamic proxy instances if your environment (in most cases read: your project leader) doesn’t permit you to use AspectJ.

Enhancing your privacy further with Squid and Tor

In my last post I described how to use Squid and Privoxy to enhance your privacy while surfing the internet. We want to push this a little bit further by adding onion routing with Tor. If you’ve heard of Tor but don’t really know how it works, I suggest reading the Tor overview first.

Based on my last post we’ll build an even longer chain of proxies: a request from our browser is sent to Squid and handed to Privoxy, which in turn hands it to Tor. Then the request makes its way through the onion router network and finally reaches the web server. The performance of the onion router network ranges from bad to worse. Unless you really care about your anonymity (who doesn’t ;) ) I suggest not using Tor. It’s not one of those things you just do for fun because it doesn’t hurt. If you’re paranoid enough and aren’t put off by waiting thirty seconds at best for one website to load, then you should read on.

Configuring Tor

If you’re using a Debian system and aren’t on the unstable branch, you have to put the following into /etc/apt/sources.list:

deb http://ftp.debian.org/debian unstable main contrib non-free

Depending on your release, you have to put this line into /etc/apt/apt.conf to prevent future updates from being pulled from the unstable branch:

APT::Default-Release "testing";

After that you can use aptitude to install Tor:

aptitude update
aptitude install tor socat

Finally we need to tell Squid to forward requests to Privoxy, i.e. use it as a parent proxy. If you haven’t already done so, add the following to squid.conf:

cache_peer localhost parent 8118 7 no-digest no-query
never_direct allow all

And Privoxy should forward to Tor. Put this into Privoxy’s config file:

forward-socks4a / 127.0.0.1:9050 .

That’s it. After starting Squid, Privoxy and Tor you’re ready to retrieve websites.

Torify everything

The problem with this setup is that it leaks DNS requests. If you’d like to get rid of this, I recommend reading the Torify Howto or the section Anonymizing various applications on Uwe Hermann’s blog.

Enhancing your privacy using Squid and Privoxy

If you would like to surf the internet anonymously I’ll show you how to use Squid and Privoxy for this purpose. First we’ll configure Squid to filter some HTTP header fields. After this, web servers will most likely think that we aren’t requesting content through a proxy but rather directly with our browser. We will see that we can’t manipulate all HTTP header fields without running into problems: Privoxy will help us here.

You can test your setup with ProxyJudge or SamAir; there are a lot of other tools which provide this functionality. While SamAir just checks some HTTP header fields, ProxyJudge will do a more comprehensive check. It will calculate your level of anonymity: it ranges from 1 to 5 where level 1 is excellent and 5 bad. If you’re already using a proxy, your level of anonymity might be bad: go check it right now so you can compare the results later.

Configuring Squid

If you don’t want to use Privoxy, you can still set some options in your squid.conf which will get you to level 1 or 2 at ProxyJudge. Here they are:

via off
forwarded_for off

header_access From deny all
header_access Server deny all
header_access WWW-Authenticate deny all
header_access Link deny all
header_access Cache-Control deny all
header_access Proxy-Connection deny all
header_access X-Cache deny all
header_access X-Cache-Lookup deny all
header_access Via deny all
header_access Forwarded-For deny all
header_access X-Forwarded-For deny all
header_access Pragma deny all
header_access Keep-Alive deny all

These directives control some HTTP header fields which are set by Squid, or by another proxy if your Squid is part of a hierarchy of proxies. The Via and Forwarded-For fields are set to indicate that the request was forwarded by a proxy. This is something we don’t want, because it would leak the information that we’re using a proxy. For the same reason the header_access lines deny a number of other fields too.

After you’ve done this you should have a rating of 1 or 2: you only get a 1 if reverse DNS isn’t enabled for your IP. More often than not this is something only your ISP can control. If you don’t want every web server to know your current IP, you can set up Squid to use another proxy as parent, e.g. a proxy provided by your ISP. Be aware that this might result in a bad rating, because the parent proxy might set the mentioned HTTP header fields and obviously you can’t change that.

So far this setup is highly effective, but I still recommend enabling Privoxy.

Configuring Privoxy

The advanced filtering capabilities of Privoxy can be used to mangle all kinds of things: web page content, cookies and annoying internet junk like ads, pop-ups and banners. It is also possible to change some HTTP header fields. This is crucial: if we had added these lines:

header_access Referer deny all
header_access User-Agent deny all

to the squid.conf, some websites wouldn’t function correctly, because they require these fields. If these fields aren’t set, parts of a website might not be displayed or you might be denied access completely. This is where Privoxy comes into play: you can set these two fields to whatever you want or let Privoxy decide this dynamically at runtime, e.g. it fakes the referrer to point to the requested website instead of revealing the page you really came from.

Now install Privoxy and change the following in its config file:

#debug 1
forward  /     proxy.isp.com:8080
forward  :443  .

The first line needs to be commented out, otherwise Privoxy writes every request to its logfile. The second and third lines say that every request should be passed to the parent proxy, except HTTPS connections, which should be established directly with the remote web server. These forward lines are read from top to bottom: the last line that matches will be used. If you don’t want to use a parent proxy at all you could just write:

forward  /     .

which says that requests should be made directly with the web servers.

Next we need to make some changes to the file named default.action. There is an action which matches all URLs and the following lines can be defined for it:

+hide-referrer{forge} \
+hide-user-agent{Mozilla/5.0}

While the first line specifies that the referrer should be forged to match the current website, the second line sets the User-Agent field to a fixed value no matter what browser we’re using behind our proxy. You probably want to set the User-Agent to something different, e.g. if you’re using IE. I haven’t run into problems with these settings yet, though I’m using Firefox and Safari.

At this time Privoxy is ready to run and now all we need to do is to tell Squid to use Privoxy as a parent proxy:

cache_peer localhost parent 8118 7 no-digest no-query

We built a chain of proxies: first our request goes to Squid, which in turn hands it to Privoxy. You might ask why we bothered to set up Squid at all. Shouldn’t it be sufficient just to use Privoxy? This depends on the features you’d like to have: a sophisticated cache and the possibility of a transparent proxy are strong reasons for Squid. If you’re somewhere with just your notebook, e.g. in some office, you might want to opt for Privoxy without Squid, because there may already be a proxy and you just want to obfuscate the requests you make to that proxy.

Tuning Privoxy

In the default installation of Privoxy on Debian systems there are a lot of other filters enabled, which remove ads and the like. All this content filtering can slow things down and use a good deal of processing time, i.e. massive CPU usage. I recommend turning off the filtering in Privoxy and suggest using a Firefox plugin like Adblock Plus.

If you’d like to disable the filtering done by Privoxy, change the following in the config file: comment out all lines starting with

  • actionsfile except:
    actionsfile default
  • filterfile

In the file default.action:

  • There’s a block matching all URLs. Delete all filter lines.
  • Comment out everything below:
    +add-header{X-Actions-File-Version: 1.8}

That’s it. Now Privoxy will run a lot faster.

Tuning and hardening Squid

Tuning and hardening Squid will be the topic of this post, where tuning means making it a little bit faster and hardening means less vulnerable to malicious use. The default installation of Squid on a Debian box has a lot of features enabled which most likely aren’t used: we want to turn these off. Then there might be situations where you probably want to use Squid but don’t want it to function as a cache: we’ll investigate this too.

This post is geared towards my next post: it’s about a tiny router which has neither a large disk nor plenty of RAM. So I’m not going to discuss the pros and cons of various filesystems, using a RAID or having enough file descriptors. If you’d like to read about that go here, here or here.

Tuning

Tuning Squid will speed things up a little bit. So without further ado let’s first take a look at the directives for the squid.conf:

pipeline_prefetch on
shutdown_lifetime 1 second

While pipeline_prefetch will boost the performance of pipelined requests to closer match that of a non-proxied environment, the second directive shutdown_lifetime saves you a lot of time waiting for Squid to shut down. The latter comes in very handy if you’re tweaking Squid and need to restart it a lot.

Even though Squid is meant as a cache, there are reasons for running it without one, i.e. as a pure forwarding proxy: you might want to use it as a load balancer with some parent proxies, simply as a transparent proxy, or you don’t have particularly fast hardware. There are two methods to circumvent caching:

  1. Deny caching for all connections:
    acl all src 0.0.0.0/0.0.0.0
    no_cache deny all

    This way a request is never satisfied from the cache and the reply is never stored in it. Note that the first line might already be in your configuration.

  2. If you use a parent proxy, you can specify the proxy-only option to prevent data retrieved from the remote cache from being stored locally. An example:
    cache_peer proxy.isp.com parent 8080 0 proxy-only

Finally you might want to turn off logging. On a Debian-based system it’s sufficient to turn off cache_access_log and cache_store_log:

cache_access_log none
cache_store_log none

Hardening

When talking about hardening I think about turning off features that aren’t used and restricting access to the proxy. Features that aren’t used might be ICP and HTCP: they are used to communicate with other caches in a hierarchy. In most cases we don’t need this:

icp_port 0
htcp_port 0
icp_access deny all
htcp_access deny all

If you don’t wish to use SNMP we can disable this too. This is already the default for systems running Debian.

snmp_port 0
snmp_access deny all

Finally, you definitely want to restrict access to your proxy: define an access control list (acl) and either allow or deny access with http_access. Let’s say your LAN is 172.16.0.0/24 and 172.16.1.0/24. Then you would put the following into squid.conf:

acl LAN src 172.16.0.0/24 172.16.1.0/24
http_access allow LAN

If somebody outside your network tries to access your proxy he’ll get an error message that he isn’t allowed to do so.
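If you’re unsure which client addresses the LAN acl above covers, the subnet arithmetic can be modelled in a few lines. This is only an illustration of CIDR matching for the two example networks, not how Squid implements it; LanAcl, toInt, inSubnet and allowed are made-up names.

```java
class LanAcl {
    // Converts a dotted-quad IPv4 address such as "172.16.0.42" to a 32-bit int.
    static int toInt(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True if 'ip' lies inside 'cidr', e.g. "172.16.0.0/24".
    static boolean inSubnet(String ip, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        // A /24 keeps the upper 24 bits; /0 needs a special case because
        // Java shift distances wrap around at 32.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (toInt(ip) & mask) == (toInt(parts[0]) & mask);
    }

    // Mirrors "acl LAN src 172.16.0.0/24 172.16.1.0/24".
    static boolean allowed(String ip) {
        return inSubnet(ip, "172.16.0.0/24") || inSubnet(ip, "172.16.1.0/24");
    }

    public static void main(String[] args) {
        System.out.println(allowed("172.16.1.200")); // inside the second network
        System.out.println(allowed("172.16.2.1"));   // outside both networks
    }
}
```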