Nginx, Memcache, Drupal page cache #1

Reverse proxy caching is almost a must-have for any popular site today. Among Drupalers, Varnish is by far the most used reverse proxy, since it's easy to use and works well and stably. Nginx has been successful as a reverse proxy with Drupal too, but it has mainly been used with the file system and modules like Boost. However, Nginx also ships with a memcached module that lets it read from Memcached directly. In this article I will do some benchmarking on how best to store the data when combining Memcached with Nginx.

tl;dr

If you use the Nginx memcached module: compress using Drupal's built-in compression, set the Content-Encoding header yourself, and add fallbacks for really old browsers that do not accept gzip. This should be the best solution for most cases.

Some basics

  • Reading from disk is slower than reading from memory, so memory-based storage beats disk-based storage for fast operations. Memcached is a popular memory store.
  • Out of the box, Drupal can cache pages for anonymous users in the database to save CPU and memory and to answer requests faster. This means that every time you visit the website as an anonymous user, you get a pre-saved page served from the database.
  • Reading the cache from the database still bootstraps Drupal, meaning that the web server (Nginx), PHP and the database are all invoked. This is still sweaty for the system.
  • If you use a module like Boost, Drupal will instead cache to files. Nginx can check whether the files exist (with try_files) and serve them directly, so only Nginx is invoked. Much faster.
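The Boost setup above can be sketched in Nginx roughly like this. This is a hedged sketch, not a drop-in config: the cache path is an assumption (Boost's directory layout varies between versions and settings), and the @drupal fallback location stands in for your normal PHP handling.

```nginx
# Sketch of a Boost-style setup: serve a cached HTML file if it exists,
# otherwise fall back to Drupal. The cache path is an assumption.
location / {
    try_files /cache/normal/$host${uri}_.html $uri @drupal;
}

location @drupal {
    # Normal Drupal handling (fastcgi_pass to PHP-FPM) goes here.
}
```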

How to make stuff even faster

So basically what we want to do next is something like Boost, but reading from the faster memory. This can be done by putting a dedicated reverse proxy like Varnish, or a forward proxy pretending to be a reverse proxy like Squid, in front of Nginx. Varnish has been the choice of most people I have seen working with the web, mainly because it is better at being a reverse proxy and handles more pages in memory. I have run both on popular websites and can say hands down that Varnish is easier, faster and more stable. I should say though that I have not tried Squid in a couple of years, so it might have improved (Squid and Lighttpd generally go down in history as problem-makers for me).

The problem with having Varnish in front of Nginx, though, is that you have to learn multiple applications to get everything working. And while both of them have become a lot easier to use, it's still a hassle and a learning curve to master both (I have, but it is rough for newbies). So what if you could connect Nginx to a memory storage directly? Well, you can actually do that - with the Nginx memcached module (please note that this is an Nginx add-on, not a Drupal module).
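A minimal sketch of what that looks like (the key format and the memcached address are assumptions - the key must match whatever Drupal stores the page under):

```nginx
# Try memcached first; on a miss (or error) fall back to Drupal.
location / {
    set $memcached_key "$host$request_uri";   # must match the key Drupal saves under
    memcached_pass 127.0.0.1:11211;
    default_type text/html;                   # memcached returns no Content-Type
    error_page 404 502 504 = @drupal;         # cache miss or memcached down
}

location @drupal {
    # Normal Drupal handling (fastcgi_pass to PHP-FPM) goes here.
}
```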

I know that this goes against the Unix philosophy of writing programs that do one thing well. But I like my Nginx, and if it did my dishes, made my bed and everything else, I would use it for that as well. It's easy, stable and fast. I have never had any memory-leakage problems (I'm looking at you, Lighty) or any performance issues on small sites (Apache?). It works really well. And people have done this before with great success.

How to save in memcache

So one downfall with the Nginx memcached module is that it can only read from Memcached - you can't actually save to memory using it. This guy kind of solved that problem, but why let Nginx do the saving when Drupal can do it for you? And luckily, you do not need to write a single line of code for this, because the Drupal community does not let you down. Nice guy Spleshka did the code for us in the form of the module Memcache Storage.

Another problem with caches in general is that they need to be purged when you want them to be, even if the cached item has not expired yet. For instance, if you edit a node title you want that node to be updated for everyone, and if it's shown on the front page it should be updated there as well. Once again the Drupal community giveth (and not taketh my money) - the module Cache Expiration.

If you install these two modules and set them up correctly in settings.php, you will have pages stored in Memcached!

(I might do a tutorial with all steps for this in the future)

To compress or not compress

So now that you have some background, it's time for the question this article is all about - should I compress, and if so, how?

First off, one thing you can tick off right away - do not use the compression built into the Memcache PECL extension. It uses zlib compression, which browsers do not understand (or not out of the box at least). You can of course set the Content-Encoding to deflate and it will work, but don't.

When you choose to cache pages via Drupal you have the option to compress or not to compress. But which compression does it use? Let us take a look at part of drupal_page_set_cache():


function drupal_page_set_cache() {
[...]
    if ($cache->data['body']) {
      if ($page_compressed) {
        $cache->data['body'] = gzencode($cache->data['body'], 9, FORCE_GZIP);
      }
      cache_set($cache->cid, $cache->data, 'cache_page', $cache->expire);
    }
[...]
}

As you can see, it uses gzencode(), which is plain gzip compression in PHP. So it stores the page in the format that most browsers nowadays accept, for faster download speeds.

The other option is to store the page as plain text and gzip it as it is being sent, as you usually do when gzip is on in Nginx.
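That second variant is just Nginx's ordinary on-the-fly compression, something like:

```nginx
# On-the-fly compression for the "plain text in memcached" variant.
gzip on;
gzip_comp_level 9;    # matched to Drupal's gzencode level for a fair comparison
# text/html is compressed by default; add other types as needed.
gzip_types text/css application/javascript;
```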

Pros and cons

So with the pre-encoded page we have the advantage that the memcached object is a lot smaller and thus faster to fetch, and the compression does not need to be done on the fly. But since Nginx has no way of knowing that the content coming from Memcached is already gzipped, you have to either add the Content-Encoding header yourself or use gzip_static always in a local location, and you have to set the expire time yourself as well.
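Adding the headers yourself looks roughly like this (a sketch; the key format, the lifetime and the @drupal fallback location are all assumptions):

```nginx
# Serving pages that Drupal already gzipped before storing in memcached.
location / {
    set $memcached_key "$host$request_uri";
    memcached_pass 127.0.0.1:11211;
    default_type text/html;
    add_header Content-Encoding gzip;   # Nginx cannot detect this by itself
    expires 5m;                         # set the cache lifetime yourself
    error_page 404 502 504 = @drupal;   # cache miss: fall back to Drupal
}
```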

The con with this setup is that you have to add special cases for older browsers that do not accept gzip or deflate. So basically you have to check whether $http_accept_encoding is gzip;q=0 or empty, and either gunzip the content or serve those clients the non-cached version.
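One way to handle those clients is the gunzip filter. This is a sketch, not a tested config: it assumes Nginx 1.3.6+ built with ngx_http_gunzip_module, and the flag value passed to memcached_gzip_flag is an assumption that has to match the flag bit your storage backend sets on gzipped items.

```nginx
# Decompress on the fly for the few clients that do not accept gzip.
location / {
    set $memcached_key "$host$request_uri";
    memcached_pass 127.0.0.1:11211;
    memcached_gzip_flag 2;   # flag bit marking gzipped items (value is an assumption)
    gunzip on;               # only kicks in when the client lacks gzip support
    default_type text/html;
    error_page 404 502 504 = @drupal;
}
```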

This, of course, is the corresponding pro of storing plain text. You just have your Nginx gzip on and it knows what to do. Easy as it comes.

Benchmarking

So I tried both solutions on this very website and ran an ab test against it on both occasions to see if I could spot any difference. I also first ran it without gzip, just so you can see the difference compression makes. I set the gzip compression level to 9 in Nginx in the plain-text case so it was equal to what Drupal does. I of course ran ab with -H "Accept-Encoding: gzip, deflate" so it would get gzipped data when applicable.

Result on uncompressed data

(screenshot of ab results)

Result of precompressed data

(screenshot of ab results)

Result with nginx compressed data

(screenshot of ab results)

Conclusion

First off - you should use gzip always when it's possible :)

Secondly, as you can see in the two tests, it's faster to precompress the data. This is not so surprising, since the only overhead in that case is an extra check on $http_accept_encoding and adding two headers, while in the other case you have both the overhead of loading a memcached object that is 3-4 times bigger and of compressing it on the fly.

What was more interesting, though, was that the gzip compression in PHP was a little better at compressing than the one in Nginx. This might vary from file to file though; I've only tried one file.

I should also mention that I tried the test with all nine compression levels in Nginx, but level 9 was the fastest for this file.

Since not all servers are created equal, it should also be mentioned that this was tested on a small server at Podnix (1 core, 2 GB RAM), from a remote computer over the internet.

Feedback

Did I do something wrong? Could I have done something different? Please let me know in the comments. I will come back with more posts about the Nginx+memcache solution soon.

The photo above was taken by Elliot Brown and is used under the CC 2.0 licence.