Reficio™ - Reestablish your software!

Minimize the response time of your blog to less than a second!

Speaking of blogs: I have noticed recently that there is one thing I like more than blogging… it’s configuring and tweaking my blog. And what I mean here is not playing with themes, images, fonts or styles. Don’t get me wrong, I wanted my blog to look neat and slick on a plethora of browsers and mobile devices. I simply left that part to professionals and bought a commercial theme that matched my expectations. What I mean is doing the real man’s job of installation, configuration and maintenance.

You may ask yourself why the hell you should do it yourself if there’s a wide choice of platforms where, after signing up, you can configure your shiny blog in a few seconds. That’s a reasonable question, I have to admit… but doing it yourself IS FUN! OK, if you are not an IT freak it’s probably boring and tiresome, but in that case you shouldn’t be a visitor of my blog.

Coming back to the topic though. Not only is it fun, it is also profitable. I know, configuring Apache is not Java/Scala/Groovy programming, but it’s really beneficial sometimes to leave the everyday playground and teach yourself something new. Apache, virtual hosts, proxies, reverse proxies, rewrite rules, content substitution, caching, etc. – these are the topics that every respectable software engineer has to be at least acquainted with. It does not necessarily mean that you have to know all the config options by heart. However, if you don’t know what features are available, you are lost. I can guarantee that this knowledge can be applied to any web project that you currently work on. There’s also nothing better than the confused face of a Linux admin when you teach him how to do stuff, but that’s the next, advanced level. I have spent many hours hacking Linux stuff, and I have never regretted it, as it simply helped me to develop my skills and become a better software engineer.

THE GOAL

OK. So my goal was to configure a low-cost WordPress blog with a response time of less than a second. I have chosen WordPress, but most of the stuff that I mention here can be used with any blogging/CMS/website platform. I fixed my budget at around 99 USD a year, which was pretty tight, but let’s go for it!

The second requirement was not to touch the PHP code at all. At first this does not seem reasonable, as you may think that code optimizations are necessary in order to achieve such performance. It’s not true though, and if you customize the code, WordPress upgrades turn into a nightmare. You have to apply your changes again and again and again and, guess what, the code changes all the time.

Next, I wanted to use as few WordPress plugins as possible. Why’s that, you are probably asking for the second time? With many plugins you easily run into compatibility issues, when a certain version of one plugin does not work with some other plugin. You do an upgrade and most of the stuff stops working – I have already seen that. What is more, plugins are normally tested against a plain WordPress installation, so it’s quite possible to get unexpected behavior when one plugin alters the functionality of another one – in an undefined way of course. I wanted to make it really simple. A really plain instance of WordPress that is lightning fast and easily upgradeable would do the trick!

HOSTING

Having had a bad experience with shared hosting, I decided to find the cheapest VPS with an automatic backup-restore procedure that I could get. A VPS will perform better in most cases and, what’s important, you have access to the SSH console. I tried a few options and decided to buy a 12-month subscription at biznes-host.pl. I got unlimited transfer (10Mb link), 512 MB of guaranteed RAM (1024 maximal), a 2GHz processor, 10GB of disk storage and the Debian Squeeze OS on top of it. Cost: around 50 USD a year!

WORDPRESS

How to install WordPress is not the topic here, so I will redirect you to this site: WordPress on Debian. It’s basically a mechanical thing, nothing fancy.

THEME + CONTENT

Here you can choose whatever you want. My requirement was to have a “responsive” theme that would scale well graphically and look good on various resolutions and mobile devices. I made sure that the theme was “lightweight”, so that the download footprint stays reasonably low.
Having set up the theme, I configured the “About me” section and created a sample blog entry with a photo. Next, I opened the www.reficio.org link for the first time. It took about 7 seconds to fully load the page. All measurements were performed by gtmetrix.com. Obviously, there was room for improvement.

OPTIMIZATION 1 – GZIP COMPRESSION

Gtmetrix.com is really useful as it points out what could be improved to decrease the response time of your site. The first hint was to enable the gzip compression. I did not want to install any plugins for that though, so I decided to keep it simple and stupid (KISS). I simply enabled Apache’s mod_deflate.

The default settings fully satisfied me; on Debian they live in /etc/apache2/mods-available/deflate.conf.
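
For orientation, a mod_deflate configuration of this kind typically looks roughly like the sketch below. The MIME type list is an assumption and may differ from the file shipped with your Debian release, so treat it as an illustration rather than the verbatim default.

```apache
# Sketch of a mod_deflate configuration (assumed MIME type list, not the verbatim Debian file)
<IfModule mod_deflate.c>
    # Compress the common text-based response types
    AddOutputFilterByType DEFLATE text/html text/plain text/xml
    AddOutputFilterByType DEFLATE text/css
    AddOutputFilterByType DEFLATE application/javascript application/x-javascript
    AddOutputFilterByType DEFLATE application/rss+xml
</IfModule>
```

Images are left out on purpose: formats such as JPEG and PNG are already compressed, so gzipping them again mostly costs CPU time.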

OPTIMIZATION 2 – HTTP EXPIRES HEADERS

The second thing was to enable browser caching of the static content. Nothing simpler with Apache2! Just enable the mod_expires module (on Debian: a2enmod expires) and restart Apache.

Then go to your WordPress folder (in my case /var/www/wordpress) and edit the .htaccess file, adding Expires rules for your static content types.
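
As a minimal sketch of such .htaccess rules, assuming mod_expires is enabled: the content types and lifetimes below are illustrative assumptions, so pick values that match how often your own static files change.

```apache
# Sketch of browser-caching rules (assumed types and lifetimes)
<IfModule mod_expires.c>
    ExpiresActive On
    # Images rarely change, so let browsers keep them for a long time
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType image/gif "access plus 1 month"
    # Stylesheets and scripts change more often, so use a shorter lifetime
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType application/javascript "access plus 1 week"
</IfModule>
```

HTML is deliberately not listed, so new posts show up without visitors having to force a reload.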

OPTIMIZATION 3 – IMAGES

Now it is time to optimize the size and quality of all media files. Smush.it offers an API that performs these optimizations automatically, and there is a plugin that seamlessly integrates Smush.it with WordPress. Simply install the plugin and go to the “Media Library” where you can invoke the “Bulk Smush.it” action.

Your images will losslessly shrink by at least 40% which means a smaller page size and a faster load time!

OPTIMIZATION 4 – CACHING

We have done a lot so far, but there’s more juice coming. Let’s have a look at how we can improve the general performance of our site. So, how often do you post a new blog entry? If you are not running “the world news blog”, the page does not change every minute, which makes it a perfect case for caching. The best option would be to cache the generated HTML pages and rewrite requests to a folder with the cached content. It would give the PHP stack a break and would not hit the database every time the page is rendered. It sounds complicated, but we can easily do all of that with the “WP Super Cache” plugin. Install and enable the plugin, go to the “Settings” -> “WP Super Cache” page and open the “Advanced” tab. Then select options as shown in the picture below (remember not to enable the gzip compression, as we have already enabled mod_deflate):

Then enable mod_rewrite (on Debian: a2enmod rewrite) and restart Apache.

Finally, approve the modification of the .htaccess file – just click on “Update mod_rewrite rules”. It will add a block of rewrite rules to the file.
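
To show the idea behind such rules, here is a simplified sketch of serving pre-generated HTML with mod_rewrite. It is not the block the plugin generates: the cache directory layout and the cookie names are assumptions made for this illustration.

```apache
# Simplified illustration of serving cached HTML via mod_rewrite.
# NOT the plugin's generated rules; the cache path below is an assumption.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    # Only plain GET requests without a query string are safe to serve from cache
    RewriteCond %{REQUEST_METHOD} ^GET$
    RewriteCond %{QUERY_STRING} ^$
    # Logged-in users and commenters should always get freshly rendered pages
    RewriteCond %{HTTP_COOKIE} !(comment_author_|wordpress_logged_in|wp-postpass_)
    # Serve the cached file only if it actually exists on disk
    RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f
    RewriteRule ^(.*)$ /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]
</IfModule>
```

Rules like these have to come before WordPress’s own rewrite block in .htaccess, otherwise every request is handed to index.php before the cache is ever consulted.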

So far so good. Let’s check the results of our work and measure the response time… The result averages about 2.3 seconds. Not bad, but still more than twice as slow as the target.

OPTIMIZATION 5 – CONTENT DELIVERY NETWORK (CDN)

It’s gonna be tricky now, as we cannot do much more tweaking of Apache. Let’s use GTmetrix once more and analyze the response timeline. In the picture below we can see that the first request, which fetches the HTML content, takes around 750ms – which matches our expectations. There are, however, 24 additional, slower requests that fetch images, stylesheets and JavaScript files. We could try to combine these files and limit the number of requests, but that’s not that straightforward. All these files have one thing in common though – they are static. OK, you may change a CSS file once in a while, but basically it’s the HTML files that change more frequently. What about using a Content Delivery Network (CDN) to mirror these files, so they can be downloaded quickly and with low latency from any place in the world? That sounds reasonable, doesn’t it?

I had a quick look at what is available on the market and shortlisted two providers: Amazon CloudFront and MaxCDN. I chose Amazon CloudFront as I liked its pay-per-use business model, whereas MaxCDN cost around 40 USD for the basic package. So, I signed up at Amazon and logged in to the Amazon AWS CloudFront Management Console. The configuration is pretty straightforward. You just have to create a “distribution” and enter the origin domain name in the origins section.

Then you just configure the CDN behavior (in my case it was only HTTP/HTTPS traffic – I also wanted to use the origin Cache headers that we have already configured).

Then click OK, and wait until the distribution switches to the “Enabled” state. Finally, jot down the address of your shiny CDN server; in my case it was: d15618vwtt9nw5.cloudfront.net

So let’s review what we have already done. Basically, we created a CDN distribution that mirrors the content of our site. It works in such a way that whenever you hit the CDN server with a specific link it replaces the base of the URL. Then it hits the origin server, fetches the content and caches it internally using the Cache headers – after that it serves the content to the client. As long as the content does not expire, the CDN server does not have to hit your origin server to serve resources. The advantage of a CDN is that its servers are distributed across the globe, meaning that you can quickly access cached resources from any place. Let’s think of it as an ultra-fast and distributed HTTP cache.

OK. But we still have to make our server delegate the traffic to the CDN server. It would be perfect if we could have fine-grained control over what traffic to delegate and what not. We don’t want, for example, to cache HTML pages, since every time we post a new blog entry we would have to invalidate the old content. And remember, invalidating CDN cache entries is expensive, so we don’t want to do it too often.

There are some CDN plugins that try to do the traffic delegation, but they are limited and did not fulfill my requirements. They cannot process all WordPress files, are often limited to media files only, and their configuration is pretty complex. Applying the KISS methodology once more, I wanted to do the configuration in the simplest possible way. GTmetrix was helpful in pointing out which files could be moved to the CDN server – see the screenshot below:

OK. So let’s try to configure the traffic delegation. First shot – let’s use Apache’s redirect support (mod_alias or mod_rewrite) and configure redirect rules, so that selected requests are redirected to the CDN server. So far, so good, but in this case our server would still get 25 requests (24 of which end in redirects). If the server is slow to handle them because of high traffic, the CDN does not help at all. My tests confirmed this assumption: the response time was a bit better, but not by much.
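
For illustration, a redirect rule of that kind might be sketched with mod_rewrite in .htaccess as follows. The file extensions, the wp-content path and the User-Agent check are assumptions made for this example, not the rules that were actually tested. The check is there because the CDN fetches the same paths from the origin and must not be redirected back to itself.

```apache
# Sketch of the redirect-based approach (assumed paths and extensions)
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Let the CDN's own fetches through, otherwise it would be redirected to itself
    # (assumes the CDN identifies itself with "CloudFront" in its User-Agent)
    RewriteCond %{HTTP_USER_AGENT} !CloudFront
    # Send browsers to the CDN copy of selected static files
    RewriteRule ^(wp-content/.+\.(?:png|jpe?g|gif|css|js))$ http://d15618vwtt9nw5.cloudfront.net/$1 [R=302,L]
</IfModule>
```

Every static file still costs one round trip to the origin before the browser is sent on to the CDN, which is exactly the weakness described above.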

The perfect solution would be to modify the content of the page while it’s being sent to the client, so that all the selected links point to the CDN server. We could configure it using mod_proxy and ProxyHTMLURLMap, but again, proxying the site through the same Apache instance locally does not seem KISS and would double Apache’s load. Having done some research, I finally found what I was looking for – it’s called mod_substitute. To enable it, simply enable the module (on Debian: a2enmod substitute) and restart Apache.

Then edit the .htaccess file to configure the substitution rules, so that links to the selected static files point to the CDN host.
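
A minimal sketch of such substitution rules, assuming mod_substitute is enabled, could look like this. Which URL prefixes get rewritten is an assumption here (wp-content and wp-includes are simply where WordPress keeps most static files); the host names are the ones mentioned in this post.

```apache
# Sketch of mod_substitute rules; the rewritten paths are assumptions
<IfModule mod_substitute.c>
    # Run the substitution filter on HTML responses only
    AddOutputFilterByType SUBSTITUTE text/html
    # Point theme, plugin and upload URLs at the CDN host ("n" = treat the pattern as a fixed string)
    Substitute "s|http://www.reficio.org/wp-content/|http://d15618vwtt9nw5.cloudfront.net/wp-content/|n"
    # Same for the scripts and styles bundled with WordPress itself
    Substitute "s|http://www.reficio.org/wp-includes/|http://d15618vwtt9nw5.cloudfront.net/wp-includes/|n"
</IfModule>
```

Commenting out the Substitute lines is enough to switch the delegation off again, which keeps the setup easy to roll back.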

mod_substitute will modify the content of the HTTP response on the fly, replacing the content according to the substitution rules. In my case it modifies some URLs, pointing them to the CDN server. It’s flexible, easily configurable, does not depend on any WordPress plugins and can be easily disabled. So right now, all the static content will be fetched from the CDN server. It means that my VPS will be hit only once to get the HTML file; all other stuff will be downloaded from the CDN server.

If you are concerned about the cost of the CDN service, let me reassure you: I have paid 6 cents for the last 2 weeks. If you expect horrendous traffic you will pay more of course, but for a personal blog it will never reach more than a few dollars a month. So, let’s use GTmetrix for the last time…

THE RESULT

The result is amazing. As you can see in the picture below, the response time went down to 993ms. OK, if you took an average it would be a bit longer, but nevertheless I was fully satisfied! All the static content is served by the CDN server – look how quick it is – around 15ms per resource! I paid 50 USD for the VPS and a few bucks for the CDN. All in all, I fulfilled all of the requirements specified, kept the config KISS and had a lot of fun!

As you can see, the long hours spent playing with Apache were fruitful. Not only did I learn a lot, but I also configured a pimped-up, lightning-fast WordPress blog. I hope you enjoyed it and I am anxious to hear your testimonies!

Comments

  1. lmoren says:

    Great explanation. Have you tried solutions like cloudflare.com? I’m wondering what the improvement would be with such a platform. Cloudflare is a free service and very easy to set up.

    • Tom Bujok says:

      Hi,

      Thanks for your feedback. I have never tried cloudflare.com, but it looks interesting. It may be a nice alternative to the Amazon CloudFront platform.

      Cheers,
      Tom

  2. Paul says:

    Hi,

    Thanks for the detailed exposé about your setup. I was wondering: how does one evaluate if the response time for a given website is good or not? Is it absolute (like: whatever the website, it should always be under 1s) or relative to the type of content being delivered? Pingdom is telling me that the average response time for my website over the past 30 days is around 600ms. I think it’s decent but I’m not sure how it compares to other sites.

    Paul

  3. wim says:

    Nice Article, but what font are you using? It has holes in almost every letter and is really hard to read. Now your pages load in a jiffy but deciphering the font takes ages ;)

  5. shalabh tayal says:

    Excellent blog sir ! very well written

  6. optimizing is not that easy really but the tips you have given here are pretty elaborated and will help many people for sure. specially beginners will be able to optimize their self hosted wordpress sites much better following these valuable tips and get good page speed score. i am in the chase of getting good score and some of your tips gonna help me i am sure :) thanks and looking forward for more titles.

    retweet done.

    Robin.

  7. Robin says:

    Had no idea a CDN would do such a difference – amazing! Thanks for publishing this, will implement the CDN on my site!

  8. Cloud Server says:

    Thanks for the sharing.

    Really very knowledgeable and well defined solution.

  9. Chathu says:

    Could you please tell me why you recommend disabling gzip in the plugin if it’s already enabled in Apache? I use Apache + nginx + W3TC, and all of them have gzip enabled.

  10. Awesome things here. I am very happy to see your article.

    Thank you so much and I’m taking a look forward to contact you.
    Will you please drop me a mail?
