Always Get Better

Never stop looking for ways to improve

January 30th, 2012

All web site owners should feel a burning need to speed. Studies have shown that viewers waiting more than 2 or 3 seconds for content to load online are likely to leave without allowing the page to fully load. This is particularly bad if you’re trying to run a web site that relies on visitors to generate some kind of income – content is king but speed keeps the king’s coffers flowing.

If your website isn’t the fastest it can be, you can take some comfort in the fact that the majority of the “top” web sites also suffer from page load times pushing up into the 10 second range (have you BEEN to Amazon lately?). But do take the time to download YSlow today and use its suggestions to start making radical improvements.

I’ve been very interested in web server performance because it is the first leg of the web page’s journey to the end user. The speed of execution at the server level is capable of making or breaking the user’s experience by controlling the amount of ‘lag time’ between the web page request and visible activity in the web browser. We want our server to send page data as immediately as possible so the browser can begin rendering it and downloading supporting files.

Not long ago, I described my web stack and explained why I moved away from the “safe” Apache server solution in favour of nginx. Since nginx doesn’t have a PHP module I had to use PHP’s FastCGI (PHP FPM) server with nginx as a reverse proxy. Additionally, I used memcached to store sessions rather than writing to disk.

Here are the configuration steps I took to realize this stack:

1. Memcached Sessions
Using memcached for sessions gives me slightly better performance on my Rackspace VM because in-memory reading&writing is hugely faster than reading&writing to a virtualized disk. I went into a lot more detail about this last April when I wrote about how to use memcached as a session handler in PHP.

2. PHP FPM
The newest Ubuntu distributions have a package php5-fpm that installs PHP5 FastCGI and an init.d script for it. Once installed, you can tweak your php.ini settings to suit, depending on your system’s configuration. (Maybe we can get into this another time.)

3. Nginx
Once PHP FPM was installed, I created a site entry that would pass PHP requests forward to the FastCGI server, while serving other files directly. Since the majority of my static content (css, javascript, images) have already been moved to a content delivery network, nginx has very little actual work to do.


server {
listen 80;
server_name sitename.com www.sitename.com;
access_log /var/log/nginx/sitename-access.log;
error_log /var/log/nginx/sitename-error.log;
# serve static files
location / {
root /www/sitename.com/html;
index index.php index.html index.htm;

# this serves static files that exists without
# running other rewrite tests
if (-f $request_filename) {
expires 30d;
break;
}

# this sends all-non-existing file or directory requests to index.php
if (!-e $request_filename) {
rewrite ^(.+)$ /index.php?q=$1 last;
}
}

location ~ \.php$ {
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /www/sitename.com/html$fastcgi_script_name;
include fastcgi_params;
}
}

The fastcgi_param setting controls which script is executed, based upon the root path of the site being accessed. All of the requests parameters are passed through to PHP, and once the configuration is started up I didn’t miss Apache one little bit.

Improvements
My next step will be to put a varnish server in front of nginx. Since the majority of my site traffic comes from search engine results where a user has not yet been registered to the site or needs refreshed content, Varnish can step in and serve a fully cached version of my pages from memory far faster than FastCGI can render the WordPress code. I’ll experiment with this setup in the coming months and post my results.

April 9th, 2011

By default PHP loads and saves sessions to disk. Disk storage has a few problems:

1. Slow IO: Reading from disk is one of the most expensive operations an application can perform, aside from reading across a network.
2. Scale: If we add a second server, neither machine will be aware of sessions on the other.

Enter Memcached
I hinted at Memcached before as a content cache that can improve application performance by preventing trips to the database. Memcached is also perfect for storing session data, and has been supported in PHP for quite some time.

Why use memcached rather than file-based sessions? Memcache stores all of its data using key-value pairs in RAM – it does not ever hit the hard drive, which makes it F-A-S-T. In multi-server setups, PHP can grab a persistent connection to the memcache server and share all sessions between multiple nodes.

Installation
Before beginning, you’ll need to have the Memcached server running. I won’t get into the details for building and installing the program as it is different in each environment, but there are good guides on the memcached site. On Ubuntu it’s as easy as aptitude install memcached. Most package managers have a memcached installation available.

Installing memcache for PHP is hard (not!). Here’s how you do it:


pecl install memcache

Careful, it’s memcache, without the ‘d’ at the end. Why is this the case? It’s a long story – let’s save the history lesson for another day.

When prompted to install session handling, answer ‘Yes’.

Usage
Using memcache for sessions is as easy as changing the session handler settings in PHP.ini:
session.save_handler = memcache
session.save_path = “tcp://127.0.0.1:11211″ (assuming memcached is set up to use default port)

Now restart apache (or nginx, or whatever) and watch as your sessions are turbo-charged.

November 20th, 2010

I like PHP because it makes it really easy to quickly build a website and add functionality, and is generally lightning fast when executed without needing to wait for compilation as with ASP.NET or Java (Yes, we always pre-compile those languages prior to putting our applications into production, but with PHP we don’t even have to do that).

Even though compilation is very fast, it still has a resource and time cost, especially on high-traffic servers. We can improve our response times by more than 5x by pre-caching our compiled opcode for direct execution later. There are a few PHP accelerators which accomplish this for us:

Xcache
Xcache is my favourite and is the one I use in my own configurations. It works by caching the compiled PHP opcode in memory so PHP can be directly executed by the web server without expensive disk reads and processing time. Many caching schemes also use Xcache to store the results of PHP rendering so individual pages don’t need to be re-processed.

APC (Alternative PHP Cache)
APC is a very similar product to XCache – in fact XCache was released partially as a response to the perceived lag APC’s support for newer PHP versions. APC is essentially the standard PHP Accelerator – in fact, it will be included by default in PHP 6. As much as I like XCache, it will be hard to compete with built-in caching.

MMCache
Turck MMCache is one of the original PHP Accelerators. Although it is no longer in development, it is still widely used. An impressive feature of MMCache is its exporter which allows you to distribute compiled versions of your PHP applications without the source code. This is useful for those companies that feel they need to protect their program code when hosting in client environments.

eAccelerator
eAccelerator picked up where MMCache left off, and added a number of features to increase its usability as a content cache. Over time, the content caching features have been removed as more efficient and scalable solutions like memcache have allowed caches to be shared across web servers.

Keep Optimizing
One major consideration that often goes forgotten when optimizing website speeds that not all of your visitors will be using a high-speed connection; some users will be using mobile or worse connections, even for non-mobile sites. Every ounce of speed will reflect favourably on you and improve your retention rates – and ultimately get more visitors to your ‘call to action’ goals. I’ll go into more detail about bigger speed improvements we can make in a later post.

January 24th, 2010

When running a default installation, WAMP Server’s Apache crashes when installing X-Cart. The error happens immediately after setting up the MySQL tables and is caused by the curl extension in PHP.

To resolve, click on the WAMP Server icon in the task tray, go to Apache -> Version -> Get More. Download one of the 2.0 series Apache servers and install it.

Repeat the process for PHP; download one of the 5.0 series PHP versions and install it.

Switch WAMP Server to the new versions and run the install again – it should work with no problems.

May 2nd, 2008

Has anyone else had this issue?

When I try to install Drupal on a Windows 2003 Apache server, I get a pause and then “Connection Reset” error under Firefox.  If I then try to install it using Internet Explorer, the installation process comes up immediately and works without a hitch.

I still can’t seem to get the admin ‘Modules’ page to load at all – PHP is crashing.  That is an entirely separate issue as far as I can tell.

April 21st, 2008

Recently I had to set up a WordPress blog on a Windows 2003 server, which involved installing Apache, PHP and MySQL.  Everything went pretty well – except for the typical PHP.ini problems I shall write about later.

Once the database was set up and I had the blog running, I tried to adjust the permalink structure using .htaccess rewrite rules so they would be more human-readable.

Out of the box, mod_rewrite isn’t enabled on Windows Apache.

The steps I took to resolve the problem were:

  1. Uncomment the line containing ‘LoadModule mod_rewrite modules/mod_rewrite.so
  2. In the VirtualHost directive hosting the blog, add ‘AllowOveride All
  3. Restart Apache

It works like a charm, now.  All the tools ship with Apache, it’s just a matter of configuring them as needed.

February 11th, 2008

There is a lot of conflicting advice with regard to the proper way of using SQL connections in web work.  The two main schools of thought are:

 

  1. Open a connection once, use it for everything and free it when the page request is complete.
  2. Open a connection only when it is needed, then fre it right away.

 

It All Comes Down to Speed

Any operation performed by the server has a cost associated with it.  Loops and counts happen on the CPU and are incredibly fast.  Caches are stores in system memory and seem fast but are in fact thousands of times slower than direct CPU operation.

 

As far as internal server access goes, reading and writing to the hard drive is representative of the most expensive operation performed by applications.  If possible, memory is oftent he best storage medium for program instructions.

 

Consider Database Access Costs

Databases live in a realm somewhere between hard drives and memory.  A good DBMS reacts to requests by storing execution plans and data in system memory thereby reducing the amount of time applications have to wait for information to load off the hard drive.

 

In terms of execution costs, connecting to the database is up there with reading from the hard drive.  Is a database connetion isn’t needed, don’t create one.

 

Which is better?

Considering the high price of connecting to a database, which of the following makes the most sense for our web application?

 

  1. Creating the connection to the database, performing all operations, and closing the connection after the is complete, org;
  2. For every request, creating a connection, executing a query, then closing the connection.  Repeat every time a new piece of information is required.

 

In classic ASP’s ADO, #1 was the best option.  In modern ASP.NET’s ADO.NET, #2 is the way to go.  That was a trick question.

 

Classic ASP – ADO

Using classic ASP, the developer has a great deal of control over the sequence in which their code will run.

 

It’s very easy to open a single connection at the top of a web page and use it for all of that page’s database operations.  At the end of the page, the developer writes code logic to free all record sets used, close memory objects, and release the connection.

 

ADO.NET

When writing code for ASP.NET, we strive to be “good” .NET citizens.  We are given a lot of power to write application logic never before possible on the web, and we are asked to relinquish some of our finer-grained control over resources.

 

Generally, managed resources benefits the programmer by freeing them to focus more on the business requirements of their project rather than their specific environment’s implementation.  Database connectivity, however, is a concept that has to be updated to fit the .NET paradigm.

 

Whereas before we controlled when database resources were released to the system, now the garbage collector has taken responsibility for the task.  If we open a database connection when the page request begins, we have no real guarantee it will be released in a timely manner when the page request completes.

 

This means the only way to prevent the server from running out of connections is by creating, using, and immediately freeing them every time we want to retrieve information from the database.

 

What about the cost of execution?  One would assume we’ve taken two steps back but ASP.NET introduces connection pooling for this situation.

 

Connection Pooling is your friend

Any time a connection is requested, the server looks to a special pool of memory objects where it keeps previously used database connections.  If there are any connections no longer being used, the server will return those unused connections before creating a new one.

 

Essentially the server only incurs the cost of creating a connection the first time it is used, or whenever demand for a particular application spikes.

 

When our program is done with the database, it releases the connection back into the pool for other requests.

 

Never Create Global Connection Objects

The morale of the story is: work with the .NET framework and let it do its job.  As long as we are willing to let go of old ways of thinking and adapt to the new technologies, our programming lives will become ever simpler.

 

Every day web applications grow in their complexity.  Fortunately the power and capabilities of the tools we have available also continue to improve.

 

Expect databases to continue driving bigger and better web presences.  The more skilled we become at working with them the more likely we are to succeed in creating stable, reliable and scalable applications to take advantage of them.