Always Get Better

Never stop looking for ways to improve

April 11th, 2011

Play framework must be the best-kept secret in the Java world. If you haven’t had a chance to see their totally awesome demonstration video where they build a full app before your eyes in a matter of minutes, go – go now. Then come back.

Why do I like this framework so much? Put simply, it is an elegant solution for nearly every problem I’ve ever run into in developing websites, both single-server and multi-server applications. Don’t take my word for it, see for yourself.

The Goodness of Java
To my mind, Java has the edge over more common scripting languages (like PHP) because it is compiled (fast) and statically typed (reliable). In the past, using compiled languages on the web was only possible if you were using ASP.NET or willing to put up with the hassles of existing Java frameworks and servers.

Play’s first innovation comes from wrapping the Java runtime inside a Python web server; using a Play application is as easy as running a command line script and connecting with your web browser. Play’s second innovation is its just-in-time compilation and display of error messages; if you make a mistake you will know in the amount of time it takes to hit refresh on your web browser.

Since it IS java, programmers can use libraries they have built for other applications or sourced from other vendors and plug directly into their code. This is one of the advantages Microsoft has had going for it and it is good to see it implemented so nicely in the open source world.

The Ease of Rails
Love it or hate it, Ruby on Rails has had an affect on the entire web world and its reach is definitely felt in the Play framework. Everything from the routing to JPA integration has that minimal-configuration design that is so prevalent in the Ruby world. Play has the edge though, due to Java annotations and the extra control you get as a developer.

Baked-in Unit Testing
Admittedly this is the first thing that drew me to the Play framework. Unit testing has to be one of the most important aspects of good programming; in fact, if your code is not covered by unit tests, I argue it is incomplete. Play has terrific support for unit testing, functional testing and selenium-based web testing. In version 1.1, Play added a headless web testing mode, paving the way to run framework applications in the context of an automated build environment – smart move!

Although awkward at first, using YAML files for database fixtures makes a lot of sense. Managing database access in unit tests has always been a challenge but thanks to the in-memory database server and fixture files Play offers us database integration testing – giving us the fresh-start benefits of mock frameworks along with the soundness of mind that comes from knowing you are testing the real database.

Share Nothing Architecture
Call it laziness, call it human error. At some point in the development cycle, the session always seems to end up carrying user data around. Even with data-sharing applications like memcached, that style of development does not scale well. With Play, sessions are stored in user cookies and consist of an encrypted key. The idea is the application takes care of loading any additional information it needs from this seed information, so the web cluster can be expanded to hundreds of nodes or reduced to a single server with no performance penalties on the other servers. Each Play instance operates as if it is the only one in existence, making it far easier to support complex site architectures.

April 10th, 2011

It’s hard not to love memcache. As soon as you manage a web site that has more than a few concurrent visitors, the performance benefit of caching becomes immediately obvious. MySQL is a fast database and can outperform a lot of its competitors, but no matter how quickly it can pull results it can never outperform the retrieval speed of the server’s RAM.

The basic premise is: instead of pulling a model out of the database, see if it has already been loaded into memory by checking a key-value diction (for example: User5677). If the user has not been read from the database yet, the key-value store will be empty and we can fetch the record. Next time we need that data we check the key-value again and avoid querying the database.

This really saves us whenever we have data that changes infrequently. Take, for example, an ecommerce website: since the products and categories on the site will change very rarely, it makes a lot of sense to store them in memory for fast recovery. Even more volatile information (like user data) can be stored in the cache, as long as the application knows to empty that cache key when the data gets changed.

Memcache is an ideal tool for managing these kinds of caches, and provides a lot of flexibility for growth.

History Lesson
Earlier this week I promised to go deeper into memcache’s origins. Memcache was originally developed at Danga as a way to reduce the database load and improve the speed of LiveJournal.

Rather than developing a standalone server application, Danga’s engineers designed memcache to sit on lower-end hardware and on web servers where it would use a small amount of the overal memory. Memcache instances don’t talk to each other: the client machines are aware of all the memcache instances and attempt to write their information evenly to each. This allows memcache to scale almost limitlessly without adding significant overhead to the caching process.

When to Use
Quite simply: if you’re building an application for the LAMP stack, build in memcache support. When treated as a necessary component from the beginning, caching support adds almost zero overhead to development; however it will always pay off as soon as real world traffic is coming to your site.

April 9th, 2011

By default PHP loads and saves sessions to disk. Disk storage has a few problems:

1. Slow IO: Reading from disk is one of the most expensive operations an application can perform, aside from reading across a network.
2. Scale: If we add a second server, neither machine will be aware of sessions on the other.

Enter Memcached
I hinted at Memcached before as a content cache that can improve application performance by preventing trips to the database. Memcached is also perfect for storing session data, and has been supported in PHP for quite some time.

Why use memcached rather than file-based sessions? Memcache stores all of its data using key-value pairs in RAM – it does not ever hit the hard drive, which makes it F-A-S-T. In multi-server setups, PHP can grab a persistent connection to the memcache server and share all sessions between multiple nodes.

Installation
Before beginning, you’ll need to have the Memcached server running. I won’t get into the details for building and installing the program as it is different in each environment, but there are good guides on the memcached site. On Ubuntu it’s as easy as aptitude install memcached. Most package managers have a memcached installation available.

Installing memcache for PHP is hard (not!). Here’s how you do it:


pecl install memcache

Careful, it’s memcache, without the ‘d’ at the end. Why is this the case? It’s a long story – let’s save the history lesson for another day.

When prompted to install session handling, answer ‘Yes’.

Usage
Using memcache for sessions is as easy as changing the session handler settings in PHP.ini:
session.save_handler = memcache
session.save_path = “tcp://127.0.0.1:11211″ (assuming memcached is set up to use default port)

Now restart apache (or nginx, or whatever) and watch as your sessions are turbo-charged.

March 3rd, 2011

If you are running something like Google Analytics on your website, you probably don’t want its associated JavaScript code to appear in your web browser while you’re developing (it would skew your statistics). In Rails, it is incredibly simple to block off a segment of markup for specific environments by using the Rails.env variable.

If you are using Passenger, your rails environment is set to ‘production’ by default, making it very easy to do something like this:

December 28th, 2010

The MongoDB classes do not come with default generators for Rails applications. In order to use the rails generate commands, you can use the indirect rails3-generators project available at https://github.com/indirect/rails3-generators/#readme

Installation:
1. Download the generators into your application’s lib folder:
cd myapp
git clone git://github.com/indirect/rails3-generators.git lib/generators

2. Install the rails3-generators gem
gem install rails3-generators

3. Add this to your project’s Gemfile
gem 'rails3-generators'

Usage:
rails generate model ModelName --orm=mongo_mapper

December 18th, 2010

There are better tools that give more information than this one, but I wanted to point out the reports generated by BuiltWith.

BuiltWith scans your site and displays a nice description of site components it finds – useful for non-tech users who need to gain a quick understanding of the architecture used by a competitor or interesting site.

November 20th, 2010

I like PHP because it makes it really easy to quickly build a website and add functionality, and is generally lightning fast when executed without needing to wait for compilation as with ASP.NET or Java (Yes, we always pre-compile those languages prior to putting our applications into production, but with PHP we don’t even have to do that).

Even though compilation is very fast, it still has a resource and time cost, especially on high-traffic servers. We can improve our response times by more than 5x by pre-caching our compiled opcode for direct execution later. There are a few PHP accelerators which accomplish this for us:

Xcache
Xcache is my favourite and is the one I use in my own configurations. It works by caching the compiled PHP opcode in memory so PHP can be directly executed by the web server without expensive disk reads and processing time. Many caching schemes also use Xcache to store the results of PHP rendering so individual pages don’t need to be re-processed.

APC (Alternative PHP Cache)
APC is a very similar product to XCache – in fact XCache was released partially as a response to the perceived lag APC’s support for newer PHP versions. APC is essentially the standard PHP Accelerator – in fact, it will be included by default in PHP 6. As much as I like XCache, it will be hard to compete with built-in caching.

MMCache
Turck MMCache is one of the original PHP Accelerators. Although it is no longer in development, it is still widely used. An impressive feature of MMCache is its exporter which allows you to distribute compiled versions of your PHP applications without the source code. This is useful for those companies that feel they need to protect their program code when hosting in client environments.

eAccelerator
eAccelerator picked up where MMCache left off, and added a number of features to increase its usability as a content cache. Over time, the content caching features have been removed as more efficient and scalable solutions like memcache have allowed caches to be shared across web servers.

Keep Optimizing
One major consideration that often goes forgotten when optimizing website speeds that not all of your visitors will be using a high-speed connection; some users will be using mobile or worse connections, even for non-mobile sites. Every ounce of speed will reflect favourably on you and improve your retention rates – and ultimately get more visitors to your ‘call to action’ goals. I’ll go into more detail about bigger speed improvements we can make in a later post.