Always Get Better

Never stop looking for ways to improve

November 4th, 2011

Hopefully when you do web work, you’re not developing code on the same server your users are accessing. Most organizations have at least some kind of separation between their development and production code, but it’s possible to go much further. Separating environments lets you run continuous integration at several stages at once, each with its own purpose.

These normally break down as follows:

Development
Working code copy. Changes made by developers are deployed here so integration and features can be tested. This environment is rapidly updated and contains the most recent version of the application.

Quality Assurance (QA)
Not all companies will have this. The quality assurance environment provides a less frequently changed version of the application which testers can perform checks against. This allows reporting on a common revision so developers know whether particular issues found by testers have already been corrected in the development code.

Staging/Release Candidate
This is the release candidate, and this environment is normally a mirror of the production environment. The staging area contains the “next” version of the application and is used for final stress testing and client/manager approvals before going live.

Production
This is the currently released version of the application, accessible to the client/end users. This version preferably does not change except during scheduled releases. There may be differences in the production environment but generally it should be the same as the staging environment.

Having separation between the different environments is not tricky, but managing your data environment can be. There are, of course, all kinds of ways to solve the problem.

April 12th, 2011

One of my favourite aspects of the cloud is the ease with which we can create new VMs to test our wacky architecture theories. It’s so easy (and cheap!) to spin up a small server cluster for some serious load testing, and then destroy it again when done.

If nothing else, it provides a safety net and teaches you how to squeeze every ounce of performance out of big and small server instances. Let’s examine ways in which we can tune Apache to serve dynamic sites much faster.

Turn Off Modules You’re Not Using
This should be fairly obvious, but Apache ships with a number of modules which can affect performance but which most of us never need. Check your /etc/apache2/mods-enabled folder (the Debian and Ubuntu location) to see what can be removed.

Never Trust Defaults
The default Apache settings are optimized for a website serving static files only. Booorrring! Never be afraid to question what you see in the configuration files; the more you understand about the inner workings of the system, the better you will be able to improve its performance.

RAM is good, Swap is Bad
Running out of physical memory (RAM) and hitting the hard drive’s swap space is bad, especially in the Virtual Machine world. When this happens your performance will nose dive; your machine may even crash. The simplest solution is to increase the amount of RAM available to your server, but if that is too costly or impossible, read on.
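One rough way to stay out of swap is to work out how many Apache processes actually fit in the RAM you have. The sketch below is back-of-the-envelope arithmetic only; the figures (2 GB of RAM, 512 MB reserved for the OS and other services, 40 MB per Apache process) are made-up assumptions, so measure your own processes before trusting the result.

```python
def max_apache_workers(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache processes fit in RAM without touching swap."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Whole processes only; never report a negative count.
    return max(available_mb // per_process_mb, 0)


# Hypothetical server: 2048 MB total, 512 MB kept back, 40 MB per process.
print(max_apache_workers(2048, 512, 40))  # 38
```

The number that comes out is a ceiling for a setting such as MaxClients, not a target; leave yourself some headroom.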

Kill the KeepAlive
Whenever a request is made to the web server, it keeps the network connection open for a small amount of time (often 15 seconds). During that time, if the visitor’s web browser needs to get another file, it goes through the same connection thereby avoiding wasting time re-connecting to your server. The problem is the open connection will use up space in your connection pool so if your site is under heavy load new visitors will get queued up and may experience slowdowns trying to access your content.

If Apache is your front-end web server, set the KeepAliveTimeout to 2 seconds. This will keep the number of requests fluid even under heavy load.

If your server is behind a reverse proxy or load balancer like nginx or HAProxy where KeepAlives are not honoured, turn this setting off entirely.
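To see why the timeout matters so much, here is a small worst-case estimate, assuming every visitor holds a connection slot for the length of the request plus the full idle timeout. The pool size (150 slots) and request time (0.1 seconds) are assumptions picked for illustration, not measurements.

```python
def max_new_connections_per_second(workers, service_seconds, keepalive_seconds):
    """Worst case: each connection occupies a slot for the request plus the whole idle timeout."""
    return workers / (service_seconds + keepalive_seconds)


# Hypothetical pool of 150 slots and 0.1 s per request, at three timeout settings.
for timeout in (15, 2, 0):
    rate = max_new_connections_per_second(150, 0.1, timeout)
    print(f"KeepAliveTimeout {timeout:>2}s: about {rate:.0f} new connections/second")
```

Real traffic is kinder than this, since browsers reuse the connection for several files, but the trend is the point: the shorter the idle wait, the more visitors the same pool can absorb.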

Don’t Serve Static Files
Apache is a memory hog. Since each hit to the server is relatively heavy in terms of threads and memory, we are in better shape when we serve non-changing static content like images, stylesheets and JavaScript using a lightweight event-driven server like nginx or lighttpd, or even a caching proxy like Varnish (bonus points for using a CDN to serve static files, avoiding the hit to your server at all).

Turn off HostnameLookups
This should already be done by default in your Apache configuration; if it isn’t, do it now. When HostnameLookups is on, Apache checks every incoming request’s IP address for its host name. This can dramatically increase your latency, and isn’t healthy for DNS servers either.

Disable AllowOverride
It is tempting to set AllowOverride to All in order to give your .htaccess files free rein to do as they please. The downside of this directive is that every time anything is requested Apache will need to check that folder and every one of its parents all the way up to the site root in order to check for .htaccess commands. Apache recommends setting AllowOverride to None globally, enabling it only for directories whose settings can’t be placed in the site configuration.

April 11th, 2011

Play framework must be the best-kept secret in the Java world. If you haven’t had a chance to see their totally awesome demonstration video where they build a full app before your eyes in a matter of minutes, go – go now. Then come back.

Why do I like this framework so much? Put simply, it is an elegant solution for nearly every problem I’ve ever run into in developing websites, both single-server and multi-server applications. Don’t take my word for it, see for yourself.

The Goodness of Java
To my mind, Java has the edge over more common scripting languages (like PHP) because it is compiled (fast) and statically typed (reliable). In the past, using compiled languages on the web was only possible if you were using ASP.NET or willing to put up with the hassles of existing Java frameworks and servers.

Play’s first innovation comes from wrapping the Java runtime and a built-in web server behind a Python command-line launcher; using a Play application is as easy as running a command line script and connecting with your web browser. Play’s second innovation is its just-in-time compilation and display of error messages; if you make a mistake you will know in the amount of time it takes to hit refresh on your web browser.

Since it IS Java, programmers can use libraries they have built for other applications or sourced from other vendors and plug them directly into their code. This is one of the advantages Microsoft has had going for it and it is good to see it implemented so nicely in the open source world.

The Ease of Rails
Love it or hate it, Ruby on Rails has had an effect on the entire web world and its reach is definitely felt in the Play framework. Everything from the routing to JPA integration has that minimal-configuration design that is so prevalent in the Ruby world. Play has the edge though, due to Java annotations and the extra control you get as a developer.

Baked-in Unit Testing
Admittedly this is the first thing that drew me to the Play framework. Unit testing has to be one of the most important aspects of good programming; in fact, if your code is not covered by unit tests, I argue it is incomplete. Play has terrific support for unit testing, functional testing and selenium-based web testing. In version 1.1, Play added a headless web testing mode, paving the way to run framework applications in the context of an automated build environment – smart move!

Although awkward at first, using YAML files for database fixtures makes a lot of sense. Managing database access in unit tests has always been a challenge but thanks to the in-memory database server and fixture files Play offers us database integration testing – giving us the fresh-start benefits of mock frameworks along with the soundness of mind that comes from knowing you are testing the real database.

Share Nothing Architecture
Call it laziness, call it human error. At some point in the development cycle, the session always seems to end up carrying user data around. Even with data-sharing applications like memcached, that style of development does not scale well. With Play, sessions are stored in user cookies and consist of a small amount of data signed with the application’s secret key. The idea is the application takes care of loading any additional information it needs from this seed information, so the web cluster can be expanded to hundreds of nodes or reduced to a single server with no performance penalties on the other servers. Each Play instance operates as if it is the only one in existence, making it far easier to support complex site architectures.
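Here is a minimal sketch of the cookie idea, assuming the session holds nothing but a user id protected by an HMAC signature. This is not Play’s own code; `SECRET_KEY`, `sign_session` and `read_session` are hypothetical names, and the sketch signs the value rather than encrypting it.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical secret, identical on every node


def _signature(user_id: str) -> str:
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()


def sign_session(user_id: str) -> str:
    """Build the cookie value: the user id plus a signature that exposes tampering."""
    return f"{user_id}:{_signature(user_id)}"


def read_session(cookie: str):
    """Return the user id if the signature checks out, otherwise None."""
    user_id, _, signature = cookie.rpartition(":")
    expected = _signature(user_id)
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return user_id
    return None


cookie = sign_session("5677")
print(read_session(cookie))          # 5677
print(read_session("5678:forged"))   # None
```

Because any node holding the same secret can verify the cookie, no server needs to remember anything between requests; everything else about the user is loaded from the id.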

April 10th, 2011

It’s hard not to love memcache. As soon as you manage a web site that has more than a few concurrent visitors, the performance benefit of caching becomes immediately obvious. MySQL is a fast database and can outperform a lot of its competitors, but no matter how quickly it can pull results it can never outperform the retrieval speed of the server’s RAM.

The basic premise is: instead of pulling a model out of the database, see if it has already been loaded into memory by checking a key-value dictionary (for example: User5677). If the user has not been read from the database yet, the key will be missing, so we fetch the record and store it under that key. Next time we need that data we check the key-value store again and avoid querying the database.

This really saves us whenever we have data that changes infrequently. Take, for example, an ecommerce website: since the products and categories on the site will change very rarely, it makes a lot of sense to store them in memory for fast retrieval. Even more volatile information (like user data) can be stored in the cache, as long as the application knows to empty that cache key when the data gets changed.
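The read-then-fill pattern, plus the “empty the key on change” rule, can be sketched in a few lines. A plain dictionary stands in for the memcache client here, and `DATABASE`, `get_user` and `update_user` are made-up names for illustration.

```python
cache = {}  # stand-in for a memcache client
DATABASE = {5677: {"id": 5677, "name": "Alice"}}  # made-up record
stats = {"db_reads": 0}


def load_user_from_db(user_id):
    """Pretend database query; counts reads so the cache's effect is visible."""
    stats["db_reads"] += 1
    return dict(DATABASE[user_id])


def get_user(user_id):
    key = f"User{user_id}"
    user = cache.get(key)
    if user is None:               # cache miss: go to the database...
        user = load_user_from_db(user_id)
        cache[key] = user          # ...and remember the result for next time
    return user


def update_user(user_id, **changes):
    DATABASE[user_id].update(changes)
    cache.pop(f"User{user_id}", None)  # empty the key so stale data is never served


get_user(5677)
get_user(5677)
print(stats["db_reads"])  # 1
```

The second call never touches the database. After an `update_user` call the key is gone, so the next read fetches the fresh record and caches it again.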

Memcache is an ideal tool for managing these kinds of caches, and provides a lot of flexibility for growth.

History Lesson
Earlier this week I promised to go deeper into memcache’s origins. Memcache was originally developed at Danga as a way to reduce the database load and improve the speed of LiveJournal.

Rather than developing a standalone server application, Danga’s engineers designed memcache to sit on lower-end hardware and on web servers where it would use a small amount of the overall memory. Memcache instances don’t talk to each other: the client machines are aware of all the memcache instances and attempt to write their information evenly to each. This allows memcache to scale almost limitlessly without adding significant overhead to the caching process.
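Since the servers never coordinate, it is the client that decides where each key lives. A minimal sketch of that decision, assuming a simple hash-and-modulo scheme and three hypothetical server addresses:

```python
import zlib

SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]  # hypothetical hosts


def server_for(key, servers=SERVERS):
    """Pick a server from a stable hash of the key, so every client makes the same choice."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


for key in ("User5677", "User5678", "Product42"):
    print(key, "->", server_for(key))
```

Real client libraries often use more elaborate schemes, such as consistent hashing, so that adding or removing a server moves fewer keys around. The principle is the same: the key alone determines the server.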

When to Use
Quite simply: if you’re building an application for the LAMP stack, build in memcache support. When treated as a necessary component from the beginning, caching support adds almost zero overhead to development; however it will always pay off as soon as real world traffic is coming to your site.

April 8th, 2011

From a branding perspective, social media is about joining the conversation rather than trying to constantly send out broadcasts. Any idea worth discussing is already being talked about – if you ignore social media you aren’t just failing to get your message out into the wild; you are, in fact, allowing your voice to be absent from the existing discussion. There is a seismic shift occurring in the way brands and their respective owners are thinking about engaging their target audience. It isn’t good enough to just get the message out anymore – more attention is being placed into measuring the effectiveness of that message.

This isn’t a new idea; in fact, people have been talking about brands for as long as brands have existed. It’s well known that behind every customer who speaks up about their disappointment or service problem are ten others who simply switched to a different supplier. Figuring out what people are saying “on the street” and reacting to improve based on customer expectations isn’t a new concept; Facebook, Twitter and the blogosphere are only tools that make this much easier – they did not invent the conversation. So what’s the big deal?

The difference we are seeing today is the easy access to information that was not present before. Employees at all levels of the organization have access to the same outside data, the same instant feedback to everything being done. Ofttimes the worker at the lowest level has more sense of customer feelings than does the decision-making upper management – this has always been true, of course, so why the sudden magnification?

I believe we are seeing a generational change in business and mindset that is putting people ahead of function. Call it Generation X (over-workers to a fault) passing the torch over to Generation Y (family-focused individuals). In the next several years we are going to see a greater focus toward grassroots-based marketing efforts and a continuation of the trend toward niche-based services alongside the dismantling of mainstream distribution channels.

How to control this? Don’t. Service the customer and listen to their feedback. The same ingredients that have always made businesses successful are still in place: the difference is it is now easier than ever to hear the feedback faster.

January 19th, 2010

Hey folks, it’s been a while!

I’ve been playing with Google’s Go language, and will be sharing what I’ve learned over the coming weeks.

First off, which seems like the easier way to compile your source code? This:

6g fib.go
6l fib.6
mv 6.out fib

or this?


make

Personally, I prefer using a Makefile, even for a small project with one source file.

Without further ado, a simple Makefile for the Go Language:


GC = 6g
LD = 6l
TARG = fib

O_FILES = fib.6

# Recipe lines must start with a tab character, not spaces.
.PHONY: all clean

all:
	$(MAKE) clean
	$(MAKE) $(TARG)

$(TARG): $(O_FILES)
	$(LD) -o $@ $(O_FILES)
	@echo "Done. Executable is: $@"

$(O_FILES): %.6: %.go
	$(GC) -c $<

clean:
	rm -rf *.[$(OS)o] *.a [$(OS)].out _obj $(TARG) *.6

November 5th, 2009

If you’re using SQL Server Management Studio Express under Windows Vista and see either of these errors:

CREATE DATABASE permission denied in database 'master'

or

The database [Name] is not accessible. (Microsoft.SqlServer.Express.ObjectExplorer)

Here’s the fix:

  1. Close SQL Server Management Studio Express
  2. Open your start menu and locate that program.
  3. Right-click on the Management Studio and choose ‘Run as Administrator’
  4. Fixed!

I swear the simplest solutions can be the hardest to find – hopefully this saves someone (or my forgetful self!) some aggravation.