Always Get Better

Never stop looking for ways to improve

April 30th, 2012

In just two weeks, Node: Up and Running will be released by O’Reilly Media. Writing a book has been a lot of hard work but also a terrific learning experience that I would love to repeat.

The biggest takeaway for me was how often I make stupid mistakes in my writing. As a developer and manager, I rely on my speaking and writing abilities every day – so I take my ability to express myself for granted because I have to do it every day.

When a professional editor takes a piece of writing, they aren’t looking at it in the same way a co-worker would. A co-worker knows me, understands some of the subtleties of the context I’m writing about, and can subconsciously apply meaning to ambiguities in the text or conversation. A casual reader doesn’t have the same context, and the copy editor is able to filter that out and make adjustments to the text that leave my meaning intact but change the delivery.

In other words, the text that came out of the editing process makes me look really smart (I wish!). I’ve learned the secret to clear communication is in keeping the message brief. Especially in a technical book, the audience can’t be expected to deconstruct prose – it’s up to the writer to make their point and get out of the way.

I’ve also learned that I use the same turns of phrases over and over again. Reading 50 pages of my own writing in a row with the same sentence transitions is boring as heck, and I’m able to see this strikingly clear when it’s annotated by a totally impartial writer.

February 29th, 2012

We can take for granted that whenever we introduce a library or framework to our application, we incur an overhead cost. The cost varies depending on what we’re trying to do, but we generally accept that the lost performance is worth it for the increased maintainability, functionality or ease of use.

For many teams, logging is something that gets thrown in the mix at the last minute rather than through of all the way through. That’s a shame because a well-implemented logging solution can make the difference between understanding what is going on in your system and having to guess by looking at the code. It needs to be lightweight enough that the overall performance is not affected, but feature-rich enough that important issues are surfaced properly.

Java programmers have had log4j for a long time, and log4net is a similarly mature solution in the .NET world. I’ve been watching log4php for awhile and now that it has escaped the Apache Incubator it is impressively full-featured and fast. But how much do all its features cost?

Benchmarks
I’ll be looking into different options as I go, but let’s consider a very basic case – you append all of your events to a text file. I’ve created a configuration that ignores all ‘trace’ and ‘debug’ events so only events with a severity of ‘INFO’ or above are covered.

In 5 seconds, this is what I saw:

Test Iterations
BASIC (direct PHP) 45,421
INFO STATIC 45,383
INFO DYNAMIC 41,847
INFO STATIC (no check) 51,801
INFO DYNAMIC (no check) 47,756
TRACE STATIC 310,255
TRACE DYNAMIC 213,554
TRACE STATIC (no check) 271,043
TRACE DYNAMIC (no check) 196,653

Tests
What is all that? There are two ways to initialize the logger class – statically, meaning declared once and used again and again; and dynamically, meaning declared each time. With log4X, we typically perform a log level check first, for example isTraceEnabled() to determine whether to proceed with the actual logging work.

Results
I was surprised by how little log4php actually lost in terms of speed versus raw PHP. The authors have clearly done a thorough job of optimizing their library because it runs at 90% of the speed of a direct access.

I’ve always intuitively used loggers as static variables – initialize once and use over and over. This seems to be the right way by a huge margin.

Checking for the log level before appending to the log was a big win for the INFO messages, which are always logged to the file due to the configuration settings. The intended use is to allow programmers to sprinkle their code with debug statements which don’t get processed – and therefore slow down – the production code. I would be very happen with this in my project. In the INFO metrics, the check slowed things down a bit – explained because the actual logging function performs the same check – so we are taking a double hit. But wait, there is a good reason…

The TRACE metric is interesting – these are events which are NOT appended to the log. In that case, when the check is not performed, we pass through the code more times. When the check is performed, the code has to execute deeper on the call stack before it figures out we aren’t doing any actual logging, taking more time.

Conclusion
If you know you will be logging every single event, don’t do a check. Otherwise do the check – it will save a lot of wasted cycles.

January 30th, 2012

All web site owners should feel a burning need to speed. Studies have shown that viewers waiting more than 2 or 3 seconds for content to load online are likely to leave without allowing the page to fully load. This is particularly bad if you’re trying to run a web site that relies on visitors to generate some kind of income – content is king but speed keeps the king’s coffers flowing.

If your website isn’t the fastest it can be, you can take some comfort in the fact that the majority of the “top” web sites also suffer from page load times pushing up into the 10 second range (have you BEEN to Amazon lately?). But do take the time to download YSlow today and use its suggestions to start making radical improvements.

I’ve been very interested in web server performance because it is the first leg of the web page’s journey to the end user. The speed of execution at the server level is capable of making or breaking the user’s experience by controlling the amount of ‘lag time’ between the web page request and visible activity in the web browser. We want our server to send page data as immediately as possible so the browser can begin rendering it and downloading supporting files.

Not long ago, I described my web stack and explained why I moved away from the “safe” Apache server solution in favour of nginx. Since nginx doesn’t have a PHP module I had to use PHP’s FastCGI (PHP FPM) server with nginx as a reverse proxy. Additionally, I used memcached to store sessions rather than writing to disk.

Here are the configuration steps I took to realize this stack:

1. Memcached Sessions
Using memcached for sessions gives me slightly better performance on my Rackspace VM because in-memory reading&writing is hugely faster than reading&writing to a virtualized disk. I went into a lot more detail about this last April when I wrote about how to use memcached as a session handler in PHP.

2. PHP FPM
The newest Ubuntu distributions have a package php5-fpm that installs PHP5 FastCGI and an init.d script for it. Once installed, you can tweak your php.ini settings to suit, depending on your system’s configuration. (Maybe we can get into this another time.)

3. Nginx
Once PHP FPM was installed, I created a site entry that would pass PHP requests forward to the FastCGI server, while serving other files directly. Since the majority of my static content (css, javascript, images) have already been moved to a content delivery network, nginx has very little actual work to do.


server {
listen 80;
server_name sitename.com www.sitename.com;
access_log /var/log/nginx/sitename-access.log;
error_log /var/log/nginx/sitename-error.log;
# serve static files
location / {
root /www/sitename.com/html;
index index.php index.html index.htm;

# this serves static files that exists without
# running other rewrite tests
if (-f $request_filename) {
expires 30d;
break;
}

# this sends all-non-existing file or directory requests to index.php
if (!-e $request_filename) {
rewrite ^(.+)$ /index.php?q=$1 last;
}
}

location ~ \.php$ {
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /www/sitename.com/html$fastcgi_script_name;
include fastcgi_params;
}
}

The fastcgi_param setting controls which script is executed, based upon the root path of the site being accessed. All of the requests parameters are passed through to PHP, and once the configuration is started up I didn’t miss Apache one little bit.

Improvements
My next step will be to put a varnish server in front of nginx. Since the majority of my site traffic comes from search engine results where a user has not yet been registered to the site or needs refreshed content, Varnish can step in and serve a fully cached version of my pages from memory far faster than FastCGI can render the WordPress code. I’ll experiment with this setup in the coming months and post my results.

January 25th, 2012

Now that WebOS is being made open source, HP has released a new version of the Enyo JavaScript framework. Whereas the first version of the framework only supported Webkit-based environments (like the HP Touchpad, or Safari or Chrome), the newer version has expanded support for Firefox and IE9 as well. Developers who created apps with the old framework will have to wait a little while longer before all of the widgets and controls from Enyo 1.0 are ported over.

What does this mean for app developers? Now that Enyo is open-source, it means applications built on the platform will run on Android and iOS. But it’s not a disruptive technology – both Android and iOS have supported HTML5 applications for quite awhile; HP will be competing against mature frameworks like jQuery Mobile.

As a WebOS enthusiast I am definitely going to put some time into continuing my explorations of Enyo, but it’s getting harder and harder to justify the investment. My Pre is getting pretty old at this point, and hardware manufacturers have yet to express interest in making new devices to take advantage of WebOS. If I end up switching to Android with my next hardware purchase, it’s going to shift my priorities away from Enyo and its brethren.

January 16th, 2012

It’s hard to believe but this site is four years old. Wow! Time has flown, and I’ve learned a lot – hopefully these years have been helpful for you too!

January 14th, 2012
Mimbo - A Friendly Robot
Creative Commons License photo credit: langfordw

If you don’t want a search engine to read some or all of the files on your site, you can create a robots.txt file. (Looking through the blog archive, I realize I’ve never gone through the construction and contents of that important file, so this is a promise to one day return and fix that!)

When you want the opposite – accessible pages and author credit, create a humans.txt file. Although not an “official” standard, it is a fun way to acknowledge the (sometimes many) hardworking individuals behind the creation of a web site.

An example is:

/* TEAM */
Leader: Mike Wilson
Site: http://www.alwaysgetbetter.com
Twitter: HawkWilson
Location: Ottawa, ON

/* THANKS */
Seth Godin: http://www.sethgodin.com
Steve Pavlina: http://www.stevepavlina.com
Phil Haack: http://haacked.com

/* SITE */
Last Update: Jan 14, 2012
Standards: HTML5, CSS3
Software: WordPress

In most cases, you would want to include at least a TEAM and SITE section. Clearly the exact fields are left to your imagination, but it’s a very simple way to acknowledge the people who helps (directly or in spirit) a site to get to fruition.

For more information about humans.txt, check out the initiative’s home page at http://humanstxt.org/.

December 31st, 2011
Hot Desk
Creative Commons License photo credit: mikecogh

This year started off with a foray into Rails, an experience I won’t be rushing to complete. Most of my time was spent building a small application during the Christmas break in 2010, but in 2011 I moved that site into production and wrote a little bit about separating production and development values (a feat I repeated for the Play! framework, which I actually like, later in the year). I think the only thing I really like from Rails, and this is a bit of a stretch, is the database migrations.

Having moved entirely over to a LAMP platform professionally, and getting good at the security nuances plus everything else, I reminisced a little about some of the creature comforts I missed in C#, like operators for default null variables. But when I discovered Time Machine on my Mac, there was no going back – until the company switched directions and I got thrown back into .NET development.

That’s right – back to .NET, and deep into the Windows Azure cloud. I dealt with things like figuring out which is better – table storage or SQL Azure, and figuring out the nuances of their multiple SLAs, and how to ensure we actually have Azure Compute instances on-line when it hits the fan. At this point I have a pretty good handle on Azure’s strengths and weaknesses and my overall impression of the platform is very positive. If I continue building sites on the Microsoft stack, I would definitely continue to use Azure – it seems more expensive than other options at first glance, but it has some serious computing power behind it and takes the majority of administration headaches off of my plate. It really has enabled me to, for the most part, just focus on development whereas I was spending an increasing amount of my work day on system administration issues when supporting the LAMP platform.

Scalability and high availability have been on my mind a lot, and I’ve been looking into some more ‘off-beat’ database solutions like Drizzle for my transactional needs, as well as speeding up existing deployments by moving as much as possible into RAM. There’s always a battle between changing the way we work to take advantage of the new paradigm or changing our existing configuration to get some more life out of it.

The whole cloud computing buzz feels tired but has enabled a whole new class of online business. If you have a lot of commodity hardware you can achieve, very cheaply, feats that were only possible with an expensive dedicated network just a few years ago. Sure, it adds a lot of new choke points you will need good people to help sort through, which is giving rise to a whole new sub-category of programmer specialization to make hiring in 2012 even more challenging.

There is a downside to all the cloud computing, though, as we learned during the high profile Amazon failures – backups are important. This includes geographically-redundant systems that most organizations don’t have the experience to deal effectively with just yet. Even so, the biggest lesson I learned was never let your server run into swap space or your performance will nose-dive. The growth of this site even prompted me to move more of the site into memory which prevented me from needing to spend a lot of money upgrading my infrastructure.

Social media continues to grow, with companies realizing they can’t control its effect on their business in traditional ways and less-than-useless cons ruining it for everyone by selling CEOs on cheap gimmicks.

Since my third child was born in February, I’ve definitely taken some time to sotp and reflect on what I want to work on, why I want to keep working, and what the next steps are career-wise and life-wise. I want to provide the best that I can for my family and 2012 is going to see a radical course change as I start to shift gears and begin building something that will really last, even outlast me. When will my website start paying my bills? I don’t expect it will.

I learned a lot by running dozens (over a hundred?) job interviews in the past two years. Ignoring the old adage that you shouldn’t judge a book by it’s cover, I learned that you can tell with pretty good accuracy whether or not someone will be a good match for your company within the first five minutes of an interview. I’m less interested in hiring people with domain knowledge than I am in surrounding myself with the most intelligent developers I can find – one is a skill that can be taught, the other is an aptitude candidates need to bring to the table. Really, when it comes down to it, what I really want is for people I hire to stand up for themselves (since they are adults) and make me look good by being awesome at what they do.

I also learned a lot by being responsible for some very large projects; things like the importance of continuous integration.

What’s next in 2012? Look for mobile device use to continue growth – every developer who plans to stay employed needs to know something about mobile development, because it’s going to be ubiquitous with regular desktop programming very soon. Now that version 0.6 has been released with Windows support is Node.js ready for prime-time? I had the opportunity to play with it a lot over the past few month – look for a book early in the new year co-authored by yours truly.