Always Get Better

Archive for the ‘Web Programming’ Category

Speed Up Page Load By Tricking the Browser

Saturday, November 25th, 2017

Netiquette is built into web browsers. When you download a page, the browser fetches its files at most 2 at a time per hostname (by default). So if there are 12 CSS and JS files, you’ll only get 2 at a time until they have all loaded.

The good news is you can trick browsers into doing more at a time.

Enter subdomains.

Offload your CSS to css.yoursite.com, and your JavaScript files to js.yoursite.com. Now your website will load 6 files at a time (2 CSS files, 2 JavaScript files, 2 HTML/Image files from your main site).

It doesn’t matter if each subdomain is pointing to the same website. Web browsers apply the limit per hostname in the URL and look at nothing more.
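As a rough illustration, a template helper could pick the subdomain from the file extension. This is a minimal Python sketch, not code from the site: the `asset_url` helper, the `SHARDS` mapping and the `https` default are all assumptions made for the example, and a path carrying a query string would need extra handling.

```python
import posixpath
from urllib.parse import urlunsplit

# Hypothetical mapping from file extension to the subdomain that serves it.
SHARDS = {".css": "css", ".js": "js"}


def asset_url(path, site="yoursite.com", scheme="https"):
    """Return an absolute URL for path, choosing a subdomain by file extension."""
    extension = posixpath.splitext(path)[1].lower()
    subdomain = SHARDS.get(extension)
    # Anything that is not CSS or JavaScript stays on the main hostname.
    host = f"{subdomain}.{site}" if subdomain else site
    return urlunsplit((scheme, host, path, "", ""))
```

Calling `asset_url("/styles/main.css")` yields a URL on css.yoursite.com, while an image path stays on the main hostname.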

HTTP/2 is supposed to eliminate this problem, of course, but in the meantime doing this helps when you can’t crunch down your files any more.

Don’t Trust Your App to Node.js

Saturday, October 14th, 2017

One of the most common questions I get is around my lack of bullishness toward Node.js. People assume because I wrote two books about it, I should be an expert in all things Node (nope!) or at least a major cheerleader for it (hah!).

My problem has never been with Node or even JavaScript. I use both every day and will be the first to reach for them when a problem needs solving. Just not for web development. I’ll stick to PHP or .NET for that.

Again: Node is great, but it isn’t the platform for the web.

Don’t take my word for it. During a recent Mapping the Journey podcast, Ryan Dahl (Node’s creator) echoed these sentiments:

So, kind of the newer versions of Javascript has made this easier. That said, I think Node is not the best system to build a massive server web. I would use Go for that. And honestly, that’s the reason why I left Node. It was the realization that: oh, actually, this is not the best server-side system ever. (Source)

Bottom line is: pick your tech stack based on your objectives, not because something was cool today.

Case-insensitive string comparison in PHP

Sunday, September 29th, 2013

This is a common situation: I needed to compare two strings with unknown capitalization – “country” versus “Country”. Since these words should be considered equal even though the second “Country” has a capital “C”, we can’t do a straight comparison on the two – we need to do a case-insensitive comparison.

Option 1: strcasecmp
Whenever possible, developers should try to use built-in functions, which are compiled code and (in general) run much faster than anything you could write. PHP has strcasecmp to case-insensitively compare two strings. Sounds like a perfect match!


if ( strcasecmp('country','Country') === 0 ) {
    // strcasecmp returns 0 when the strings are equal ignoring case, so we have a match!
}

Option 2: strtolower
Always read the documentation, but draw your own conclusions. One commentator in the PHP documentation suggested developers never use strcasecmp, and use strtolower with regular equality like this:

if ( strtolower('country') === strtolower('Country') ) {
    // We have a match
}

Test the Speed
Both methods accomplish the same thing, but do we really want to skip using strcasecmp()? Which is the better option?
I wrote this short script to run each option for 10 seconds and see which is faster:

And the results:

strtolower: Done 18440869 cycles in 10 seconds
strcasecmp: Done 22187773 cycles in 10 seconds

So strcasecmp has the edge speed-wise, but not so huge that I would care to favour one over the other.
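A harness of this kind can be sketched in a few lines. The version below is Python rather than PHP and is not the script that produced the numbers above: `count_cycles`, `by_lower` and `by_casefold` are names made up for this example, and Python's `lower()` and `casefold()` stand in for the two PHP options only to show the shape of the loop.

```python
import time


def count_cycles(compare, seconds):
    """Call compare() repeatedly for roughly `seconds` and return the call count."""
    deadline = time.perf_counter() + seconds
    cycles = 0
    while time.perf_counter() < deadline:
        compare()
        cycles += 1
    return cycles


def by_lower():
    # Lower-case both sides, then use ordinary equality.
    return "country".lower() == "Country".lower()


def by_casefold():
    # Use the dedicated case-insensitive folding instead.
    return "country".casefold() == "Country".casefold()
```

Running `count_cycles(by_lower, 10)` and `count_cycles(by_casefold, 10)` gives two counts to compare. Reading the clock on every pass adds the same overhead to both, so only the relative difference means much.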

Apparently strcasecmp does not support multi-byte (e.g. Unicode) characters, but I haven’t tested this. Presumably that would give strtolower an advantage in projects dealing with non-English input (although strtolower is itself byte-oriented, and mb_strtolower is PHP’s multi-byte-aware variant). That is not the case at all in my particular use case, so I did not explore this avenue any further. I also didn’t try against non-ASCII characters, such as Latin accents; including those would be an improvement on this test.
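To show why accented characters matter, here is a small Python sketch that contrasts folding only the letters A to Z with Unicode-aware folding. It models the general idea of an ASCII-only comparison and is an assumption for illustration, not a verified description of how strcasecmp behaves; the function names are invented for the example.

```python
def ascii_fold(text):
    """Lower-case only the letters A-Z, leaving every other character untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def equal_ascii_only(a, b):
    """Case-insensitive comparison that only understands unaccented letters."""
    return ascii_fold(a) == ascii_fold(b)


def equal_unicode(a, b):
    """Case-insensitive comparison using Python's Unicode-aware casefold()."""
    return a.casefold() == b.casefold()


# "\u00e9" is a lower-case e with an acute accent, "\u00c9" is its upper-case form.
LOWER_WORD = "\u00e9cole"
UPPER_WORD = "\u00c9cole"
```

With these definitions, "country" and "Country" match under both functions, but `LOWER_WORD` and `UPPER_WORD` only match under `equal_unicode`.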

Using DateTime in the Play! Framework

Sunday, May 20th, 2012

Which data type should you use for time information on your Java models? Most of the Date class is deprecated, pushing us in the direction of the heavyweight Calendar class. Storing the milliseconds since the epoch in a Long is both database-friendly and easy to perform math on and convert at runtime. But if you really want a good experience, you should be using Joda Time.

Joda Time is built into the Play! Framework, and there really is no excuse to use anything else. But when it comes to saving dates in your JPA Models, there is a big “gotcha” in that there is no default type converter to move a DateTime object into and out of the database. Oops.

But that can be fixed with an annotation in your model, like this:

@Type(type="org.joda.time.contrib.hibernate.PersistentDateTime")
public DateTime created;

Unfortunately, Play does not ship with Hibernate support for the DateTime object. So to make this work you need to include the Joda-Time-Hibernate library in your dependencies.yml file:


require:
    - play
    - joda-time -> joda-time-hibernate 1.3

After updating the dependencies file, run the play deps --sync command to pull in the libraries from Maven. Your models will now save their date and time into MySQL, and your programming experience will be smooth as silk – at least as far as time-based functionality is concerned.
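For comparison, the milliseconds-since-epoch option mentioned at the start of this post is easy to sketch. The example below uses Python's standard library rather than Java or Joda Time, and the helper names `to_epoch_millis` and `from_epoch_millis` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis):
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)
```

The stored value is a plain integer, so adding 60,000 to it moves the timestamp forward by one minute. The trade-off is that the number carries no time zone or calendar meaning of its own.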

Re-Learning How to Write

Monday, April 30th, 2012

In just two weeks, Node: Up and Running will be released by O’Reilly Media. Writing a book has been a lot of hard work but also a terrific learning experience that I would love to repeat.

The biggest takeaway for me was how often I make stupid mistakes in my writing. As a developer and manager, I rely on my speaking and writing abilities every day, so I take my ability to express myself for granted.

When a professional editor takes a piece of writing, they aren’t looking at it in the same way a co-worker would. A co-worker knows me, understands some of the subtleties of the context I’m writing about, and can subconsciously apply meaning to ambiguities in the text or conversation. A casual reader doesn’t have the same context, and the copy editor is able to filter that out and make adjustments to the text that leave my meaning intact but change the delivery.

In other words, the text that came out of the editing process makes me look really smart (I wish!). I’ve learned the secret to clear communication is in keeping the message brief. Especially in a technical book, the audience can’t be expected to deconstruct prose – it’s up to the writer to make their point and get out of the way.

I’ve also learned that I use the same turns of phrase over and over again. Reading 50 pages of my own writing in a row with the same sentence transitions is boring as heck, and I’m able to see this strikingly clearly when it’s annotated by a totally impartial writer.

log4php Performance

Wednesday, February 29th, 2012

We can take for granted that whenever we introduce a library or framework to our application, we incur an overhead cost. The cost varies depending on what we’re trying to do, but we generally accept that the lost performance is worth it for the increased maintainability, functionality or ease of use.

For many teams, logging is something that gets thrown in the mix at the last minute rather than thought all the way through. That’s a shame because a well-implemented logging solution can make the difference between understanding what is going on in your system and having to guess by looking at the code. It needs to be lightweight enough that the overall performance is not affected, but feature-rich enough that important issues are surfaced properly.

Java programmers have had log4j for a long time, and log4net is a similarly mature solution in the .NET world. I’ve been watching log4php for a while, and now that it has escaped the Apache Incubator it is impressively full-featured and fast. But how much do all its features cost?

Benchmarks
I’ll be looking into different options as I go, but let’s consider a very basic case – you append all of your events to a text file. I’ve created a configuration that ignores all ‘trace’ and ‘debug’ events so only events with a severity of ‘INFO’ or above are covered.

In 5 seconds, this is what I saw:

Test                        Iterations
BASIC (direct PHP)          45,421
INFO STATIC                 45,383
INFO DYNAMIC                41,847
INFO STATIC (no check)      51,801
INFO DYNAMIC (no check)     47,756
TRACE STATIC                310,255
TRACE DYNAMIC               213,554
TRACE STATIC (no check)     271,043
TRACE DYNAMIC (no check)    196,653

Tests
What is all that? There are two ways to initialize the logger class – statically, meaning declared once and used again and again; and dynamically, meaning declared each time. With log4X, we typically perform a log level check first, for example isTraceEnabled() to determine whether to proceed with the actual logging work.
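The two ideas, a logger that is created once and a level check before the logging work, look roughly like this. The sketch uses Python's standard `logging` module, not log4php: `isEnabledFor` plays the role of isTraceEnabled(), DEBUG stands in for TRACE because Python has no TRACE level, and `expensive_summary` is a hypothetical helper.

```python
import logging

# "Static" style: one logger created at import time and reused on every call.
LOG = logging.getLogger("benchmark")
LOG.setLevel(logging.INFO)  # DEBUG messages are filtered out; INFO and above pass


def expensive_summary(items):
    """Stand-in for costly message building (hypothetical helper)."""
    return ", ".join(sorted(str(item) for item in items))


def process(items):
    """Handle a batch of items, logging at two different levels."""
    # Guarded: the summary is only built when DEBUG output is enabled.
    if LOG.isEnabledFor(logging.DEBUG):
        LOG.debug("processing %s", expensive_summary(items))
    # Unguarded: this level is enabled, and the logger repeats the check itself.
    LOG.info("processed %d items", len(items))
    return len(items)
```

The "dynamic" style from the benchmark would instead look the logger up inside `process` on every call.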

Results
I was surprised by how little log4php actually lost in terms of speed versus raw PHP. The authors have clearly done a thorough job of optimizing their library: with the level check in place, it ran at about 92% (dynamic) to nearly 100% (static) of the speed of direct access.

I’ve always intuitively used loggers as static variables – initialize once and use over and over. This seems to be the right way by a huge margin.

Checking for the log level before appending to the log is a big win for messages that are filtered out by the configuration. The intended use is to allow programmers to sprinkle their code with debug statements which don’t get processed by, and therefore don’t slow down, the production code. I would be very happy with this in my project. For the INFO messages, which are always logged to the file due to the configuration settings, the check slowed things down a bit (45,383 versus 51,801 iterations for the static logger). That is explained by the actual logging function performing the same check, so we are taking a double hit. But wait, there is a good reason…

The TRACE metric is interesting – these are events which are NOT appended to the log. In that case, when the check is performed, we pass through the code more times. When the check is not performed, the code has to execute deeper on the call stack before it figures out we aren’t doing any actual logging, taking more time.

Conclusion
If you know you will be logging every single event, don’t do a check. Otherwise do the check – it will save a lot of wasted cycles.

Setting up WordPress with nginx and FastCGI

Monday, January 30th, 2012

All web site owners should feel a burning need for speed. Studies have shown that viewers waiting more than 2 or 3 seconds for content to load online are likely to leave without allowing the page to fully load. This is particularly bad if you’re trying to run a web site that relies on visitors to generate some kind of income – content is king but speed keeps the king’s coffers flowing.

If your website isn’t the fastest it can be, you can take some comfort in the fact that the majority of the “top” web sites also suffer from page load times pushing up into the 10 second range (have you BEEN to Amazon lately?). But do take the time to download YSlow today and use its suggestions to start making radical improvements.

I’ve been very interested in web server performance because it is the first leg of the web page’s journey to the end user. The speed of execution at the server level is capable of making or breaking the user’s experience by controlling the amount of ‘lag time’ between the web page request and visible activity in the web browser. We want our server to send page data as quickly as possible so the browser can begin rendering it and downloading supporting files.

Not long ago, I described my web stack and explained why I moved away from the “safe” Apache server solution in favour of nginx. Since nginx doesn’t have a PHP module I had to use PHP’s FastCGI (PHP FPM) server with nginx as a reverse proxy. Additionally, I used memcached to store sessions rather than writing to disk.

Here are the configuration steps I took to realize this stack:

1. Memcached Sessions
Using memcached for sessions gives me slightly better performance on my Rackspace VM because in-memory reading and writing is hugely faster than reading and writing to a virtualized disk. I went into a lot more detail about this last April when I wrote about how to use memcached as a session handler in PHP.

2. PHP FPM
The newest Ubuntu distributions have a package php5-fpm that installs PHP5 FastCGI and an init.d script for it. Once installed, you can tweak your php.ini settings to suit, depending on your system’s configuration. (Maybe we can get into this another time.)

3. Nginx
Once PHP FPM was installed, I created a site entry that would pass PHP requests forward to the FastCGI server, while serving other files directly. Since the majority of my static content (CSS, JavaScript, images) has already been moved to a content delivery network, nginx has very little actual work to do.


server {
    listen 80;
    server_name sitename.com www.sitename.com;
    access_log /var/log/nginx/sitename-access.log;
    error_log /var/log/nginx/sitename-error.log;
    # serve static files
    location / {
        root /www/sitename.com/html;
        index index.php index.html index.htm;

        # this serves static files that exist without
        # running other rewrite tests
        if (-f $request_filename) {
            expires 30d;
            break;
        }

        # this sends all requests for non-existent files or directories to index.php
        if (!-e $request_filename) {
            rewrite ^(.+)$ /index.php?q=$1 last;
        }
    }

    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /www/sitename.com/html$fastcgi_script_name;
        include fastcgi_params;
    }
}

The fastcgi_param setting controls which script is executed, based upon the root path of the site being accessed. All of the request parameters are passed through to PHP, and once the configuration was up and running I didn’t miss Apache one little bit.
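The routing decisions in that server block can be restated as a small function. This Python sketch is only a simplified model of the configuration above, not of nginx itself: the `route` function and its handler labels are invented for the example, and it ignores query strings and the index lookup that happens for directories.

```python
def route(uri, files, directories):
    """Return (handler, target) for a request URI, mirroring the config above."""
    # The regex location for .php takes the request before "location /" does.
    if uri.endswith(".php"):
        return ("fastcgi", uri)
    # An existing file is served directly (with the 30 day expiry header).
    if uri in files:
        return ("static", uri)
    # An existing directory falls through to the index directive.
    if uri in directories:
        return ("index", uri)
    # Everything else is rewritten to index.php and handed to PHP.
    return ("fastcgi", "/index.php?q=" + uri)
```

A WordPress permalink such as /2012/01/hello-world does not exist on disk, so it ends up at index.php with the original path in the q parameter.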

Improvements
My next step will be to put a Varnish server in front of nginx. Since the majority of my site traffic comes from search engine results, where the user is not registered on the site and does not need freshly rendered content, Varnish can step in and serve a fully cached version of my pages from memory far faster than FastCGI can render the WordPress code. I’ll experiment with this setup in the coming months and post my results.