Always Get Better

Posts Tagged ‘PHP’

Case-insensitive string comparison in PHP

Sunday, September 29th, 2013

This is a common situation: I needed to compare two strings with unknown capitalization – “country” versus “Country”. Since these words should be considered equal even though the second “Country” has a capital “C”, we can’t do a straight comparison on the two – we need to do a case-insensitive comparison.

Option 1: strcasecmp
Whenever possible developers should try to use built-in functions which are compiled code and (in general) run much faster than anything you could write. PHP has strcasecmp to case-insentively compare two strings. Sounds like a perfect match!


if ( strcasecmp('country','Country') != 0 ) {
// We have a match!
}

Option 2: strtolower
Always read the documentation, but draw your own conclusions. One commentator in the PHP documentation suggested developers never use strcasecmp, and use strtolower with regular equality like this:

if ( strtolower('country') === strtolower('Country') ) {
// We have a match
}

Test the Speed
Both methods accomplish the same thing, but do we really want to skip using strcasecmp()? Which is the better option?
I wrote this short script to run each option for 10 seconds and see which is faster:

And the results:

strtolower: Done 18440869 cycles in 10 seconds
strcasecmp: Done 22187773 cycles in 10 seconds

So strcasecmp has the edge speed-wise, but not so huge that I would care to favour one over the other.

Apparantly strcasecmp does not support multi-byte (e.g. Unicode) characters, but I haven’t tested this. Presumably that would give strtolower an advantage over projects dealing with non-English input, however that is not the case at all in my particular use case so I did not explore this avenue any further. I also didn’t try against non-ascii characters, such as latin accents; including those would be an improvement on this test.

log4php Performance

Wednesday, February 29th, 2012

We can take for granted that whenever we introduce a library or framework to our application, we incur an overhead cost. The cost varies depending on what we’re trying to do, but we generally accept that the lost performance is worth it for the increased maintainability, functionality or ease of use.

For many teams, logging is something that gets thrown in the mix at the last minute rather than through of all the way through. That’s a shame because a well-implemented logging solution can make the difference between understanding what is going on in your system and having to guess by looking at the code. It needs to be lightweight enough that the overall performance is not affected, but feature-rich enough that important issues are surfaced properly.

Java programmers have had log4j for a long time, and log4net is a similarly mature solution in the .NET world. I’ve been watching log4php for awhile and now that it has escaped the Apache Incubator it is impressively full-featured and fast. But how much do all its features cost?

Benchmarks
I’ll be looking into different options as I go, but let’s consider a very basic case – you append all of your events to a text file. I’ve created a configuration that ignores all ‘trace’ and ‘debug’ events so only events with a severity of ‘INFO’ or above are covered.

In 5 seconds, this is what I saw:

Test Iterations
BASIC (direct PHP) 45,421
INFO STATIC 45,383
INFO DYNAMIC 41,847
INFO STATIC (no check) 51,801
INFO DYNAMIC (no check) 47,756
TRACE STATIC 310,255
TRACE DYNAMIC 213,554
TRACE STATIC (no check) 271,043
TRACE DYNAMIC (no check) 196,653

Tests
What is all that? There are two ways to initialize the logger class – statically, meaning declared once and used again and again; and dynamically, meaning declared each time. With log4X, we typically perform a log level check first, for example isTraceEnabled() to determine whether to proceed with the actual logging work.

Results
I was surprised by how little log4php actually lost in terms of speed versus raw PHP. The authors have clearly done a thorough job of optimizing their library because it runs at 90% of the speed of a direct access.

I’ve always intuitively used loggers as static variables – initialize once and use over and over. This seems to be the right way by a huge margin.

Checking for the log level before appending to the log was a big win for the INFO messages, which are always logged to the file due to the configuration settings. The intended use is to allow programmers to sprinkle their code with debug statements which don’t get processed – and therefore slow down – the production code. I would be very happen with this in my project. In the INFO metrics, the check slowed things down a bit – explained because the actual logging function performs the same check – so we are taking a double hit. But wait, there is a good reason…

The TRACE metric is interesting – these are events which are NOT appended to the log. In that case, when the check is not performed, we pass through the code more times. When the check is performed, the code has to execute deeper on the call stack before it figures out we aren’t doing any actual logging, taking more time.

Conclusion
If you know you will be logging every single event, don’t do a check. Otherwise do the check – it will save a lot of wasted cycles.

Setting up WordPress with nginx and FastCGI

Monday, January 30th, 2012

All web site owners should feel a burning need to speed. Studies have shown that viewers waiting more than 2 or 3 seconds for content to load online are likely to leave without allowing the page to fully load. This is particularly bad if you’re trying to run a web site that relies on visitors to generate some kind of income – content is king but speed keeps the king’s coffers flowing.

If your website isn’t the fastest it can be, you can take some comfort in the fact that the majority of the “top” web sites also suffer from page load times pushing up into the 10 second range (have you BEEN to Amazon lately?). But do take the time to download YSlow today and use its suggestions to start making radical improvements.

I’ve been very interested in web server performance because it is the first leg of the web page’s journey to the end user. The speed of execution at the server level is capable of making or breaking the user’s experience by controlling the amount of ‘lag time’ between the web page request and visible activity in the web browser. We want our server to send page data as immediately as possible so the browser can begin rendering it and downloading supporting files.

Not long ago, I described my web stack and explained why I moved away from the “safe” Apache server solution in favour of nginx. Since nginx doesn’t have a PHP module I had to use PHP’s FastCGI (PHP FPM) server with nginx as a reverse proxy. Additionally, I used memcached to store sessions rather than writing to disk.

Here are the configuration steps I took to realize this stack:

1. Memcached Sessions
Using memcached for sessions gives me slightly better performance on my Rackspace VM because in-memory reading&writing is hugely faster than reading&writing to a virtualized disk. I went into a lot more detail about this last April when I wrote about how to use memcached as a session handler in PHP.

2. PHP FPM
The newest Ubuntu distributions have a package php5-fpm that installs PHP5 FastCGI and an init.d script for it. Once installed, you can tweak your php.ini settings to suit, depending on your system’s configuration. (Maybe we can get into this another time.)

3. Nginx
Once PHP FPM was installed, I created a site entry that would pass PHP requests forward to the FastCGI server, while serving other files directly. Since the majority of my static content (css, javascript, images) have already been moved to a content delivery network, nginx has very little actual work to do.


server {
listen 80;
server_name sitename.com www.sitename.com;
access_log /var/log/nginx/sitename-access.log;
error_log /var/log/nginx/sitename-error.log;
# serve static files
location / {
root /www/sitename.com/html;
index index.php index.html index.htm;

# this serves static files that exists without
# running other rewrite tests
if (-f $request_filename) {
expires 30d;
break;
}

# this sends all-non-existing file or directory requests to index.php
if (!-e $request_filename) {
rewrite ^(.+)$ /index.php?q=$1 last;
}
}

location ~ \.php$ {
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /www/sitename.com/html$fastcgi_script_name;
include fastcgi_params;
}
}

The fastcgi_param setting controls which script is executed, based upon the root path of the site being accessed. All of the requests parameters are passed through to PHP, and once the configuration is started up I didn’t miss Apache one little bit.

Improvements
My next step will be to put a varnish server in front of nginx. Since the majority of my site traffic comes from search engine results where a user has not yet been registered to the site or needs refreshed content, Varnish can step in and serve a fully cached version of my pages from memory far faster than FastCGI can render the WordPress code. I’ll experiment with this setup in the coming months and post my results.

Use a PHP Accelerator to Speed Up Your Website

Saturday, November 20th, 2010

I like PHP because it makes it really easy to quickly build a website and add functionality, and is generally lightning fast when executed without needing to wait for compilation as with ASP.NET or Java (Yes, we always pre-compile those languages prior to putting our applications into production, but with PHP we don’t even have to do that).

Even though compilation is very fast, it still has a resource and time cost, especially on high-traffic servers. We can improve our response times by more than 5x by pre-caching our compiled opcode for direct execution later. There are a few PHP accelerators which accomplish this for us:

Xcache
Xcache is my favourite and is the one I use in my own configurations. It works by caching the compiled PHP opcode in memory so PHP can be directly executed by the web server without expensive disk reads and processing time. Many caching schemes also use Xcache to store the results of PHP rendering so individual pages don’t need to be re-processed.

APC (Alternative PHP Cache)
APC is a very similar product to XCache – in fact XCache was released partially as a response to the perceived lag APC’s support for newer PHP versions. APC is essentially the standard PHP Accelerator – in fact, it will be included by default in PHP 6. As much as I like XCache, it will be hard to compete with built-in caching.

MMCache
Turck MMCache is one of the original PHP Accelerators. Although it is no longer in development, it is still widely used. An impressive feature of MMCache is its exporter which allows you to distribute compiled versions of your PHP applications without the source code. This is useful for those companies that feel they need to protect their program code when hosting in client environments.

eAccelerator
eAccelerator picked up where MMCache left off, and added a number of features to increase its usability as a content cache. Over time, the content caching features have been removed as more efficient and scalable solutions like memcache have allowed caches to be shared across web servers.

Keep Optimizing
One major consideration that often goes forgotten when optimizing website speeds that not all of your visitors will be using a high-speed connection; some users will be using mobile or worse connections, even for non-mobile sites. Every ounce of speed will reflect favourably on you and improve your retention rates – and ultimately get more visitors to your ‘call to action’ goals. I’ll go into more detail about bigger speed improvements we can make in a later post.