Always Get Better

Posts Tagged ‘MySQL’

Setting up WordPress with nginx and FastCGI

Monday, January 30th, 2012

All web site owners should feel a burning need for speed. Studies have shown that visitors who wait more than two or three seconds for content to load are likely to leave before the page finishes loading. This is particularly bad if you’re trying to run a web site that relies on visitors to generate some kind of income – content is king, but speed keeps the king’s coffers flowing.

If your website isn’t the fastest it can be, you can take some comfort in the fact that the majority of the “top” web sites also suffer from page load times pushing up into the 10 second range (have you BEEN to Amazon lately?). But do take the time to download YSlow today and use its suggestions to start making radical improvements.

I’ve been very interested in web server performance because it is the first leg of the web page’s journey to the end user. The speed of execution at the server level is capable of making or breaking the user’s experience by controlling the amount of ‘lag time’ between the web page request and visible activity in the web browser. We want our server to send page data as immediately as possible so the browser can begin rendering it and downloading supporting files.

Not long ago, I described my web stack and explained why I moved away from the “safe” Apache server solution in favour of nginx. Since nginx doesn’t have a PHP module I had to use PHP’s FastCGI (PHP FPM) server with nginx as a reverse proxy. Additionally, I used memcached to store sessions rather than writing to disk.

Here are the configuration steps I took to realize this stack:

1. Memcached Sessions
Using memcached for sessions gives me slightly better performance on my Rackspace VM because in-memory reads and writes are hugely faster than reads and writes to a virtualized disk. I went into a lot more detail about this last April when I wrote about how to use memcached as a session handler in PHP.
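For reference, the switch comes down to two php.ini directives, something like the following (this assumes the newer memcached extension; the older memcache extension uses a tcp:// prefix in save_path instead):

```ini
; store PHP sessions in memcached instead of on disk
session.save_handler = memcached
session.save_path = "127.0.0.1:11211"
```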

2. PHP FPM
The newest Ubuntu distributions have a package php5-fpm that installs PHP5 FastCGI and an init.d script for it. Once installed, you can tweak your php.ini settings to suit, depending on your system’s configuration. (Maybe we can get into this another time.)
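As a sketch, the pool configuration (under /etc/php5/fpm/pool.d/ on Ubuntu; exact paths and values vary by release, and the numbers below are illustrative) is where you tell FPM where to listen and how many workers to run:

```ini
; /etc/php5/fpm/pool.d/www.conf (illustrative values)
[www]
listen = 127.0.0.1:9000      ; must match fastcgi_pass in the nginx config
pm = dynamic
pm.max_children = 10         ; tune to available RAM
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
```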

3. Nginx
Once PHP FPM was installed, I created a site entry that would pass PHP requests forward to the FastCGI server, while serving other files directly. Since the majority of my static content (CSS, JavaScript, images) has already been moved to a content delivery network, nginx has very little actual work to do.


server {
    listen 80;
    server_name sitename.com www.sitename.com;

    access_log /var/log/nginx/sitename-access.log;
    error_log /var/log/nginx/sitename-error.log;

    # serve static files
    location / {
        root /www/sitename.com/html;
        index index.php index.html index.htm;

        # this serves static files that exist without
        # running the other rewrite tests
        if (-f $request_filename) {
            expires 30d;
            break;
        }

        # this sends all non-existing file or directory requests to index.php
        if (!-e $request_filename) {
            rewrite ^(.+)$ /index.php?q=$1 last;
        }
    }

    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /www/sitename.com/html$fastcgi_script_name;
        include fastcgi_params;
    }
}

The fastcgi_param setting controls which script is executed, based upon the root path of the site being accessed. All of the request parameters are passed through to PHP, and once this configuration was up and running I didn’t miss Apache one little bit.

Improvements
My next step will be to put a Varnish server in front of nginx. Since the majority of my site traffic comes from search engine results – visitors who are not logged in and don’t need freshly-generated content – Varnish can step in and serve a fully cached copy of my pages from memory far faster than FastCGI can render the WordPress code. I’ll experiment with this setup in the coming months and post my results.
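A minimal sketch of what that VCL might look like (assuming Varnish takes over port 80 and nginx moves to 8080 – ports and the cookie check are assumptions, not a tested setup):

```vcl
# Varnish listens on :80; nginx becomes the backend on :8080
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # let logged-in WordPress users bypass the cache entirely
    if (req.http.Cookie ~ "wordpress_logged_in") {
        return (pass);
    }
    # strip remaining cookies so anonymous page views are cacheable
    unset req.http.Cookie;
}
```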

Using Memcache with MySQL

Monday, April 18th, 2011

MySQL+Memcache have been bedfellows for a while and at this point are the de facto standard for highly-available, scalable websites. Even with other SQL and NoSQL solutions starting to become popular, this pair holds on as the winner for LAMP programmers. Is the complexity of working with this technology pair worth the investment?

Read vs Write
Traditional relational databases place the burden of computation on read operations. In mainframe environments with powerful servers and relatively few users, this made sense. Database normalization prevents redundancy, and data can be joined together when needed to produce the desired results.

In a web application with 1,000,000 users, the normalized transactional model does not perform well. Generally speaking, it is far faster to make two queries against a small subset of data than to attempt expensive joins in a client-facing web site.

Enter memcache: by storing the results of our SQL queries in memory, we speed up subsequent requests by pulling the data from memory and avoid hitting the database entirely, freeing it to process urgent or real-time requests.

Anatomy of an SQL Query
When we run an SQL query, we are actually asking the server to perform a lot of work:

  1. Break down the query into object references: The DBMS needs to understand which tables, columns, and filters you are using by tokenizing your SQL (breaking out names from keywords like SELECT, FROM, WHERE).
  2. Identify which indexes (if any) are most relevant to the data: This is harder with more complex queries, which must be broken down or which depend on outer tables for their values.
  3. Read the source tables from the hard drive: Most DBMS implementations include some kind of memory caching which partially avoids this expensive read step, but some disk I/O is a normal part of operation.
  4. Join columns: If we specify a join, especially a LEFT or RIGHT join, the DBMS has to create a pseudo table from the joined sources in memory before it can do any additional processing.
  5. Sub-selects: Any sub-select statements need to be processed. Depending on how the statement was written, this needs to be done for every row returned in the result set.
  6. Filter and sort: Anything in the ‘WHERE’ clause needs to be filtered out of the result set. This is where we start to see performance improvements from narrowing our result set.
  7. Aggregation: Once the database has its final result set, it can do all of the aggregation we ask of it, both calculations and grouping.

As we can imagine, this can be a time-consuming process. If it is repeated thousands of times in a short period, we will see significant performance loss.
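MySQL will happily show you how it plans to execute steps 2 through 4 for any given query via EXPLAIN (the table and column names below are illustrative):

```sql
-- ask the optimizer how it intends to execute this query
EXPLAIN
SELECT u.name, COUNT(o.id)
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2011-01-01'
GROUP BY u.name;
```

The `key` and `rows` columns of the output tell you which index (if any) was chosen and roughly how many rows each step must examine.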

Anatomy of a Cache Request
By contrast, when we perform a request for data from a cache like memcache, we do this:

  1. Check the index for the presence of the supplied key
  2. If a result exists, return it

In the case of memcache, this happens entirely in memory with no hit to the database whatsoever, resulting in a blindingly fast result set.

Speed Over Persistence
The reason memcache and MySQL work so well as a pair is that together they provide reliable, persistent transactional storage (through MySQL) along with lightning-fast data retrieval (through memcache), especially for rarely-changing results.

Memcache as MySQL’s Hero

Sunday, April 10th, 2011

It’s hard not to love memcache. As soon as you manage a web site that has more than a few concurrent visitors, the performance benefit of caching becomes immediately obvious. MySQL is a fast database and can outperform a lot of its competitors, but no matter how quickly it can pull results it can never outperform the retrieval speed of the server’s RAM.

The basic premise is: instead of pulling a model out of the database, see if it has already been loaded into memory by checking a key-value dictionary (for example: User5677). If the user has not been read from the database yet, the key will be absent, so we fetch the record from the database and store it in the cache. The next time we need that data, we check the key-value store again and avoid querying the database.
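That read-through pattern can be sketched in a few lines of PHP. ArrayCache below is a stand-in with the same get/set shape as the real Memcached client (all names here are illustrative, not a production implementation):

```php
<?php
// Stand-in for the Memcached client: get() returns false on a miss,
// set() stores a value. A real client would also honour the TTL.
class ArrayCache
{
    private $store = array();

    public function get($key)
    {
        return isset($this->store[$key]) ? $this->store[$key] : false;
    }

    public function set($key, $value, $ttl = 0)
    {
        $this->store[$key] = $value;
    }
}

// Fetch a user record, consulting the cache first ("User5677"-style keys).
function load_user($cache, $id, $queryDb)
{
    $key = 'User' . $id;
    $user = $cache->get($key);
    if ($user === false) {                 // cache miss: one database hit
        $user = call_user_func($queryDb, $id);
        $cache->set($key, $user, 300);     // keep for five minutes
    }
    return $user;
}
```

With the real extension you would swap ArrayCache for `new Memcached()` plus an `addServer()` call; the calling code stays identical. The invalidation rule from above lives here too: whenever the user record changes, `set()` (or delete) the same key.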

This really saves us whenever we have data that changes infrequently. Take, for example, an ecommerce website: since the products and categories on the site will change very rarely, it makes a lot of sense to store them in memory for fast recovery. Even more volatile information (like user data) can be stored in the cache, as long as the application knows to empty that cache key when the data gets changed.

Memcache is an ideal tool for managing these kinds of caches, and provides a lot of flexibility for growth.

History Lesson
Earlier this week I promised to go deeper into memcache’s origins. Memcache was originally developed at Danga Interactive as a way to reduce database load and improve the speed of LiveJournal.

Rather than developing a standalone server application, Danga’s engineers designed memcache to sit on lower-end hardware and on the web servers themselves, where it uses only a small amount of the overall memory. Memcache instances don’t talk to each other: the client machines are aware of all the memcache instances and distribute their writes evenly across them. This allows memcache to scale almost limitlessly without adding significant overhead to the caching process.
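A naive version of that client-side distribution looks like this (real clients use consistent hashing so that adding a server remaps as few keys as possible; this modulo sketch is for illustration only):

```php
<?php
// Pick which memcache server owns a key: hash the key, then take it
// modulo the server count. Every client doing the same math means
// every client agrees on where "User5677" lives -- no coordination needed.
function pick_server(array $servers, $key)
{
    $index = abs(crc32($key)) % count($servers);
    return $servers[$index];
}
```

The same key always maps to the same server, so reads find what writes stored without the servers ever talking to each other.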

When to Use
Quite simply: if you’re building an application for the LAMP stack, build in memcache support. When treated as a necessary component from the beginning, caching support adds almost zero overhead to development; however, it pays off as soon as real-world traffic starts coming to your site.

Drizzle – MySQL for the Cloud

Thursday, March 24th, 2011

Drizzle – a lightweight fork of MySQL – has reached general availability. Drizzle’s design goal is to create a high-performance, modular database engine tailored for cloud computing.

Some of the interesting features this database has to offer are:

  • No Views, Triggers, or Stored Procedures – I consider this a huge advantage. Stored Procedures were once a good thing, back when query optimizers were all but non-existent, but modern database systems perform this task extremely efficiently. Using stored procedures adds a second level of APIs to your application in exchange for what I would consider a rather dubious security benefit.
  • Sharding – The client protocol is designed to decide which database server to target based upon a hashing key. This is similar to how memcached handles its linear horizontal scaling – leaving this calculation to the client is a huge advantage for systems hosting a high number of concurrent visitors.
  • Gearman Support – This is a terrific tool for spreading workload across machines. Use Gearman to handle logging in Drizzle, keeping the actual DB server available for database work.

How Does Drizzle Compare to Percona?
Percona is a high-performance build of MySQL which promises to offer better performance than “stock” MySQL. It is built from a branch of publicly-available MySQL source code and enhanced by the folks at Percona. Percona maintains the same functionality as “real” MySQL, but attempts to achieve faster speeds.

How Does Drizzle Compare to MariaDB?
MariaDB is a separate database engine started by the original creator of MySQL and intended to replace MySQL under non-Oracle licensing. Anyone concerned with Oracle’s development practices or plans for the future of MySQL should consider switching to MariaDB. MariaDB implements all of the existing MySQL functionality, plus more, and provides a drop-in replacement for MySQL.

How to Move a WordPress Database

Saturday, April 4th, 2009

One of the most common requirements for web developers is the ability to switch code from development servers to live production environments. This can be tricky if you’re working with WordPress; moving the files is dead simple, but since WordPress uses canonical URLs you have to be careful if you are trying to transfer any of your database content.

Canonical URLs force your site to use the same base address (www.yourdomain.com rather than yourdomain.com). But if you are working in a development environment – where the development site’s address isn’t the same as your web site’s, but is rather something like 127.0.0.1 – you need to be able to point WordPress at the new address without bringing your site down.

Since I end up having to look for this information so often, here are the steps I use to accomplish this amazing feat:

Download The Database

From the shell prompt of your server, dump WordPress’ MySQL database into a backup file:

mysqldump --add-drop-table -uusername -ppassword databasename > mysqlbackup.DATE.sql

Move it over to the new server and run this command to overwrite your target:

mysql -udb##### -p -hinternal-db.s#####.gridserver.com db#####_dbname < mysqlbackup.DATE.sql

Update the Database Paths

Log into your MySQL database and issue this update command to ensure WordPress redirects to the new server:

UPDATE wp_options SET option_value = replace(option_value, 'http://www.old-domain.com', 'http://www.new-domain.com') WHERE option_name = 'home' OR option_name = 'siteurl';

Next update the post URLs:

UPDATE wp_posts SET guid = replace(guid, 'http://www.old-domain.com', 'http://www.new-domain.com');

Finally, update your posts’ content to fix any internal links:

UPDATE wp_posts SET post_content = replace(post_content, 'http://www.old-domain.com', 'http://www.new-domain.com');
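Assuming the default wp_ table prefix, a quick sanity check that the replacements took effect:

```sql
-- both rows should now show the new domain
SELECT option_name, option_value
FROM wp_options
WHERE option_name IN ('home', 'siteurl');
```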

That’s all! Repeat these steps when moving from production to development and vice versa.

As I said I typically search for this information whenever I need to move WordPress sites. I find the SQL queries at: http://www.mydigitallife.info/2007/10/01/how-to-move-wordpress-blog-to-new-domain-or-location/

Drupal Stuck at Database Configuration

Monday, November 24th, 2008

When configuring Drupal 6.6 on a Windows XP/Apache/MySQL box, I ran into an issue where I would enter the database information on the Database Configuration screen and press the button to advance, only to be redirected straight back to the Database Configuration screen.

The Drupal community indicates this is a problem with permissions – Drupal needs to be able to write to your site’s settings.php file.  All permissions appeared to be correct in my setup but I was still unable to continue.

The solution was to edit the settings.php file, putting in my database information manually.  Just look for this line:

$db_url = 'mysql://username:password@localhost/databasename';

And change the username, password and databasename parts.

Then return to the Database Configuration screen, enter the information again and continue.  The correct database information will be read from the settings file and the configuration will continue to the next step.

Happy hunting!