WordPress to_ping Query Optimization

The WordPress team has taken up the issue of performance optimization pretty seriously, and I look forward to the fruits of their efforts, but I’m also casting a critical eye on my own code. Thanks to caching and a hugely optimized query architecture, Scriblio is now performing better than ever, and I’m now looking at the next tier of problems.

First among them is a WordPress query that runs to find which posts have pingbacks or trackbacks waiting to be processed. As it was originally written, it looked like this:

``` $trackbacks = $wpdb->get_results(" SELECT ID FROM $wpdb->posts WHERE CHAR_LENGTH(TRIM(to_ping)) > 7 AND post_status = 'publish' "); ```

The problem is that the query requires a full table scan, and it can’t be optimized simply by creating an index on to_ping because CHAR_LENGTH(TRIM(to_ping)) > 7 requires inspecting every row and applying the functions before the result can be tested. As written, the query could take as long as 15 minutes to execute on a busy table of 350,000 rows.

Once I read through the code and determined that it’s very unlikely that cruft would be left in to_ping, or that any that was there would get removed once processed through the ping functions, I decided to try the query below. (Hat tip also to Andy Skelton for sanity checking my reading of the code.)

``` $trackbacks = $wpdb->get_results(" SELECT ID FROM $wpdb->posts WHERE to_ping <> '' AND post_status = 'publish' "); ```

A trac ticket has been created and patch added, but despite the optimized query, a table as large as I’m working with probably needs an index to keep things running smoothly. So here’s what I’ve got:

KEY ping_status ( to_ping(1), post_status )

I’ve not yet confirmed that this is the most optimal index, but queries now take a fraction of a second to complete.