databases

Calculating Distance Between Points In MySQL

MySQL has some powerful, and perhaps underused spatial extensions, but the most interesting functions are still unimplemented: “Note: Currently, MySQL does not implement these functions…”

Among those as-yet unimplemented functions is DISTANCE(). Alternatives can be found here and here, though neither is clean or simple. I wonder if a simple MBRContains() is good enough, though…

MySQL On Multi-Core Machines

The DevShed technical tour explains that MySQL can spawn new threads, each of which can execute on a different processor/core. What it doesn’t say is that a single thread can only execute on a single core, and if that thread locks a table, then no other threads that need that table can execute until the locking thread/query is complete. Short answer: MySQL works well on multi-core machines until you lock a table.

Apache, MySQL, and PHP on MacOS X

p0ps Harlow tweeted something about trying to get an AMP environment running on his Mac. Conversation followed, and eventually I sent along an email that look sorta like this: If you’re running 10.4 (I doubt it, but it’s worth mentioning because I’m most familiar with it), here’s how I’ve setup dozens of machines for web […] » about 300 words

WordPress + Invalid URLs = Extra Database Queries

After reporting weirdness last week I finally sat down with a completely clean and virgin install of WordPress 2.3.2 and traced what happens when you make a permalink request for a non-existent URL. Here are two sets of URLs to use as examples and context: These are valid URLs: http://site.org/archives/101 http://site.org/page-name These are _not_ valid […] » about 400 words

Bits Of MySQL Query Syntax I’ve Learned This Week

Watching the WordPress hacker list this week, a couple messages related to selecting information about users schooled me on MySQL syntax. I obviously knew the following would work, but I’d previously used the UNION syntax in similar situations and somehow hadn’t thought of writing it this way:

``` SELECT (SELECT meta_value FROM wp_usermeta WHERE meta_key = 'first_name' AND user_id = 2) AS FIRST, (SELECT meta_value FROM wp_usermeta WHERE meta_key = 'last_name' AND user_id = 2) AS LAST, wp_users.* FROM wp_users WHERE wp_users.ID = 2 ```

That’s much cleaner to my thinking, though I’ve no idea which is more optimal. When somebody replied asking for a solution that would work in pre-MySQL 5, this was the response:

``` SELECT ID, user_login, FIRST.meta_value AS fname, LAST.meta_value AS lname FROM wp_users LEFT JOIN wp_usermeta AS FIRST ON (wp_users.ID = FIRST.user_id AND FIRST.meta_key = 'first_name') LEFT JOIN wp_usermeta AS LAST ON (wp_users.ID = LAST.user_id AND LAST.meta_key = 'last_name') ```

And the lesson to me here is that I didn’t realize the syntax allowed us to match multiple conditions for the JOIN. Makes sense, but I just hadn’t thought of it. Thanks go to Phil Williams and Otto for tipping me to these.

Easy MySQL Performance Tips

Yes, I’m still trying to squeeze more performance out of MySQL. And since small changes to a query can make a big difference in performance…

Here are two really easy things to be aware of:

  • Never do a COUNT(*) (or anything *, says Zach). Instead, replace the * with the name of the column you’re searching against (and is hopefully indexed). That way some queries can execute entirely in the keycache (while * forces MySQL to read every matching row from the table).
  • When joining two large tables, but only searching against one, put the join statement at the end. Why join the two entire tables when you only have to join the matching rows?

I mention these because, well, I’ve known them forever, but upon seeing them again I realized I hadn’t really obeyed those simple rules in some of my queries.

Separately, there’s some pretty good info on what server variables affect what at mysqlperformanceblog too.

MySQL Error 28: Temp Tables And Running Out of Disk Space

Bam: MySQL error 28, and suddenly my queries came to a stop. Error 28 is about disk space, usually the disk space for temp tables. The first thing to do is figure out what filesystem(s) the tables are on. SHOW VARIABLES LIKE “%dir%” will return a number of results, but the ones that matter are […] » about 300 words

MySQL Fulltext Tips

Peter Gulutzan, author of SQL Performance Tuning, writes in The Full-Text Stuff That We Didn’t Put In The Manual about the particulars of word boundaries, index structure, boolean searching, exact phrase searching, and stopwords, as well as offering a few articles for further reading (Ian Gilfillan’s “Using Fulltext Index in MySQL”, Sergei Golubchik’s “MySQL Fulltext Search”, Joe Stump’s “MySQL FULLTEXT Searching”). It’s one of a number of articles in the MySQL Tech Resources collection.

Catching Bugs Before They Catch You

I got itchy about magic quotes the other day because it’s the cause (through a fairly long cascade of errors) of some performance problems and runaways I’ve been seeing lately (pictured above). But I deserve most of the blame for allowing a query like this to run at all: SELECT type, data, count(*) AS hits […] » about 300 words

T2000 Unboxed And Online

My Sun T2000 is here, and with Cliff‘s help it’s now patched, configured, and online. (Aside: what’s a Sun Happy Meal?) I’ll second Jon‘s assessment that Sun really should put some reasonable cable adapters in the box, as the the bundle of adapters necessary to make a null modem connection to the box is ridiculously […] » about 200 words

WordPress Baseline Changes To Support WPopac

I’ve whittled things down to the point where the only baseline change from WordPress 2.0.2 is in the next_posts_link function of the wp-includes/template-functions-links.php file. The change is necessary because WPopac rewrites the SQL search queries in a way that’s incompatible with a piece of this function, but necessary for performance reasons.

Here’s how my version reads:

`

function next_posts_link($label='Next Page »', $max_page=0) {
	global $paged, $result, $request, $posts_per_page, $wpdb, $max_num_pages;
	if ( !$max_page ) {
			if ( isset($max_num_pages) ) {
				$max_page = $max_num_pages;
			} else {
				preg_match('#FROM\s(.*)\sGROUP BY#siU', $request, $matches);

				// added April 5 2006 by Casey Bisson to support WPopac
				// necessary because the preg_match above fails with some queries
				if(!$fromwhere)
					$fromwhere = $wpdb->posts;
				
				// changed April 5 2006 by Casey Bisson to speed the query by eliminating
				// the slow DISTINCT clause
				//$numposts = $wpdb->get_var(“SELECT COUNT(DISTINCT ID) FROM $fromwhere”);
				$numposts = $wpdb->get_var(“SELECT COUNT(*) FROM $fromwhere”);
				$max_page = $max_num_pages = ceil($numposts / $posts_per_page);
			}
	}
	if ( !$paged )
		$paged = 1;
	$nextpage = intval($paged) + 1;
	if ( (! is_single()) && (empty($paged) || $nextpage < = $max_page) ) {
		echo '<a href=“';
		next_posts($max_page);
		echo '”>'. preg_replace('/&([^#])(?![a-z]{1,8};)/', '&$1', $label) .'</a>';
	}
}

`

More bsuite Hacking

Update: bugfix release b2v6 available. Some conversations with Chow Kah Soon, who’s site is full of diversions from work , finally convinced encouraged me to solve some small problems that were giving him big trouble. Chow Kah Soon is in the lucky, but rare, position of having over 20,000 unique daily visitors to his site, […] » about 400 words

MySQL’s Slow Query Log

Zach suggested it last week, but it’s only now that I’ve gotten around to setting up MySQL’s slow query log.

It’s easy enough, you’ve just got to put a couple lines like this in your my.cnf (which is in /etc on my server):

log-slow-queries = /var/log/mysql/mysql-slow.log<br /> long_query_time = 10

This should get most people running, but this story in Database Journal offers a few more details. Potentially more useful is this guide to query and index optimization (though it’s probably a little out of date).

Performance Optimization

A couple notes from the past few days of tweaks and fixes:

  • Hyper-threading has a huge effect on LAMP performance.
     
  • From now on, I’ll have bad dreams about running MySQL without Query Caching in the way that I used to have nightmares about going to school wearing only my underwear. The difference is that big.
     
  • WordPress rocks, but it has some queries that will kill large databases. I’m playing with baseline when I fix ’em, but it’s worth it.
     
  • Being highly ranked for searches related to bears and bear lovers on the weekend that Grizzly Man airs on TV is both good and bad: it doubled my previous high for daily page-loads, but killed my server and taught me some lessons about sloppy coding.
     
  • The support techs are Lunarpages continue to pretty much rock.
     
  • Having friends who cut their teeth on high-performance PHP/MySQL and are willing to spend the night workshopping it with you is indispensable.
     
  • It’s hard to beat the calming beauty of driving home through a snowfall on back roads at 2 AM carried by your life’s soundtrack.
     

Plesk Bites

I picked Plesk over CPanel as my server control panel because it was cheaper, looked better, and seemed to have all the features I wanted. What I didn’t know was that it came with PHP4 and MySQL3 at times when each was a major version ahead of that. When the good folks at my hosting provider tried to upgrade this, it conflicted with Plesk and they have to back off.

The answer, it seemed, was that I’d have to migrate from Plesk to CPanel to get those features. And now I’ve got a big database project, that’s looking more necessary than ever.

Why? Because MySQL 3.x doesn’t support query caching, boolean full-text searching, or complex subqueries.

In a simpler world, everything would be up to date and working, but in this world I’m trying to find a convenient time to migrate my stuff to CPanel.

Learning: MySQL Optimization

I have over 1000 posts here at MaisonBisson, but even so, the table with all those posts is under 3MB. Now I’ve got a project with 150,000 posts — yes, 150,000 posts! — and the table is about 500MB. An associated table, structured sort of like WP’s postsmeta, has over 1.5 million records and weighs in at over 100MB (not including the 150MB of indexes).

Up to now I’ve been a “throw more hardware at it” sort of guy — and in a server with only 1GB of RAM, that’s probably the best solution — but I also think it’s time I learned some MySQL optimization tricks. Zach‘s been pushing me to get mytop installed for quite a while, and I finally got around to it.

mytop is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL 3.22.x, 3.23.x, and 4.x server. It runs on most Unix systems (including Mac OS X) which have Perl, DBI, and Term::ReadKey installed. And with Term::ANSIColor installed you even get color. If you install Time::HiRes, you’ll get good real-time queries/second stats.

More NEASIS&T Buy Hack or Build Followup

First, Josh Porter, the first speaker of the day has a blog where he’s posted his presentation notes and some key points. Josh spoke about Web 2.0, and ended with the conclusion that successful online technologies are those that best model user behavior. “I think Web 2.0 is about modeling something that already exists in our offline worlds, mostly in the spoken words and minds of humankind.”

Interestingly, in findability terms, it was Josh’s post that clued me in that the event podcast was online because he linked to my blog in his post. Lesson: links make things findable.

Like Josh, I found my voice a little unfamiliar, but you can listen here (51MB) if that’s your thing.

Also, I demoed some features I’d like to see in a future OPAC, but to help people visualize them, I finally put together a graphical mockup of them here.

NEASIS&T Buy, Hack or Build Followup

I was tempted to speak without slides yesterday, and I must offer my apologies to anybody trying to read them now, as I’m not sure how the slides make sense without the context of my speech. On that point, it’s worth knowing that Lichen did an outstanding job liveblogging the event, despite struggling with a […] » about 600 words

I Will Crush You

Or, er, my server will be crushed. I guess I should admit that my stuff could do with some optimization, maybe. Perhaps what I really need is something faster than Celeron with 512MB RAM. Maybe. tags: 512mb ram, apache, break point, breaking point, celeron, crushed, load average, mysql, php, server, top, web server » about 100 words