Easy MySQL Performance Tips

Yes, I’m still trying to squeeze more performance out of MySQL. And since small changes to a query can make a big difference in performance…

Here are two really easy things to be aware of:

  • Never do a COUNT(*) (or anything *, says Zach). Instead, replace the * with the name of the column you’re searching against (which is hopefully indexed). That way some queries can execute entirely in the key cache, while * forces MySQL to read every matching row from the table.
  • When joining two large tables, but only searching against one, put the join statement at the end. Why join the two entire tables when you only have to join the matching rows?
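One way to read the second tip is to filter the searched table in a derived table first and join only the surviving rows. The sketch below is illustrative only: it uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere, and the users and orders tables are hypothetical. It only shows that both query shapes return the same rows. Whether the rewrite is actually faster depends on the MySQL version and its optimizer, so compare the EXPLAIN output on your own data.

```python
import sqlite3

# Hypothetical schema and data, for illustration only (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
    CREATE INDEX idx_orders_status ON orders (status);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders VALUES
        (10, 1, 'open'), (11, 1, 'shipped'),
        (12, 2, 'open'), (13, 3, 'shipped');
""")

# Shape 1: join the two tables, then filter in the WHERE clause.
join_then_filter = """
    SELECT o.id, u.name
    FROM orders AS o
    JOIN users AS u ON u.id = o.user_id
    WHERE o.status = 'open'
    ORDER BY o.id
"""

# Shape 2: filter the searched table first, then join the matching rows last.
filter_then_join = """
    SELECT o.id, u.name
    FROM (SELECT id, user_id FROM orders WHERE status = 'open') AS o
    JOIN users AS u ON u.id = o.user_id
    ORDER BY o.id
"""

rows_a = conn.execute(join_then_filter).fetchall()
rows_b = conn.execute(filter_then_join).fetchall()
print(rows_a == rows_b, rows_a)
```

Both queries are logically equivalent, so an optimizer is free to run them the same way. The derived-table form just makes the intended order explicit.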

I mention these because, well, I’ve known them forever, but upon seeing them again I realized I hadn’t really obeyed those simple rules in some of my queries.

Separately, there’s some pretty good info on what server variables affect what at mysqlperformanceblog too.


5 thoughts on “Easy MySQL Performance Tips”

  1. Unfortunately you are incorrect with regards to count(*); Zach’s article is correct.

    Why? count(*) is optimised by MySQL. With no where condition, count star is answered from the table descriptor without reading any table records; with a where condition, an index is used if possible. Doing a count on an appropriately indexed column is fine too, as Zach notes. However, there is a risk that, should the choice of indices change in the future, the counted column is no longer optimal, and there is then a big hit that may go unnoticed. So count(*) SHOULD be used.

    It is fair to say that selecting * is generally a bad idea for queries that return rows, as it most often returns more data than is needed, taking more time in the process. For non-performance-critical queries, using * is arguably OK and has the advantage of reduced maintenance.

    Using * during early stages of development may be justified on the basis that it may speed development if the database schema is still in flux. Doing so avoids the need to revise queries changing column names, deleting unused columns and adding new columns or ones that were forgotten. Once stable and before an application is released and finally tested, a code review should tidy up the queries.

    Nick

  2. True, Nick. Sometimes * will be much faster, like here. A small sample shows the index can be used for the count and the data does not have to be touched at all.

    mysql> explain select timestamp,count(*) from log_entry where timestamp = 2454944;
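    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+--------------------------+
    | id | select_type | table     | type | possible_keys | key       | key_len | ref   | rows    | Extra                    |
    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+--------------------------+
    |  1 | SIMPLE      | log_entry | ref  | multi_one     | multi_one | 5       | const | 1616103 | Using where; Using index |
    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+--------------------------+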
    1 row in set (0.00 sec)

    mysql> explain select timestamp,count(id) from log_entry where timestamp = 2454944;
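    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
    | id | select_type | table     | type | possible_keys | key       | key_len | ref   | rows    | Extra       |
    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
    |  1 | SIMPLE      | log_entry | ref  | multi_one     | multi_one | 5       | const | 1616103 | Using where |
    +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+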
    1 row in set (0.00 sec)
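The difference between the two plans above ("Using index" versus no covering read) can be reproduced in miniature. This is a rough sketch under loud assumptions: it uses Python's built-in sqlite3 and SQLite's EXPLAIN QUERY PLAN, not MySQL's EXPLAIN, and the log_entry table here is a hypothetical stand-in with a single-column index on timestamp. The wording of the plan text is SQLite's, so treat it as an analogy for the MySQL output, not a copy of it.

```python
import sqlite3

# Hypothetical stand-in for the log_entry table (SQLite, not MySQL).
# id is deliberately an ordinary column, so it is NOT part of the index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log_entry (id INTEGER, timestamp INTEGER, message TEXT);
    CREATE INDEX multi_one ON log_entry (timestamp);
""")
conn.executemany(
    "INSERT INTO log_entry VALUES (?, ?, ?)",
    [(i, 2454944 + i % 3, "entry %d" % i) for i in range(300)],
)

def plan(sql):
    """Return the human-readable detail text of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# count(*) needs nothing beyond the indexed column, so the index alone suffices.
plan_star = plan(
    "SELECT timestamp, count(*) FROM log_entry WHERE timestamp = 2454944"
)
# count(id) must check id for NULL, and id lives only in the table rows.
plan_id = plan(
    "SELECT timestamp, count(id) FROM log_entry WHERE timestamp = 2454944"
)
print(plan_star)
print(plan_id)
```

In this sketch the count(*) plan reports a covering index, while the count(id) plan uses the same index only to locate rows and then reads each row from the table.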

  3. I just ran a simple check and can confirm. Selecting count(*) and count(id) (id is a key) takes about the same time. Selecting count(name), where `name` is not indexed, is far slower.

    Also note that selecting count(fieldname) doesn’t necessarily yield the number of rows in the table, just the number of rows where fieldname is not null. So, naturally, there’s some additional data evaluation going on.

    (root@localhost:test) desc big;
    +-------+------------------+------+-----+---------+----------------+
    | Field | Type             | Null | Key | Default | Extra          |
    +-------+------------------+------+-----+---------+----------------+
    | id    | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
    | name  | varchar(255)     | YES  |     | NULL    |                |
    +-------+------------------+------+-----+---------+----------------+

    (root@localhost:test) select count(*) from big;
    +----------+
    | count(*) |
    +----------+
    | 10645312 |
    +----------+
    1 row in set (0.00 sec)

    (root@localhost:test) select count(id) from big;
    +-----------+
    | count(id) |
    +-----------+
    |  10645312 |
    +-----------+
    1 row in set (0.00 sec)

    (root@localhost:test) select count(name) from big;
    +-------------+
    | count(name) |
    +-------------+
    |    10645312 |
    +-------------+
    1 row in set (4.29 sec)

    (root@localhost:test) update big set name=NULL where name > 'z';
    Query OK, 304167 rows affected (32.82 sec)
    Rows matched: 304167 Changed: 304167 Warnings: 0

    (root@localhost:test) select count(name) from big;
    +-------------+
    | count(name) |
    +-------------+
    |    10341145 |
    +-------------+
    1 row in set (4.27 sec)
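The NULL behaviour in the session above is standard SQL, so it can be reproduced at toy scale. This is a minimal sketch, assuming Python's built-in sqlite3 in place of MySQL and a five-row hypothetical version of the big table. It demonstrates only the counting semantics, not the timings.

```python
import sqlite3

# Tiny hypothetical version of the `big` table (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO big (name) VALUES (?)",
    [("alpha",), ("mike",), ("zulu",), ("zebra",), ("yankee",)],
)

def counts():
    """Return (count(*), count(id), count(name)) for the table."""
    sql = "SELECT count(*), count(id), count(name) FROM big"
    return conn.execute(sql).fetchone()

before = counts()
# Same statement as in the session: null out every name sorting after 'z'.
conn.execute("UPDATE big SET name = NULL WHERE name > 'z'")
after = counts()
print(before, after)
```

After the update, count(*) and count(id) still report every row, while count(name) drops by the number of rows whose name became NULL.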
