Sunday, 14 October 2007

mysql - ENUMs and sql_mode

I've been doing a few experiments with ENUMs and SQL_MODE.

Not a lot of people realise that in MySQL, an ENUM can actually contain one other value as well as the specified ones - no, not NULL, but ANOTHER value, specifically, the empty string ''.

This can lead to trouble. The empty string is entered by MySQL when it tries to insert an invalid value into the column. It does also give a warning, but nobody takes any notice of those right?

Demo schema:


CREATE TABLE enumtest (
id int not null auto_increment,
name varchar(100) NOT NULL,
status ENUM('ok','broken','decommissioned','narnia'),
PRIMARY KEY(id),
KEY(name),
KEY(status)
);


Now let's try some inserts...


INSERT INTO enumtest (name,status) VALUES
('Mark','ok'),
('Fred','broken'),
('Bob','decommissioned');


So far so good. Now how about:


mysql> insert into enumtest (name,status) VALUES ('bust','wibble');
Query OK, 1 row affected, 1 warning (0.00 sec)


Ok, so it inserted something, and we got a warning. What's there now?


SELECT * FROM enumtest;
+----+------+----------------+
| id | name | status |
+----+------+----------------+
| 1 | Mark | ok |
| 2 | Fred | broken |
| 3 | Bob | decommissioned |
| 4 | bust | |
+----+------+----------------+
4 rows in set (0.00 sec)


So what's happened is, we've attempted to insert an invalid value (wibble) and it's put an empty string in instead (empty string is not a null, as we defined the column NOT NULL).

So what's the fix? SET sql_mode to traditional:


mysql> SET SQL_MODE='TRADITIONAL';
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO enumtest (name,status) VALUES ('breakme','wibble');
ERROR 1265 (01000): Data truncated for column 'status' at row 1


Much better. We can now actually get ENUMs that only contain values they're supposed to. Good.

Conclusion: use sql_mode TRADITIONAL whenever possible.

Saturday, 6 October 2007

MySQL: delay_key_write is good

If you're using MySQL with the MyISAM engine for a high-writes application, delay_key_write is usually very good.

Let me explain further: delay_key_write is a table option which causes the database NOT to flush the MyISAM key file after every write. This is a really good thing, as if you're doing another write very soon anyway, this is likely to just waste I/O time.

This doesn't sound like a good idea right, because it means that if the power fails (or mysql crashes, or something), then you'll be left with a broken index file? No, it's still a very good idea:
  • delay_key_write does NOT appear to affect the MyISAM data file - that will still be flushed according to the normal policy.
  • If you had delay_key_write off, then a power failure or crash during the index write would cause the same level of corruption.
  • Broken MyISAM index files need to be rebuilt, regardless of how little or how much "brokenness" they actually have.
How do you turn delay_key_write on?

It's a table option, so you have to do it at create time. It's possible to do ALTER TABLE, but that behaves how it normally does under MySQL - i.e. it rebuilds the entire table, just to change one flag. This is of course necessary if you're changing the schema, but delay_key_write shouldn't - sadly, it does.

This means that on a large existing table, the ALTER TABLE needs to be done with care (i.e. out-of-hours or during scheduled down time, if at all).

Another option is to set the variable delay_key_write to "ALL" - however, this is not necessarily a good idea as it applies to all tables (in the entire server) whether you want it to or not.

On infrequently written tables, delay_key_write could increase the chance of index corruption.

Our performance testing cycle for delay_key_write is now over - quite frankly we're amazed by how much performance it gives us for so little effort - it will definitely be going into production.

Our servers are being hurt with heavy write-load which is keeping (at least one thread of) mysql almost permanently in I/O-wait - delay_key_write makes almost all of this go away making inserts much quicker and leaving the server more time for other things. It's likely to become essential once our server workload grows even higher.