Saturday 29 December 2007

On web application configuration

There are two common approaches for configuring web applications:

  • Configuration file(s) in the web application's directory
  • Storing configuration data in the database

What sort of data am I talking about:

  • Email setup (where to send response form, error emails etc, whether to send mail to customers etc)
  • Setup of things which need to vary between development and production - payment service provider configuration, setup of other third party integration things
  • Error handler configuration (whether to display errors, where to log them, email them etc)
  • Enabling / disabling features (for example, debug things)
  • Visual stuff or branding (labels etc)

Clearly the database connection string needs to be stored somewhere too, but that can't be in the database for obvious reasons.

Database configuration

pros: does not need to be configured on each web server of a farm; not accidentally overwritten by new deployments
cons: bad for hierarchical or structured data; performance overhead; difficult to maintain configuration data stored in a database in SCM. Modifications require apply/ rollback scripts

Configuration files

pros: Easy to manage in SCM (like all your other files, even if not in the same place); less runtime performance overhead; better for hierarchical data; database connection string can be held with other items.
cons: Needs to be synchronised across a web farm; care should be taken that the right versions are always deployed everywhere.

Ideas

For a larger application with a web farm, using a database for some configuration is clearly beneficial. If you have many servers and/or servers of other types (non-web servers), they can store all their configuration in a central database. If performance is a problem, copies can be cached locally.

For a small simple application, configuration files are better- they're more manageable generally. I prefer to put most configuration in a file if possible.

Monday 3 December 2007

mysql dumps are not text files

Today's pearl of wisdom:

MySQL dump files, as produced by mysqldump, are not text files.

A common mistake is to think that mysqldump files are text files. While they superficially look like text files, they in fact are not. This leads to confusion and problems if people try to edit them in a text editor application, which will in many cases cause corruption.

MySQL dump files contain the SQL commands required to recreate the contents of a database (or subset thereof). However, these are SQL commands in arbitrary (binary) encoding and can consist of a mixture of different encodings (e.g. if you have BLOBs). This mixture is not safe to edit with a text editor, which expects a text file to contain character data in exactly one encoding (and will typically guess, ask the user or just use its default to determine which).

Suppose you use utf8 data in your database - then your mysqldump file will probably contains all of its strings in utf8. This is fine, until you get to a BLOB. This is a binary lump of data and will be full of sequences of bytes which are NOT LEGAL in utf8, therefore any editor attempting to make sense of them in utf8 won't succeed. It will therefore be forced to change them into something else (i.e. different from the original BLOB data) in order to load them into its buffers.

Saving this modified (or even unmodified) file back out will result in loss of data.

Summary

  • MySQL dump files aren't text files even though they look like them

  • Editing one in a text editor will probably lose data

  • They are often also too big for a text editor

  • And have lines too long