Andre Borie

Adding DB columns in Django without downtime

Let's say we have the following model and we have the current version of the application already running and writing to it:

class LogRecord(models.Model):
    timestamp = models.DateTimeField(auto_now_add=True)
    message = models.TextField()

We'd like to add a severity column to it in a zero-downtime fashion, meaning we will not interrupt the currently-running app (it is running some mysterious and important work and cannot be interrupted):

class LogRecord(models.Model):
    timestamp = models.DateTimeField(auto_now_add=True)
    message = models.TextField()
    severity = models.IntegerField(default=0)

Now, the sudden presence of the extra column won't be a problem for the existing instances reading records - the ORM will silently ignore them and your application-level code won't even be aware of it.

Writing records however, will fail - the database now expects every INSERT statement to include a value for that new column, but the existing app version is unaware of it.

Now you might think the default=0 already covers us, but default only applies at the Django level; when you instantiate a model, the field will take on that default value in Python, and so when you write to the DB this value is persisted there.

This does not happen at the database level, if your app does not set a value for this field (our existing instances won't, as they don't have the new code yet), the database will reject the write.

If we look at the SQL that would be issued, we can see it clearly:

$ ./manage.py sqlmigrate demoapp 0002
BEGIN;
--
-- Add field severity to logrecord
--
ALTER TABLE "demoapp_logrecord" ADD COLUMN "severity" integer DEFAULT 0 NOT NULL;
ALTER TABLE "demoapp_logrecord" ALTER COLUMN "severity" DROP DEFAULT;
COMMIT;

(the DEFAULT ... DROP DEFAULT is just used to backfill existing rows with the value and is ultimately irrelevant as it ends up being dropped)

If we applied this, existing instances of the app would immediately explode with database errors while trying to write to this table.

There are a few ways to solve this problem:

Nullable column

One solution is to make the column nullable at the beginning. The database will tolerate INSERTs missing it (from the existing version of the app). The process in this case would be as follows:

  1. Deploy new app version with the nullable column. This creates the column in the database; earlier app version continue as normal.

  2. Wait for all old instances to exit. At this point we have nothing that relies on the column being nullable, so we can start doing something about it. But we still have null values in there from old rows, so we need to take care of that too.

  3. Make a new app version where you backfill the existing null rows to a non-null value (using a data migration), make the field non-nullable at the model level, and deploy that.

This is workable, but we needed at least 2 deploys before the column is fully populated and usable. Can we do better?

Database defaults

We could set a database-level default, so that it will tolerate INSERTs from our existing app instances that lack the value and use the default instead.

Django >= 5

In Django 5 the db_default parameter was introduced which makes this easy:

severity = models.IntegerField(default=0, db_default=0)

The generated SQL is as follows:

$ ./manage.py sqlmigrate demoapp 0002
BEGIN;
--
-- Add field severity to logrecord
--
ALTER TABLE "demoapp_logrecord" ADD COLUMN "severity" integer DEFAULT 0 NOT NULL;
COMMIT;

We can deploy this and we're done. The column is fully populated and usable from a single deploy.

We can go around and remove the db_default in a subsequent migration if we want to, but it should not affect anything in normal operation (I can see an edge-case where if we erroneously set severity to None on a model instance and try to save that, it will succeed, silently taking on the DB default, instead of failing and highlighting the error).

Django < 5

If we're unlucky and on earlier versions, we will need to replicate this with a custom migration:

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='logrecord',
            name='severity',
            field=models.IntegerField(default=0),
        ),
        migrations.RunSQL(
            sql="""ALTER TABLE demoapp_logrecord ALTER COLUMN severity SET DEFAULT 0""",
            reverse_sql=migrations.RunSQL.noop,
        ),
    ]

This gives us the following SQL:

./manage.py sqlmigrate demoapp 0002
BEGIN;
--
-- Add field severity to logrecord
--
ALTER TABLE "demoapp_logrecord" ADD COLUMN "severity" integer DEFAULT 0 NOT NULL;
ALTER TABLE "demoapp_logrecord" ALTER COLUMN "severity" DROP DEFAULT;
--
-- Raw SQL operation
--
ALTER TABLE demoapp_logrecord ALTER COLUMN severity SET DEFAULT 0;
COMMIT;

This leaves us with the same outcome the modern db_default option. We can deploy this and have a fully-populated column.

Note that in this case, Django's migration graph is unaware of this default we set manually; I don't believe it would be an issue (I think the only thing touching DB defaults pre-Django 5 is the initial column backfill upon adding one which doesn't apply here), but just in case I would recommend undoing it (and bringing back the DB state inline with Django's expectations) in a separate migration once all old instances of the app are terminated:

$ ./manage.py makemigrations --empty --name drop_severity_db_default demoapp
Migrations for 'demoapp':
  demoapp/migrations/0003_drop_severity_db_default.py

In 0003_drop_severity_db_default.py:

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0002_logrecord_severity'),
    ]

    operations = [
        migrations.RunSQL(
            sql="""ALTER TABLE demoapp_logrecord ALTER COLUMN severity DROP DEFAULT""",
            reverse_sql="""ALTER TABLE demoapp_logrecord ALTER COLUMN severity SET DEFAULT 0""",
        ),
    ]

django-syzygy

Finally the best option is to use a third-party library which will automate this busywork for us. It overloads the migration-related commands to automatically generate multiple migrations (pre-deploy and post-deploy) when needed, and gives us separate commands to run them at the appropriate stages of our deployment process.

If we try our example above with syzygy installed, we get:

$ ./manage.py makemigrations
Migrations for 'demoapp':
  demoapp/migrations/0002_logrecord_severity.py
    + Add field severity to logrecord
  demoapp/migrations/0003_drop_db_default_logrecord_severity.py
    ~ Drop database DEFAULT of field severity on logrecord

0002_logrecord_severity.py:

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='logrecord',
            name='severity',
            field=models.IntegerField(db_default=0, default=0),
        ),
    ]

0003_drop_db_default_logrecord_severity.py:

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0002_logrecord_severity'),
    ]

    operations = [
        syzygy.operations.AlterField(
            model_name='logrecord',
            name='severity',
            field=models.IntegerField(default=0),
            stage=syzygy.constants.Stage['POST_DEPLOY'],
        ),
    ]

It essentially did the whole "add/remove DB default dance" for us. We would deploy this as follows:

As a bonus, syzygy can also handle the reverse of this (zero-downtime removal of a column):

$ ./manage.py makemigrations
Migrations for 'demoapp':
  demoapp/migrations/0004_set_db_default_logrecord_severity.py
    ~ Set database DEFAULT of field severity on logrecord
  demoapp/migrations/0005_remove_logrecord_severity.py
    - Remove field severity from logrecord

It sets a DB default in the pre-deploy migration, meaning from that point on writes from the new version of the app which lack the column will still succeed:

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0003_drop_db_default_logrecord_severity'),
    ]

    operations = [
        syzygy.operations.AlterField(
            model_name='logrecord',
            name='severity',
            field=models.IntegerField(db_default=0, default=0),
            stage=syzygy.constants.Stage['PRE_DEPLOY'],
        ),
    ]

And finally drops the column for good in the post-deploy migration (to be ran once all existing instances have terminated):

class Migration(migrations.Migration):

    dependencies = [
        ('demoapp', '0004_set_db_default_logrecord_severity'),
    ]

    operations = [
        migrations.RemoveField(
            model_name='logrecord',
            name='severity',
        ),
    ]

#tech