Database Schema Management with git, Flask, SQLAlchemy, and Alembic

Posted on 15 December 2013 by Joseph

I'm a big fan of SQLAlchemy. Particularly when using Flask-SQLAlchemy, I find it gives me a lot of flexibility in creating data models. One spot it's lacking, though, is schema changes and database migrations.

After doing a bit of research, I came upon Alembic, a tool for managing database migrations as a collection of ordered Python scripts. (As a sidenote, the other main option is SQLAlchemy-Migrate). Alembic looked like it did all the things I needed:

It had a straightforward and easy-to-modify way of managing configuration options
It handled migrations in a human readable and human editable way
It could take a best guess at the migration needed based on changes to the ORM models

Perfect! With a hint from this handy guide, pulling configuration options (specifically, my database URL) from my Flask configuration files was easy.

A couple minor changes

Stamp your databases, my boy

Since I was starting from an already-released database, I had to take a couple more steps. First, I modified my database creation script. This script is only run in development, and it is used to wipe out the existing database and create a clean one. As Alembic uses a table to track the current database revision, this fresh database needs to be "stamped" with the latest Alembic revision. Luckily, this is simple:

from models import *
from alembic.config import Config
from alembic import command


if __name__ == "__main__":
    db.drop_all()
    db.create_all()
    alembic_cfg = Config("./alembic.ini")
    # This is the magic. Stamp the database with the latest db revision.
    command.stamp(alembic_cfg, "head")

Similarly, before you start committing any revisions, you'll want to stamp your production database. Assuming you have your configuration pulling correctly, you can do this from the command line:

alembic stamp head

Accept DB root password from command line

For security reasons, in my production environment the database user my application uses doesn't have the ALTER privilege. I prefer to instead do the migrations by hand. Previously, I have used available command line tools that allow me to specify the root password by on the command line without having to put it in a configuration file. When I started using Alembic in my production environment, I further modified the alembic/env.py file to similarly take a password from the command line. Between loading the Alembic config and using it, insert the following:

import getpass

url = app.config.get('SQLALCHEMY_ALEMBIC_URI',
                     app.config['SQLALCHEMY_DATABASE_URI'])
  # Parse the URL, look for root user
  parsed = urlparse.urlsplit(url)
  if parsed.username == "root":
      # Prompt for password (silently!)
      password = getpass.getpass("Password: ")
      fragments = list(parsed)
      b = fragments[1].split(":@")
      b.insert(1, password)
      fragments[1] = "%s:%s@%s" % tuple(b)
      # Re-create the URL
      url = urlparse.urlunsplit(fragments)
  
  # Set the (possibly modified) db url
  config.set_main_option('sqlalchemy.url', url)

Finally, you'll need to add the SQLALCHEMY_ALEMBIC_URI option to your production Flask config (which isn't in version control, right?). It should be set to the same as your SQLALCHEMY_DATABASE_URI except with the username/password set to "root:". Like so:

    SQLALCHEMY_DATABASE_URI = 'mysql://username:password@localhost/yourappdb'
    SQLALCHEMY_ALEMBIC_URI = 'mysql://root:@localhost/yourappdb'

With those changes, using Alembic in production will result in a prompt for your root password.

The workflow

Schema changes and migrations should not be done lightly and should be well tested before they are run against a production database. Here is an overview of the workflow I use for development that involves changes to the database schema:

Create a new git branch for the feature.
Within the branch, modify the models as required. Feel free to use drop_all(), create_all(), etc. as freely as you see fit. For me personally, my development config, prior to testing, uses SQLite, and I use a lot of rm test.db.
When you feel the branch is ready to go, change back to master and create a clean database using the old schema. This clean database will be stamped with the latest revision.
Merge the branch. I typically do a squashed merge so it's easy to back out if something is wrong, but it's up to you.
Use the Alembic --autogenerate feature to get a tentative revision: alembic revision --autogenerate -m "Your commit message". This will create a new migration script but not run it yet.
Take a look at the migration script, and modify it as needed. It's just Python, and you have the full compliment of SQLAlchemy tools, so go buckwild. If there is a problem, feel free to wipe out that script, modify models as needed, and re-run.
When you're confident the migration looks right, run alembic upgrade head. This will apply the migration script. Take a look at the resulting schema on the actual database and make sure nothing looks wrong. If it does, go to step 5.
When everything looks good, commit the new migrations script, along with the results of the feature branch.
Finally, run the new migration script against your production database, again with alembic upgrade head. Because the database was previously stamped, any outstanding migration scripts will be run.

This workflow has taken a lot of the pain, and more importantly, a lot of the fear out of database migrations. Because the master branch always contains a known working state, it's easy to back out from any changes. Of course, if something DOES go horribly wrong, you have a little comfort knowing you can downgrade with Alembic. Thanks to heavy testing, it hasn't come to that yet, but I rest easier knowing that it's possible.

Make your party unforgettable

Database Schema Management with git, Flask, SQLAlchemy, and Alembic

A couple minor changes

Stamp your databases, my boy

Accept DB root password from command line

The workflow

Share this!