I'm a big fan of SQLAlchemy. Particularly when using Flask-SQLAlchemy, I find it gives me a lot of flexibility in creating data models. One spot it's lacking, though, is schema changes and database migrations.
After doing a bit of research, I came upon Alembic, a tool for managing database migrations as a collection of ordered Python scripts. (As a sidenote, the other main option is SQLAlchemy-Migrate). Alembic looked like it did all the things I needed:
- It had a straightforward and easy-to-modify way of managing configuration options
- It handled migrations in a human readable and human editable way
- It could take a best guess at the migration needed based on changes to the ORM models
Perfect! With a hint from this handy guide, pulling configuration options (specifically, my database URL) from my Flask configuration files was easy.
A couple minor changes
Stamp your databases, my boy
Since I was starting from an already-released database, I had to take a couple more steps. First, I modified my database creation script. This script is only run in development, and it is used to wipe out the existing database and create a clean one. As Alembic uses a table to track the current database revision, this fresh database needs to be "stamped" with the latest Alembic revision. Luckily, this is simple:
from models import *
from alembic.config import Config
from alembic import command
if __name__ == "__main__":
db.drop_all()
db.create_all()
alembic_cfg = Config("./alembic.ini")
# This is the magic. Stamp the database with the latest db revision.
command.stamp(alembic_cfg, "head")
Similarly, before you start committing any revisions, you'll want to stamp your production database. Assuming you have your configuration pulling correctly, you can do this from the command line:
alembic stamp head
Accept DB root password from command line
For security reasons, in my production environment the database user my application uses
doesn't have the ALTER privilege. I prefer to instead do the migrations
by hand. Previously, I have used available command line tools that allow
me to specify the root password by on the command line without having
to put it in a configuration file. When I started using Alembic in my
production environment, I further modified the alembic/env.py
file to
similarly take a password from the command line. Between loading the
Alembic config and using it, insert the following:
import getpass
url = app.config.get('SQLALCHEMY_ALEMBIC_URI',
app.config['SQLALCHEMY_DATABASE_URI'])
# Parse the URL, look for root user
parsed = urlparse.urlsplit(url)
if parsed.username == "root":
# Prompt for password (silently!)
password = getpass.getpass("Password: ")
fragments = list(parsed)
b = fragments[1].split(":@")
b.insert(1, password)
fragments[1] = "%s:%s@%s" % tuple(b)
# Re-create the URL
url = urlparse.urlunsplit(fragments)
# Set the (possibly modified) db url
config.set_main_option('sqlalchemy.url', url)
Finally, you'll need to add the SQLALCHEMY_ALEMBIC_URI
option to your
production Flask config (which isn't in version control, right?). It
should be set to the same as your
SQLALCHEMY_DATABASE_URI
except with the username/password set to
"root:". Like so:
SQLALCHEMY_DATABASE_URI = 'mysql://username:password@localhost/yourappdb'
SQLALCHEMY_ALEMBIC_URI = 'mysql://root:@localhost/yourappdb'
With those changes, using Alembic in production will result in a prompt for your root password.
The workflow
Schema changes and migrations should not be done lightly and should be well tested before they are run against a production database. Here is an overview of the workflow I use for development that involves changes to the database schema:
- Create a new git branch for the feature.
- Within the branch, modify the models as required. Feel free to use
drop_all()
,create_all()
, etc. as freely as you see fit. For me personally, my development config, prior to testing, uses SQLite, and I use a lot ofrm test.db
. - When you feel the branch is ready to go, change back to master and create a clean database using the old schema. This clean database will be stamped with the latest revision.
- Merge the branch. I typically do a squashed merge so it's easy to back out if something is wrong, but it's up to you.
- Use the Alembic
--autogenerate
feature to get a tentative revision:alembic revision --autogenerate -m "Your commit message"
. This will create a new migration script but not run it yet. - Take a look at the migration script, and modify it as needed. It's just Python, and you have the full compliment of SQLAlchemy tools, so go buckwild. If there is a problem, feel free to wipe out that script, modify models as needed, and re-run.
- When you're confident the migration looks right, run
alembic upgrade head
. This will apply the migration script. Take a look at the resulting schema on the actual database and make sure nothing looks wrong. If it does, go to step 5. - When everything looks good, commit the new migrations script, along with the results of the feature branch.
- Finally, run the new migration script against your production
database, again with
alembic upgrade head
. Because the database was previously stamped, any outstanding migration scripts will be run.
This workflow has taken a lot of the pain, and more importantly, a lot of the fear out of database migrations. Because the master branch always contains a known working state, it's easy to back out from any changes. Of course, if something DOES go horribly wrong, you have a little comfort knowing you can downgrade with Alembic. Thanks to heavy testing, it hasn't come to that yet, but I rest easier knowing that it's possible.