Home  >  Article  >  Backend Development  >  Best Practices for Alembic and SQLAlchemy

Best Practices for Alembic and SQLAlchemy

Mary-Kate Olsen
Mary-Kate OlsenOriginal
2024-11-01 10:20:30753browse

In this article, I’ll briefly go over some best practices that help keep projects organized, simplify database maintenance, and prevent common pitfalls when working with Alembic and SQLAlchemy. These techniques have saved me from trouble more than once. Here’s what we’ll cover:

  1. Naming Conventions
  2. Sorting Migrations by Date
  3. Table, Column, and Migration Comments
  4. Data Handling in Migrations without Models
  5. Migration Testing (Stairway Test)
  6. Service for Running Migrations
  7. Using Mixins for Models

1. Naming Conventions

SQLAlchemy allows you to set up a naming convention that’s automatically applied to all tables and constraints when generating migrations. This saves you from manually naming indexes, foreign keys, and other constraints, which makes the database structure predictable and consistent.

To set this up in a new project, add a convention to the base class so that Alembic will automatically use the desired naming format. Here’s an example of a convention that works well in most cases:

from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase

convention = {
    'all_column_names': lambda constraint, table: '_'.join(
        [column.name for column in constraint.columns.values()]
    ),
    'ix': 'ix__%(table_name)s__%(all_column_names)s',
    'uq': 'uq__%(table_name)s__%(all_column_names)s',
    'ck': 'ck__%(table_name)s__%(constraint_name)s',
    'fk': 'fk__%(table_name)s__%(all_column_names)s__%(referred_table_name)s',
    'pk': 'pk__%(table_name)s',
}

class BaseModel(DeclarativeBase):
    metadata = MetaData(naming_convention=convention)

2. Sorting Migrations by Date

Alembic migration filenames typically start with a revision tag, which can make the order of migrations in the directory appear random. Sometimes it’s useful to keep them sorted chronologically.

Alembic allows customizing the migration filename template in the alembic.ini file with the file_template setting. Here are two convenient naming formats for keeping migrations organized:

  1. Based on date:
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s
  1. Based on Unix timestamp:
file_template = %%(epoch)d_%%(rev)s_%%(slug)s

Using date or Unix timestamps in filenames keeps migrations organized, making navigation easier. I prefer using Unix timestamps, and an example will be provided in the next section.

3. Comments for Tables and Migrations

For those working in a team, commenting attributes is a good practice. With SQLAlchemy models, consider adding comments directly to columns and tables instead of relying on docstrings. This way, comments are available both in the code and the database, making it easier for DBAs or analysts to understand table and field purposes.

class Event(BaseModel):
    __table_args__ = {'comment': 'System (service) event'}

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        primary_key=True,
        comment='Event ID - PK',
    )
    service_id: Mapped[int] = mapped_column(
        sa.Integer,
        sa.ForeignKey(
            f'{IntegrationServiceModel.__tablename__}.id',
            ondelete='CASCADE',
        ),
        nullable=False,
        comment='FK to integration service that owns the event',
    )
    name: Mapped[str] = mapped_column(
        sa.String(256), nullable=False, comment='Event name'
    )

It’s also helpful to add comments to migrations to make them easier to find in the file system. A comment can be added with -m when generating the migration. The comment will appear in the docstring and the filename. This naming makes it much easier to locate the required migration.

from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase

convention = {
    'all_column_names': lambda constraint, table: '_'.join(
        [column.name for column in constraint.columns.values()]
    ),
    'ix': 'ix__%(table_name)s__%(all_column_names)s',
    'uq': 'uq__%(table_name)s__%(all_column_names)s',
    'ck': 'ck__%(table_name)s__%(constraint_name)s',
    'fk': 'fk__%(table_name)s__%(all_column_names)s__%(referred_table_name)s',
    'pk': 'pk__%(table_name)s',
}

class BaseModel(DeclarativeBase):
    metadata = MetaData(naming_convention=convention)

4. Avoid Using Models in Migrations

Models are often used for data manipulations, such as transferring data from one table to another or modifying column values. However, using ORM models in migrations can lead to issues if the model changes after the migration is created. In such cases, a migration based on the old model will break when executed, as the database schema may no longer match the current model.

Migrations should be static and independent of the current state of models to ensure correct execution regardless of code changes. Below are two ways to avoid using models for data manipulations.

  • Use raw SQL for data manipulation:
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s
  • Define Tables Directly in the Migration: If you want to use SQLAlchemy for data manipulations, you can manually define tables directly in the migration. This ensures a static schema at the time of migration execution and will not depend on changes in the models.
file_template = %%(epoch)d_%%(rev)s_%%(slug)s

5. Stairway Test for Migration Testing

The Stairway Test involves progressively testing upgrade/downgrade migrations step-by-step to ensure the entire migration chain works correctly. This ensures each migration can successfully create a new database from scratch and downgrade without issues. Adding this test to CI is invaluable for teams, saving time and frustration.

Best Practices for Alembic and SQLAlchemy

Integrating the test into your project can be done easily and quickly. You can find a code example in this repository. It also includes other valuable migration tests that may be helpful.

6. Migration Service

A separate service for performing migrations. This is just one way to carry out migrations. When developing locally or in environments similar to development, this method fits in well. I’d like to remind you about the conditional depends_on feature, which is relevant here. We take the application image with Alembic and run it in a separate container. We add a dependency on the database with the condition that migrations start only when the database is ready to handle requests (service_healthy). Additionally, a conditional depends_on (service_completed_successfully) can be added for the application, ensuring it starts only after migrations have completed successfully.

class Event(BaseModel):
    __table_args__ = {'comment': 'System (service) event'}

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        primary_key=True,
        comment='Event ID - PK',
    )
    service_id: Mapped[int] = mapped_column(
        sa.Integer,
        sa.ForeignKey(
            f'{IntegrationServiceModel.__tablename__}.id',
            ondelete='CASCADE',
        ),
        nullable=False,
        comment='FK to integration service that owns the event',
    )
    name: Mapped[str] = mapped_column(
        sa.String(256), nullable=False, comment='Event name'
    )

The depends_on condition ensures migrations run only after the database is fully ready and that the application starts after migrations are completed.

7. Mixins for Models

While this may be an obvious point, it’s important not to overlook it. Using mixins is a convenient way to avoid code duplication. Mixins are classes that contain frequently used fields and methods, which can be integrated into any models where they're needed. For instance, we often need created_at and updated_at fields to track the creation and update times of records. It can also be useful to use an id based on UUID to standardize primary keys. All of this can be encapsulated in mixins.

from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase

convention = {
    'all_column_names': lambda constraint, table: '_'.join(
        [column.name for column in constraint.columns.values()]
    ),
    'ix': 'ix__%(table_name)s__%(all_column_names)s',
    'uq': 'uq__%(table_name)s__%(all_column_names)s',
    'ck': 'ck__%(table_name)s__%(constraint_name)s',
    'fk': 'fk__%(table_name)s__%(all_column_names)s__%(referred_table_name)s',
    'pk': 'pk__%(table_name)s',
}

class BaseModel(DeclarativeBase):
    metadata = MetaData(naming_convention=convention)

By adding these mixins, we can include UUID id and timestamps in any model where needed:

file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s

Conclusion

Handling migrations can be challenging, but following these simple practices helps keep projects well-organized and manageable. Naming conventions, date sorting, comments, and testing have saved me from chaos and helped prevent mistakes. I hope this article proves helpful — feel free to share your own migration tips in the comments!

The above is the detailed content of Best Practices for Alembic and SQLAlchemy. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn