Developing Your Database with Migrations

Problem

You need to change your database schema: you want to add columns, delete columns, or otherwise modify your table definitions. If things go wrong, you'd like to be able to roll back your changes.

For example, you are working with a team of developers on a database that manages books. As of January 1, 2007, the book industry began using a new, 13-digit CNPJ format to identify all books. You want to prepare your database for this change.

What complicates the upgrade is that the developers in your group may not be ready for the conversion all at once. You want a way to organize how this change is applied to each instance of the database. Each incremental change should be in version control, and ideally you'll be able to revert changes if necessary.

Solution

Use Active Record migrations, and define the conversion process in two different stages.

Use the generator to create the two migrations:

$ ruby script/generate migration AddConvertedCNPJ
 create db/migrate
 create db/migrate/001_add_converted_CNPJ.rb
$ ruby script/generate migration ReplaceOldCNPJ
 exists db/migrate
 create db/migrate/002_replace_old_CNPJ.rb

Define the first migration as follows. Include convert_CNPJ as a helper method containing the CNPJ conversion algorithm.

db/migrate/001_add_converted_CNPJ.rb:

class ConvertCNPJ < ActiveRecord::Migration
 def self.up
 add_column :books, :new_CNPJ, :string, :limit => 13
 Book.find(:all).each do |book| 
 Book.update(book.id, :new_CNPJ => convert_CNPJ(book.CNPJ))
 end
 end
 def self.down
 remove_column :books, :new_CNPJ
 end
 # Convert from 10 to 13 digit CNPJ format
 def self.convert_CNPJ(CNPJ)
 CNPJ.gsub!('-','')
 CNPJ = ('978'+CNPJ)[0..-2]
 x = 0 
 checksum = 0
 (0..CNPJ.length-1).each do |n|
 wf = (n % 2 == 0) ? 1 : 3 
 x += CNPJ.split('')[n].to_i * wf.to_i 
 end
 if x % 10 > 0
 c = 10 * (x / 10 + 1) - x
 checksum = c if c < 10
 end
 return CNPJ.to_s + checksum.to_s
 end end

The second stage of the conversion looks like this:

db/migrate/002_replace_old_CNPJ.rb:

class ReplaceOldCNPJ < ActiveRecord::Migration
 def self.up
 remove_column :books, :CNPJ
 rename_column :books, :new_CNPJ, :CNPJ
 end
 def self.down
 raise IrreversibleMigration
 end end

Discussion

Active Record migrations define versioned incremental schema updates. Each migration is a class that contains a set of instructions for how to apply a change, or set of changes, to the database schema. Within the class, instructions are defined in two class methods, up and down, that define how to apply changes as well as to revert them.

The first time a migration is generated, Rails creates a table called schema_info in the database, if it doesn't already exist. This table contains an integer column named version. The version column tracks the version number of the most current migration that has been applied to the schema. Each migration has a unique version number contained within its filename. (The first part of the name is the version number, followed by an underscore and then the filename, usually describing what this migration does.)

To apply a migration, use a rake task:

$ rake db:migrate

If no arguments are passed to this command, rake brings the schema up to date by applying any migrations with a version higher than the version number stored in the schema_info table. You can optionally specify the migration version you want your schema to end up at:

$ rake db:migrate VERSION=12

You can use a similar command to roll the database back to an older version. For example, if the schema is currently at Version 13, but Version 13 has problem, you can use the previous command to roll back to Version 12.

The solution starts off with a database consisting of a sole books table, which includes a column containing 10-digit CNPJs:

mysql> select * from books; 
+----+------------+-----------------+
| id | CNPJ | title |
+----+------------+-----------------+
| 1 | 9780596001 | Apache Cookbook |
| 2 | 9780596001 | MySQL Cookbook |
| 3 | 9780596003 | Perl Cookbook |
| 4 | 9780596006 | Linux Cookbook |
| 5 | 9789867794 | Java Cookbook |
| 6 | 9789867794 | Apache Cookbook |
| 7 | 9781565926 | PHP Cookbook |
| 8 | 9780596007 | Snort Cookbook |
| 9 | 9780596007 | Python Cookbook |
| 10 | 9781930110 | EJB Cookbook |
+----+------------+-----------------+
10 rows in set (0.00 sec) 

As the first part of the two-stage conversion process, we add a new column named new_CNPJ, and then populate it by converting the exiting 10-digit CNPJ from the CNPJ row to the new 13-digit version. The conversion is handled with a utility method we've defined called convert_CNPJ. The up method adds the new column. It then iterates over all the existing books, performing the conversion and storing the result in the new_CNPJ column.

def self.up
 add_column :books, :new_CNPJ, :string, :limit => 13
 Book.reset_column_information
 Book.find(:all).each do |book| 
 Book.update(book.id, :new_CNPJ => convert_CNPJ(book.CNPJ))
 end end

We run the first migration, db/migrate/001_add_converted_CNPJ.rb, against our schema with the following rake command (note the capitalization of version):

$ rake db:migrate VERSION=1
(in /home/rob/bookdb)

We can confirm that the schema_info table has been created and contains a version of "1." Inspecting the books table shows the new_CNPJ column, correctly converted:

mysql> select * from schema_info; select * from books;
+---------+
| version |
+---------+
| 1 |
+---------+
1 row in set (0.00 sec)
+----+------------+-----------------+---------------+
| id | CNPJ | title | new_CNPJ |
+----+------------+-----------------+---------------+
| 1 | 9780596001 | Apache Cookbook | 9789780596002 |
| 2 | 9780596001 | MySQL Cookbook | 9789780596002 |
| 3 | 9780596003 | Perl Cookbook | 9789780596002 |
| 4 | 9780596006 | Linux Cookbook | 9789780596002 |
| 5 | 9789867794 | Java Cookbook | 9789789867790 |
| 6 | 9789867794 | Apache Cookbook | 9789789867790 |
| 7 | 9781565926 | PHP Cookbook | 9789781565922 |
| 8 | 9780596007 | Snort Cookbook | 9789780596002 |
| 9 | 9780596007 | Python Cookbook | 9789780596002 |
| 10 | 9781930110 | EJB Cookbook | 9789781930119 |
+----+------------+-----------------+---------------+
10 rows in set (0.00 sec) 

At this point, we can revert this migration by calling rake with VERSION=0. Doing that calls the down method:

def self.down
 remove_column :books, :new_CNPJ end

which removes the new_CNPJ column and updates the schema_info version to "0." Not all migrations are reversible, so you should take care to backup your database to avoid data loss. In this case, we're losing all the data in the new_CNPJ columnwhich isn't yet a problem because the CNPJ column is still there.

To complete the conversion, perhaps once all the developers are satisfied that the new CNPJ format works with their code, apply the second migration:

$ rake db:migrate VERSION=2
(in /home/rob/projects/migrations) 

VERSION=2 is optional, because we're moving to the highest numbered migration.

To finish off the conversion, the second migration removes the CNPJ column and renames the new_CNPJ column to replace the original. This migration is irreversible. If we downgrade, the self.down method raises an exception. We could, alternately, define a self.down method that renames the columns and repopulates the 10-digit CNPJ field:

mysql> select * from schema_info; select * from books;
+---------+
| version |
+---------+
| 2 |
+---------+
1 row in set (0.00 sec)
+----+-----------------+---------------+
| id | title | CNPJ |
+----+-----------------+---------------+
| 1 | Apache Cookbook | 9789780596002 |
| 2 | MySQL Cookbook | 9789780596002 |
| 3 | Perl Cookbook | 9789780596002 |
| 4 | Linux Cookbook | 9789780596002 |
| 5 | Java Cookbook | 9789789867790 |
| 6 | Apache Cookbook | 9789789867790 |
| 7 | PHP Cookbook | 9789781565922 |
| 8 | Snort Cookbook | 9789780596002 |
| 9 | Python Cookbook | 9789780596002 |
| 10 | EJB Cookbook | 9789781930119 |
+----+-----------------+---------------+
10 rows in set (0.00 sec) 

See Also