Using Rails console to copy records across databases

September 7, 2007

If you’re developing with Rails using PostgreSQL as your database back-end, you might find yourself in the same situation I was just in.

Rails, by convention, makes us think in terms of application ‘environments’: development, test and production (where I work we add “staging”). Each environment is given it’s own database (each environment can even have entirely different database servers).

During development, there’ll be times where you start populating a table with meaningful data as you develop and test. For example, you might have created a users table and added a couple of logins.

In some cases, data that’s meaningful in development can also be meaningful during staging or production. So, like me, one day you might find yourself wanting to copy some rows or an entire table from one database to another.

If I were using MySQL, I could easily do this via the SQL console. But PostgreSQL doesn’t support cross-database references (offering schemas, instead).

Fortunately, Rails comes with a nifty console which can do the job for us!

script/console ftw

You may not be aware of it (I wasn’t until about a month into Rails), but Rails has a console (which feels just like irb) which you can use to drive your app. When you launch the console, all your model classes and plugins are available to you just as they would be at run-time.

$ script/console
>>

By default, this launches the console running under the ‘development’ environment. Once we’ve launched the Rails console, we can issue a query to find all User objects, like so:

>> users = User.find :all
=> [#<User:0x358c82c @attributes={"name"=>"system", "id"=>"1"}>,
#<User:0x358c7f0 @attributes={"name"=>"aisrael", "id"=>"2"}>]

Nothing new here, it’s just the same (delicious) way we do queries in Rails.

Now, what may not be obvious is that you can tell ActiveRecord to reconnect to a different database at run-time!

If the database is configured in one of your environments in config/database.yml (say, ‘staging’) then it’s as easy as

>> ActiveRecord::Base.establish_connection(:staging)
=> #<ActiveRecord::Base::ConnectionSpecification:0x358a338
@adapter_method="postgresql_connection",
@config={:database=>"my_staging", :adapter=>"postgresql",
:username=>"postgres", :password=>"secret"}>

If you want to connect to a totally different database that’s not configured in database.yml, you should still be able to do so by manually passing in all the needed parameters to establish_connection.

Now that we’re connected to our staging database, we can insert the rows we got from development by simply doing:

>> users.each { |user| User.create user.attributes }

Unless any validation errors occur (which they shouldn’t – and if they do then that should send off alarms about your data model integrity or validation rules), or your database constraints are defined differently from one environment to another (which can also be a cause for concern) then it should all just work.

But wait, there’s more!

Sure, you say, if I was using MySQL I could do the same with one SQL command. Granted – and I wish PostgreSQL would allow that (without needing another plugin).

But the nice thing about this approach is, like I mentioned, Rails doesn’t just let you switch from one database on the same server. You can have your development server running on PostgreSQL on your local machine, and your production/staging server running on Oracle on a network machine and the above should still work.

Another thing you can do is filter or massage the data. Again, we can do most of that in SQL but isn’t that the whole reason we’re using Ruby (and Rails) in the first place – to have a richer language for writing our queries and manipulating our data that’s NOT SQL?

Of course, this technique is only good for those ‘one time’ data movements. If you need to populate a database regularly after creation, see my previous entry about bootstrapping your Rails database with fixtures.

(So, I just got lazy but wanted to blog about it and appear clever. Well, one of these days I might just write up something to automate the above and maybe throw in automatic discovery/migration of dependencies/associations. ‘Cept I can’t call it ActiveRecord Migrations. :p )

Advertisements

5 Responses to “Using Rails console to copy records across databases”

  1. Chetan Patil Says:

    Hi,

    Thanks for writing this up. I found that if I use create, the id of the original model data is not preserved and the foreign key relationships get messed up.

    Instead I use Model.new to create an object and then explicitly set the id field.

    I also write up a rake task for it.

    More here

    http://justbarebones.blogspot.com/2007/10/copy-model-data-between-databases.html

    – c

  2. Alistair Says:

    Hi, Chetan.

    Good catch! I didn’t consider that the ids weren’t going to be preserved by simply using Model.create. When I first tested this, I was only copying a small table into a newly created database and so I guess the auto-generated primary keys just happened to ‘line up’ properly with the original record ids.

    On the other hand, if you’re copying all associations along then it can be argued that the actual ids can be irrelevant as long as (all) the foreign key relationships are kept intact. Of course, circumstances can be very different in practice from in theory.

    Anyway, thanks for the comment and good job on taking this idea and going all the way (up to a rake task) with it!

  3. Alex Egg Says:

    “You can have your development server running on PostgreSQL on your local machine, and your production/staging server running on Oracle on a network machine and the above should still work.”

    The real challenge is getting this to work across the internet — say you want to push data from staging to production and they are both 2 different remote computers.

    Any ideas?

  4. Jean-Marc Says:

    Alex,

    there is nothing preventing you from configuring a database connection that is not local to the Rails application server.

    this way you can send the data accross any machine on any network (providing it can accept the connection).

    the other alternative would be to use ActiveResource… but that implies having 2 models loaded… one for the local database, the other for the remote connected (it also implies having a controller setup on the other side to accept the data).

  5. Dan Says:

    Thanks,

    I had a bunch of data in my mysql database and I wanted to switch to sqlite – this worked like a charm.

    -Dan


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: