Deploying Drupal With Git and Capistrano

Submitted on Mar 23, 2013, 4:10 p.m.

I've been hacking on Drupal recently and so far, I like it a lot. I've also been reading a little about Drupal's deployment story, and decided that it might be fun to use Capistrano to deploy the projects I'm working on.

Turns out I'm not the only one who's thought of it. Kim Pepper's approach is excellent, and he's included his Capistrano tasks on GitHub.

Here's my take, which differs a little from Kim's insofar as I'm not that concerned about the 'Rails-isms' that come with the default Capistrano gem (for the moment at least).

Obviously you'll need a machine with Ruby installed, as well as the Capistrano gem. The Capistrano getting started docs and handbook are also very helpful.

$ gem install capistrano

First we'll take a look at how I've organized the directories of the project, following a Rails-ish layout.

The top level of the project on my development machine, including the git repository and Capistrano files, looks like this:

.
├── .git
├── .gitattributes
├── .gitignore
├── Capfile
├── config
└── public

The Capfile is created by running capify . in the project directory, which also creates the config subdirectory and the deploy.rb file inside it.

Here are the contents of the Capfile which, as you can see, simply loads Capistrano's default deploy recipes and then the deploy file from the config directory.

# Capfile
load 'deploy'
# Uncomment if you are using Rails' asset pipeline
# load 'deploy/assets'
load 'config/deploy' # remove this line to skip loading any of the default tasks

Equally important is the .gitignore file, which ignores the local settings and the files directories for Drupal.

# .gitignore
# Ignore configuration files that may contain sensitive information.
public/sites/*/*settings*.php

# Ignore paths that contain generated content.
cache/
public/sites/default/files

I'm including Drupal core in the git repository, as well as of course all of the project modules. I'm pretty new to Drupal, so not 100% sure if this is the best way to do this, but in addition to the actual project development, I'm performing core, theme, and module updates locally first, testing them on my dev and test machines, and then using the updated git repo to deploy the updates to the production site via Capistrano. This means I'll never perform a direct module or theme update on the live site itself.

On the test and production servers, I'm symlinking to the files (assets) and cache directories, as well as settings.php (which are all excluded from the git repository).

Here's what the directory structure looks like on the production server - starting at /home/foo/public_html/foo.com:

.
├── backup
├── cache
├── current -> /home/foo/public_html/foo.com/releases/20130323004159
├── files
├── log
├── releases
├── settings.php
├── shared
└── tmp

And here's a description of what's in each directory. (Note: all of these directories are above, and outside of, the Apache site directory, which is 'public', inside of current. More on that soon.)

backup - scheduled mysqldump sql files, and the sql dump file that's created during a deploy.
cache - my Boost site cache directory.
current - the currently deployed version of the site, which is a symlink to a directory that Capistrano creates for each deployment release.
files - all of the site's public and private assets, like uploaded images etc.
log - Apache access and error logs.
releases - the Capistrano managed releases directory.
settings.php - the Drupal settings.php file, which is symlinked into current/public/sites/default/settings.php
shared - shared resources including the Capistrano created cached-copy of the git repository
tmp - a tmp directory.
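
The release directory that current points to is named with a timestamp (20130323004159 above), so release names sort chronologically as plain strings. Here is a minimal sketch in plain Ruby, with no Capistrano required, of how a name like that can be generated and how a keep-the-newest-five policy (the keep_releases setting that appears later in deploy.rb) would choose which directories to prune. The helper names release_name and releases_to_prune are hypothetical, and this only approximates what cap deploy:cleanup does:

```ruby
# Hypothetical helpers; they illustrate the naming and pruning idea only.

# Build a release directory name such as "20130323004159" from a Time.
# getutc returns a UTC copy, so the caller's Time object is not modified.
def release_name(time = Time.now)
  time.getutc.strftime("%Y%m%d%H%M%S")
end

# Given release names, return the ones that fall outside the newest `keep`.
# Timestamp names sort chronologically when sorted as strings.
def releases_to_prune(names, keep = 5)
  sorted = names.sort
  return [] if sorted.length <= keep
  sorted[0...(sorted.length - keep)]
end

releases = %w[20130301000000 20130305000000 20130310000000
              20130315000000 20130320000000 20130323004159]
puts release_name(Time.utc(2013, 3, 23, 0, 41, 59))
puts releases_to_prune(releases).inspect
```

With six releases and a limit of five, only the oldest name is returned for pruning.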

Here's the structure inside the current directory:

.
├── Capfile
├── config
├── .git
├── .gitattributes
├── .gitignore
├── log -> /home/foo/public_html/foo.com/shared/log
├── public
├── REVISION
└── tmp

... which is nearly identical to our local development directory, including the git repository and Capfile. The log symlink, the REVISION file, and the tmp directory are created by Capistrano.

The public directory is our live site. It's the root of a standard Drupal installation and should look familiar to any Drupalist:

.
├── authorize.php
├── cache -> /home/foo/public_html/foo.com/cache
├── cron.php
├── favicon.ico
├── .htaccess
├── includes
├── index.php
├── install.php
├── misc
├── modules
├── profiles
├── robots.txt
├── scripts
├── sites
├── system -> /home/foo/public_html/foo.com/shared/system
├── themes
├── update.php
└── xmlrpc.php

Note that the cache directory here is what Boost thinks is its cache directory, but it is actually symlinked to the cache directory above. The system directory is a 'Rails-ism' and can be ignored.

And here's the layout of the sites directory:

sites
├── all
│   ├── libraries
│   ├── modules
│   └── themes
└── default
    ├── files -> /home/foo/public_html/foo.com/files
    ├── settings.php -> /home/foo/public_html/foo.com/settings.php
    └── tmp -> /home/foo/public_html/foo.com/tmp

As you can see, files, settings.php and tmp are all symlinked back to the top of our directory tree.
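
To make the linking idea concrete, here is a small sketch in plain Ruby that recreates those symlinks inside a throwaway directory. It is not the Capistrano task itself: the SHARED_LINKS table and the link_shared helper are hypothetical names, and the code assumes a filesystem that supports symlinks. Removing an existing link before recreating it is what makes the step safe to repeat on every release:

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical table: path inside a release's public directory => path
# relative to the deploy root that it should point at.
SHARED_LINKS = {
  "sites/default/settings.php" => "settings.php",
  "sites/default/files"        => "files",
  "sites/default/tmp"          => "tmp",
  "cache"                      => "cache"
}.freeze

# Create (or refresh) each symlink under public_dir.
def link_shared(deploy_root, public_dir, links = SHARED_LINKS)
  links.each do |link, target|
    link_path = File.join(public_dir, link)
    FileUtils.mkdir_p(File.dirname(link_path))
    # Drop a stale link first so the call can be repeated safely.
    # A real file or directory at link_path is left alone and makes
    # File.symlink raise Errno::EEXIST.
    File.unlink(link_path) if File.symlink?(link_path)
    File.symlink(File.join(deploy_root, target), link_path)
  end
end

# Try it out in a temporary directory that stands in for the deploy root.
Dir.mktmpdir do |root|
  %w[files tmp cache].each { |dir| FileUtils.mkdir_p(File.join(root, dir)) }
  FileUtils.touch(File.join(root, "settings.php"))
  public_dir = File.join(root, "releases", "20130323004159", "public")
  link_shared(root, public_dir)
  link_shared(root, public_dir) # repeating the call is harmless
  puts File.readlink(File.join(public_dir, "sites/default/files"))
end
```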

Okay - and now the good news. Our Capistrano deployment configuration is going to create all of this for us with just a few keystrokes from our local development or deployment machine.

Here's the complete deploy.rb file located inside the config directory:

NOTE: Updated to include the use of an environment variable or prompt for the target application, e.g:

cap deploy:drupal TARGET=live

#SSH and PTY Options
default_run_options[:pty] = true
ssh_options[:forward_agent] = true
set :use_sudo, false
set :port, 5144

#On Mac OS X
ssh_options[:compression] = "none"

#Application settings
set :user, "foo"

target_env = ENV['TARGET']

if target_env.nil?
  set(:target, Capistrano::CLI.ui.ask("Target name: "))
else
  set(:target, target_env)
end

#Set the target
if target.nil? || target.length == 0
  set :application, "test.foo.com"
elsif target == "live"
  set :application, "foo.com"
else
  set :application, "#{target}.foo.com"
end

set :keep_releases, 5
set :drush_cmd, "drush"

#We need to use the --uri option for drush and cache clearing (see task :clear_all_caches)
#because the cache files and directories are created by the web worker process owner (usually
#www-data for Apache). Using the uri to invoke the cache clear will invoke the cache clear
#under the web worker process, and not the user running this deployment.
set :drush_uri, "http://#{application}"

set :deploy_to, "/home/foo/public_html/#{application}"
set :app_path, "#{deploy_to}/current/public"
#Shared resources like the Boost cache, files, tmp, settings.php are located here
#and do not change across deployments. We symlink releases to these resources.
set :share_path, "#{deploy_to}"

set :scm, :git
set :repository,  "ssh://git@github.com:/foo/foo.git"
set :branch, "master"
set :deploy_via, :remote_cache

#Servers
role :web, "cloud01.foo.com"                          # Your HTTP server, Apache/etc
role :app, "cloud01.foo.com"                          # This may be the same as your `Web` server
role :db,  "cloud01.foo.com", :primary => true        # This is where Rails migrations will run

after 'deploy:rollback', 'deploy:drupal:link_filesystem', 'deploy:drupal:clear_all_caches'

namespace :deploy do
  task :start do ; end
  task :stop do ; end
  task :restart, :roles => :app, :except => { :no_release => true } do
    #Place holder for app restart - in Ruby apps this would touch restart.txt.
  end
  
  #Drupal application and project specific tasks.
  namespace :drupal do

    desc "Perform a Drupal application deploy."
    task :default, :roles => :app, :except => { :no_release => true } do
      site_offline
      clear_all_caches
      backupdb
      deploy.default
      link_filesystem
      updatedb
      site_online
    end

    desc "Place site in maintenance mode."
    task :site_offline, :roles => :app, :except => { :no_release => true } do
      run "#{drush_cmd} -r #{app_path} vset maintenance_mode 1 -y"
    end

    desc "Bring site back online."
    task :site_online, :roles => :app, :except => { :no_release => true } do
      run "#{drush_cmd} -r #{app_path} vset maintenance_mode 0 -y"
    end

    desc "Run Drupal database migrations if required."
    task :updatedb, :on_error => :continue do
      run "#{drush_cmd} -r #{app_path} updatedb -y"
    end

    desc "Backup the database."
    task :backupdb, :on_error => :continue do
      run "#{drush_cmd} -r #{app_path} sql-dump --result-file=#{deploy_to}/backup/release-drupal-db.sql" 
      #run "#{drush_cmd} -r #{app_path} bam-backup"
    end

    #This task should not be run on its own, so its description is commented out.
    #desc "Recreate the required Drupal symlinks to static directories and reset file permissions."
    task :link_filesystem, :roles => :app, :except => { :no_release => true } do
      commands = []
      commands << "mkdir -p #{app_path}/sites/default"
      commands << "ln -nfs #{share_path}/settings.php #{app_path}/sites/default/settings.php"
      commands << "ln -nfs #{share_path}/files #{app_path}/sites/default/files"
      commands << "ln -nfs #{share_path}/tmp #{app_path}/sites/default/tmp"
      commands << "ln -nfs #{share_path}/cache #{app_path}/cache"
      commands << "find #{app_path} -type d -print0 | xargs -0 chmod 755"
      commands << "find #{app_path} -type f -print0 | xargs -0 chmod 644"
      run commands.join(' && ') if commands.any?
    end

    desc "Clear all caches"
    task :clear_all_caches, :roles => :app, :except => { :no_release => true } do
      run "#{drush_cmd} -r #{app_path} --uri=#{drush_uri} cc all"
    end
  end  
end

The Drupal-specific tasks are in a separate namespace, :drupal, nested inside :deploy.
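
The TARGET handling near the top of deploy.rb is easier to reason about when pulled out into a plain function. The sketch below restates that if/elsif chain in standalone Ruby so it can be tried without Capistrano. The function name application_for and the domain parameter are assumptions made for illustration, and stripping whitespace is a small addition the original does not have:

```ruby
# Hypothetical helper restating the target-to-hostname rules from deploy.rb.
#   nil or empty target -> test.<domain>
#   "live"              -> <domain>
#   anything else       -> <target>.<domain>
def application_for(target, domain = "foo.com")
  target = target.to_s.strip # addition: treat whitespace-only input as empty
  if target.empty?
    "test.#{domain}"
  elsif target == "live"
    domain
  else
    "#{target}.#{domain}"
  end
end

# Resolve the hostname from the same environment variable deploy.rb reads.
puts application_for(ENV["TARGET"])
```

So TARGET=live deploys to foo.com, TARGET=staging would deploy to staging.foo.com, and an empty answer at the prompt falls back to test.foo.com.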

cap -T shows us the complete list of Capistrano tasks:

cap deploy                         # Deploys your project.
cap deploy:check                   # Test deployment dependencies.
cap deploy:cleanup                 # Clean up old releases.
cap deploy:cold                    # Deploys and starts a `cold' application.
cap deploy:create_symlink          # Updates the symlink to the most recently deployed version.
cap deploy:drupal                  # Perform a Drupal application deploy.
cap deploy:drupal:backupdb         # Backup the database.
cap deploy:drupal:clear_all_caches # Clear all caches
cap deploy:drupal:site_offline     # Place site in maintenance mode.
cap deploy:drupal:site_online      # Bring site back online.
cap deploy:drupal:updatedb         # Run Drupal database migrations if required.
cap deploy:migrate                 # Run the migrate rake task.
cap deploy:migrations              # Deploy and run pending migrations.
cap deploy:pending                 # Displays the commits since your last deploy.
cap deploy:pending:diff            # Displays the `diff' since your last deploy.
cap deploy:rollback                # Rolls back to a previous version and restarts.
cap deploy:rollback:code           # Rolls back to the previously deployed version.
cap deploy:setup                   # Prepares one or more servers for deployment.
cap deploy:symlink                 # Deprecated API.
cap deploy:update                  # Copies your project and updates the symlink.
cap deploy:update_code             # Copies your project to the remote servers.
cap deploy:upload                  # Copy files to the currently deployed version.
cap invoke                         # Invoke a single command on the remote servers.
cap shell                          # Begin an interactive Capistrano session.

Note that the deploy:drupal:link_filesystem task is not shown because its description has been commented out; there should be no reason to run this task on its own.

Kim Pepper's approach is to call the Drupal-specific tasks from the after events in Capistrano's regular deployment process.

after "deploy:update_code", "drupal:symlink_shared", "drush:site_offline", "drush:updatedb", "drush:cache_clear", "drush:site_online"

This is probably wise since calling cap deploy will force the Drupal tasks to be run as well. For now at least I'm calling the default Drupal task via:

cap deploy:drupal
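
For reference, the order of work behind that one command can be written out as data. This is only a sketch in plain Ruby: drupal_deploy_steps is a hypothetical helper, the strings are the drush commands taken from the tasks in deploy.rb, and the two symbols stand in for the Capistrano steps that are not a single remote command:

```ruby
# Hypothetical helper: list, in order, what the deploy:drupal task does.
# Strings are the remote drush commands; symbols are placeholders for
# Capistrano tasks (the standard deploy and the symlink/chmod task).
def drupal_deploy_steps(app_path:, drush_uri:, backup_file:, drush_cmd: "drush")
  drush = "#{drush_cmd} -r #{app_path}"
  [
    "#{drush} vset maintenance_mode 1 -y",            # site_offline
    "#{drush} --uri=#{drush_uri} cc all",             # clear_all_caches
    "#{drush} sql-dump --result-file=#{backup_file}", # backupdb
    :deploy_default,                                  # deploy.default
    :link_filesystem,                                 # link_filesystem
    "#{drush} updatedb -y",                           # updatedb
    "#{drush} vset maintenance_mode 0 -y"             # site_online
  ]
end

steps = drupal_deploy_steps(
  app_path:    "/home/foo/public_html/foo.com/current/public",
  drush_uri:   "http://foo.com",
  backup_file: "/home/foo/public_html/foo.com/backup/release-drupal-db.sql"
)
steps.each_with_index { |step, i| puts "#{i + 1}. #{step}" }
```

Note that the first three commands run against the old release (current has not moved yet), while updatedb and the final vset run against the new one.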

Of course before any of this will work you need to create the default /home/foo/public_html/foo.com directory and deployment account on your target application server, including the backup, files and tmp directories (as well as the cache directory if you're using Boost).
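
If you'd rather script that one-time preparation than type mkdir by hand, something like the following could be run on the server as the deployment account. It is a sketch under assumptions: site_directories and create_site_directories are hypothetical helper names, and the boost flag simply controls whether the cache directory is included:

```ruby
require "fileutils"
require "tmpdir"

# Directories this layout expects next to the Capistrano-managed ones.
# "cache" is only needed when the Boost module is in use.
def site_directories(deploy_to, boost: true)
  names = %w[backup files tmp]
  names << "cache" if boost
  names.map { |name| File.join(deploy_to, name) }
end

# Create the directories; mkdir_p is a no-op for ones that already exist.
def create_site_directories(deploy_to, boost: true)
  site_directories(deploy_to, boost: boost).each { |dir| FileUtils.mkdir_p(dir) }
end

# Demonstration against a temporary directory. On a real server the
# argument would be the deploy root, e.g. /home/foo/public_html/foo.com.
Dir.mktmpdir do |root|
  create_site_directories(root)
  puts Dir.children(root).sort.inspect
end
```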

Calling cap deploy:setup will create the Capistrano-managed directories, and cap deploy:check will then verify the deployment dependencies. For all of this to run smoothly, you'll likely need to copy the public SSH key for the deployment account into the ~/.ssh/authorized_keys file on the target application server, and add it to the remote git repository account as well.

For changes that don't require a full deployment and new release directory, you can use the cap deploy:upload task to upload individual files to the live server, although needless to say this should be done with caution e.g:

cap deploy:upload FILES=public/sites/all/themes/foo_theme/style.css

And there it is. It may look like a lot, but the payoff is large. You get nice, safe automated deployments and rollbacks with just a few keystrokes.