Tracking contrib and core patches with schema changes

During performance and scalability reviews of sites, we regularly find ourselves submitting patches to contrib modules and core to resolve performance issues.

Most large Drupal installations we work with use a variation of this workflow to track patches:

  • Upload the patch to an issue on Drupal.org if it's not already there
  • Add the patch to a /patches folder in revision control
  • Document what the patch does, the Drupal.org nid, and a reference to the ticket in the client's issue tracker in /patches/README.txt
  • Apply the patch to the code base

(Drush make seems like a popular alternative to this, but most of the sites we work on predate that workflow so I've not actually tried it).

Applying patches is usually straightforward enough if only changing PHP code, however you can run into trouble with this method if you're changing the schema of a database table. In our case, 99% of the time when we need to make schema changes we're adding or modifying indexes.

Recently on a client site, I found they'd applied a core patch adding an index to the comment module (this one: http://drupal.org/node/336483). I discovered this when checking their comment module schema while reviewing another core patch, which adds a different index to the same table (http://drupal.org/node/243093).

With schema changes, the ordering of hook_update_N () matters greatly.

If you apply a patch that adds an update to a module, let's say comment_update_7004(), you have no guarantee that your patch will get committed before anyone else adds comment_module_7004() to that module. In this case with two patches competing for the same update number, that seems relatively likely.

Drupal will not run the same update twice, so this risks missing schema updates to the module entirely. This could put you in the very position of having to run those updates manually yourself, or work on your own upgrade path to get your site back into sync, not fun.

To avoid this, recently we've been using this workflow:

  • Upload the patch to an issue Drupal.org
  • Add a hook_schema_alter() in a custom module, with a @see to the Drupal.org issue nid
  • Add a custom update handler: mycustommodule_update_N() to add the index

This will avoid running into conflicts with update numbering, so that when other schema changes are added to the module, you'll still have those updates run.

Health warning
While this approach will ensure that you don't skip any core or contrib updates, it will not prevent updates being run twice - once from your custom module, once from the module itself.

If you're only adding indexes this is usually OK, but if in doubt, http://drupal.org/project/schema can show you any discrepancies. Drupal 7 has the handy db_index_exists() function, this should help when resolving index mismatches.

It should go without saying that you should test this all on dev and staging servers before making any changes to your live site. Additionally, indexes can usually be added or removed with less impact than other schema changes - other kinds of changes can get you into a lot more trouble.

Comments