
Making and Applying Patches - A short introduction to GNU diff and patch

Ciaran O'Riordan, Stephen Compall, Brian Gough (Network Theory Ltd)

Introduction

The purpose of this article is to provide a tutorial introduction to making and using patches with the GNU commands diff and patch.

For more information see the printed manual "Comparing and Merging Files with GNU diff and patch" (ISBN 0-9541617-5-0).

A patch is a set of differences between two files or two sets of files, usually an original and a modified version of the original. The purpose of a patch is to allow others to modify the original file in the same way that you have modified it. Thus, anyone can apply the patch to their copy of the original, and will then have a file or set of files that is identical to your modified version.

Patches are the standard way for free software developers to share bug fixes and enhancements with other developers.

Applying Patches

Patches are applied with the patch command. To specify the name of the patch to apply, patch can be invoked with the switch --input, or -i, followed by the file name of the patch. If no patch name is supplied, patch will read data from standard input (which can be the output of another command sent through a pipe |). Patches are usually stored with the extension .diff and referred to as diff files, because they are created with the diff command.

When new versions of GNU packages are released, it is customary for the maintainer to provide both a tar file of the new version and a patch file that can be used to update the source-code of the previous version. So, if you have the source code for GNU Bash 2.04, and want to upgrade to Bash 2.05, instead of downloading the full tar file bash-2.05.tar.gz, you can get the smaller patch bash-2.04-2.05.diff.gz from ftp.gnu.org. For example, the following commands show how to apply a patch to upgrade from Bash 2.04 to Bash 2.05, reading the patch directly from a compressed file through a pipe:

$ ls
bash-2.04-2.05.diff.gz         bash-2.04/
$ cd bash-2.04/
$ gunzip -c ../bash-2.04-2.05.diff.gz | patch -p1
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h

The same effect can be achieved with the -i switch to patch, after uncompressing the patch file:

$ gunzip bash-2.04-2.05.diff.gz
$ cd bash-2.04/
$ patch -p1 -i ../bash-2.04-2.05.diff
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h

251 lines of "patching file XXX" messages are displayed. When the command completes, your copy of bash will be identical to version 2.05. At the end of the next section, I'll show a command that you can use to verify this. After applying the patch, you'll probably want to rename the directory from bash-2.04 to bash-2.05 to reflect the changes.

The -p1 switch given to patch indicates that one leading directory should be stripped from the start of each filename in the patch. The filenames in the patch look like this:

$ more bash-2.04-2.05.diff
...
*** bash-2.04/CHANGES   Tue Mar 14 11:40:08 2000
--- bash-2.05/CHANGES   Tue Apr  3 10:33:50 2001
...

Since we started inside the directory bash-2.04 we need to strip off one directory from these paths to give the correct relative location of the file CHANGES in the current directory.

The need for this switch depends on how the patch was generated. Most patches need -p1, but some don't. If you get warnings or errors when applying a patch, try a different value such as -p0 (which uses the filenames exactly as they appear in the patch), try omitting the switch (patch then uses only the final component of each filename), or check the README file in the source distribution to see if there is a mention of -p. If you're unsure of the correct usage, be sure to keep a backup of any important files you are working on.
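The effect of -p1 can be tried out safely on a throwaway tree. The following is a minimal sketch, not part of the Bash example: the directories proj-1.0 and proj-1.1 and the file greeting.txt are hypothetical names made up for illustration, and GNU diff and patch are assumed.

```shell
# Hypothetical toy trees standing in for bash-2.04 and bash-2.05.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
mkdir proj-1.0 proj-1.1
printf 'hello\nworld\n' > proj-1.0/greeting.txt
printf 'hello\nthere\n' > proj-1.1/greeting.txt

# The patch headers name proj-1.0/greeting.txt and proj-1.1/greeting.txt.
# diff exits with status 1 when it finds differences, which is expected here.
diff -cr proj-1.0 proj-1.1 > proj-1.0-1.1.diff || [ $? -eq 1 ]

# Inside proj-1.0 the file is simply greeting.txt, so -p1 strips the
# leading "proj-1.0/" (or "proj-1.1/") from the names in the patch.
cd proj-1.0 || exit 1
patch -p1 -i ../proj-1.0-1.1.diff
cat greeting.txt
```

After the patch command, greeting.txt in proj-1.0 should have the same contents as the copy in proj-1.1.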

If a patch is applied with an inappropriate -p option, parts of it might be rejected, leaving you with a broken source tree. You can use the --dry-run option to make sure that the changes apply without errors before modifying any of the files. Here, we try a dry run on the file bash-2.04-2.05.diff:

$ patch --dry-run -p1 -i ../bash-2.04-2.05.diff
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h

With the --dry-run option none of the files are actually modified. The messages show that the patch can be applied without problems. You can run the patch command without the --dry-run option, and rest assured that all will go smoothly.
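A dry run can also be checked from a script, because patch exits with status 0 when every change applies and with a non-zero status otherwise. Here is a small sketch, assuming GNU patch; the files notes.txt, notes-new.txt and notes.diff are hypothetical names used only for this illustration.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
printf 'one\ntwo\nthree\n' > notes.txt
printf 'one\n2\nthree\n' > notes-new.txt
# diff exits with status 1 when the files differ, which is expected here.
diff -c notes.txt notes-new.txt > notes.diff || [ $? -eq 1 ]

before=$(cat notes.txt)
if patch --dry-run -i notes.diff; then
    dry_run_status=ok
else
    dry_run_status=failed
fi

# The dry run reports what would happen but leaves notes.txt untouched.
if [ "$before" = "$(cat notes.txt)" ]; then
    echo "notes.txt unchanged by the dry run"
fi

# Only apply the patch for real if the dry run succeeded.
if [ "$dry_run_status" = ok ]; then
    patch -i notes.diff
fi
```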

Making Patches

The next step is to make your own patches. This is incidental knowledge that developers tend to pick up as they gain experience, so it is often assumed that everyone already knows how to do it, and it is rarely explained.

Patches are made with the diff command. If you make a copy of an existing file and edit it slightly, you can see how diff works. For example, I've made a pointless one-line change in mailcheck.c in the top-level of bash-2.05, and saved my modified version as mymailcheck.c:

$ cd ~/src/bash-2.05
$ cp mailcheck.c mymailcheck.c
$ emacs mymailcheck.c
  (to edit the new file)
$ diff mailcheck.c mymailcheck.c > mychange.diff
$ cat mychange.diff
402c402
<     message = "You have new mail in $_";
---
>     message = "New mail found in $_";

The output beginning with 402c402 means: go to line 402, take out the original line shown with <, and replace it with the new version shown with >. I can then send the file mychange.diff to other people and they can decide whether to make the same changes or not.
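The same notation covers added lines as well as changed ones. As a small sketch (old.txt and new.txt are hypothetical files invented for this illustration), the following produces one change command and one add command:

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
printf 'one\ntwo\nthree\n' > old.txt
printf 'one\n2\nthree\nfour\n' > new.txt

# diff exits with status 1 when the files differ, which is expected here.
diff old.txt new.txt > normal.diff || [ $? -eq 1 ]

# Expected commands in the output:
#   2c2  change line 2 of old.txt into line 2 of new.txt
#   3a4  after line 3 of old.txt, add line 4 of new.txt
cat normal.diff
```

In each command the number on the left refers to the first file and the number on the right to the second, with c for a change and a for an addition (a deletion would use d).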

The simple diff output shown above has a number of limitations which prevent it from being used as a patch file in practice. The first problem is that it refers to a specific line number in the file. If other people are modifying their source code, or applying other patches, it's likely that the line numbers will have changed, and the patch may then fail to apply.

Another problem is that the diff output does not mention the name of the file, making it difficult to provide a patch that modifies more than one file. In fact, attempting to use the simple diff output as a patch will make the patch command fail with the error message "can't find file to patch at input line 1".

The correct way to make a working patch is by creating a context diff, which uses an output format that includes the name of each modified file and a few lines of context around the modified lines. The additional lines allow the patch to locate the changed line when other parts of the file have been added or deleted. Context diffs are made by adding the -c switch to the diff command:

$ cd ~/src/bash-2.05
$ diff -c mailcheck.c mymailcheck.c > mypatch.diff

Here is what a context diff looks like:

$ cat mypatch.diff
*** mailcheck.c 2003-12-18 16:09:55.000000000 +0000
--- mymailcheck.c       2003-12-18 16:10:07.000000000 +0000
***************
*** 399,405 ****
    /* If the mod time is later than the access time and the file
       has grown, note the fact that this is *new* mail. */
    if (use_user_notification == 0 && (atime < mtime) && file_is_bigger)
!     message = "You have new mail in $_";
  #undef atime
  #undef mtime

--- 399,405 ----
    /* If the mod time is later than the access time and the file
       has grown, note the fact that this is *new* mail. */
    if (use_user_notification == 0 && (atime < mtime) && file_is_bigger)
!     message = "New mail found in $_";
  #undef atime
  #undef mtime

The patch made from a context diff is larger, but the additional lines of context give the patch command a way to find the location of modified lines if their position in a file has changed. The lines of context also make it much simpler for the recipient to understand the impact of the changes. These two reasons combined mean that patches should always be distributed in context format.

Furthermore, in the context format the name of the file we're modifying is in the patch (on the first line), so the patch command will automatically know which file to update.

$ patch -i mypatch.diff
patching file mailcheck.c
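To see the context lines at work, the following sketch applies a context diff to a copy of the file whose line numbers have shifted. It assumes GNU diff and patch; the files original.txt, edited.txt and theirs.txt are hypothetical names chosen for the illustration.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# original.txt has ten numbered lines; edited.txt changes only line 5.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > original.txt
sed 's/^line 5$/line five/' original.txt > edited.txt
# diff exits with status 1 when the files differ, which is expected here.
diff -c original.txt edited.txt > change.diff || [ $? -eq 1 ]

# Someone else's copy has three extra lines at the top, so the line
# that the patch changes is now line 8 rather than line 5.
{ printf 'extra a\nextra b\nextra c\n'; cat original.txt; } > theirs.txt

# Name the file to patch explicitly, since theirs.txt is not mentioned
# in the patch headers.
patch -i change.diff theirs.txt
sed -n '8p' theirs.txt
```

The patch command should report that the change was applied at an offset, and line 8 of theirs.txt should now contain the edited text.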

These examples have dealt with single files, but most patches will need to alter multiple files, so we need to be able to make a patch containing all the differences between two directories. This is done by passing the --recursive, or -r, switch to diff. So, if I were making some changes to Bash 2.05, I'd start by making a copy of the directory, and make my changes in the new directory so that I still have a pristine copy of Bash 2.05 that I can use to generate my patch.

$ cd ~/src/
$ cp -a bash-2.05 bash-2.05-mychanges

(then alter the necessary files)

$ diff -cr bash-2.05 bash-2.05-mychanges > mychanges.diff

I will now have a context diff of all modifications I've made to Bash 2.05, and I can make mychanges.diff available for others to use.

There are several additional options to the diff command which are useful when preparing patches for multiple files: these include the ability to exclude certain directories or files, and the handling of files which have been added or deleted.
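As an illustration of two of these options in GNU diff, the sketch below uses -N (--new-file), which treats a missing file as empty so that added or deleted files appear in the patch, and -x (--exclude), which skips files whose names match a pattern. The directories tree-orig and tree-new and their contents are hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
mkdir tree-orig tree-new
printf 'int main(void) { return 0; }\n' > tree-orig/main.c
cp tree-orig/main.c tree-new/main.c
printf 'Release notes\n' > tree-new/NEWS      # a newly added file
printf 'binary junk\n' > tree-new/main.o      # a build product to skip

# -N includes the new file NEWS with its full contents;
# -x '*.o' leaves object files out of the comparison.
# diff exits with status 1 when it finds differences, which is expected.
diff -crN -x '*.o' tree-orig tree-new > tree.diff || [ $? -eq 1 ]
cat tree.diff
```

Without -N, diff would only note that NEWS exists in one of the two directories, and the resulting patch would not create the file.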

Before distributing a patch to other people you should always test it first. You can use the diff command to see if a patch has worked as expected. After applying mychanges.diff to a copy of your bash-2.05 directory, you can use diff -r to compare the newly patched version with your modified version.

Suppose you want to verify that applying mychanges.diff to bash-2.05 gives exactly the same tree as bash-2.05-mychanges. You can test this by making a fresh copy of the source code and applying the patch to it:

$ cp -a bash-2.05 bash-2.05-test
$ cd bash-2.05-test
$ patch -p1 -i ../mychanges.diff
patching file mailcheck.c
(and other modified files . . .)
$ cd ..
$ diff -cr bash-2.05-mychanges bash-2.05-test

If all went well with the patch command, there shouldn't be any differences and the final diff command will not output anything. If you're not interested in details about the differences, just whether or not the directories are different, add the -q or --brief option.
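The check can also be made from a script, since diff exits with status 0 when it finds no differences and 1 when it finds some. A minimal sketch of the whole test cycle follows, using hypothetical directories pristine, modified and test in place of the Bash trees.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
mkdir pristine
printf 'alpha\nbeta\ngamma\n' > pristine/data.txt
cp -R pristine modified
printf 'alpha\nBETA\ngamma\n' > modified/data.txt
# diff exits with status 1 when it finds differences, which is expected here.
diff -cr pristine modified > changes.diff || [ $? -eq 1 ]

# Apply the patch to a fresh copy of the pristine tree.
cp -R pristine test
(cd test && patch -p1 -i ../changes.diff)

# -q only reports whether files differ; the exit status carries the result.
if diff -qr modified test; then
    verdict="patch reproduces the modified tree"
else
    verdict="patched tree differs from the modified tree"
fi
echo "$verdict"
```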

Further Info

You should now have a basic understanding of the principles of making and applying patches. There are plenty of other options to the diff and patch commands that are outside the scope of this article -- these include options for ignoring spaces, tabs, and blank lines, merging files with preprocessor conditionals, controlling patch "fuzz factors" and selecting subsets of differences using regular-expression matching.

There are also several related command-line tools such as cmp, sdiff, and diff3 which can be used for more advanced file comparison. The diff3 command allows a 3-way comparison of files sharing a common ancestor -- this is useful in the situation where two people have independently modified a single file, and their changes must be reconciled.
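As a brief sketch of that situation, the following uses the -m (--merge) option of GNU diff3 to combine two independent edits of the same file. The files ancestor.txt, mine.txt and yours.txt are hypothetical, and the two edits are deliberately placed far apart so that they do not conflict.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n' > ancestor.txt
sed 's/^two$/TWO/' ancestor.txt > mine.txt     # first person's edit
sed 's/^six$/SIX/' ancestor.txt > yours.txt    # second person's edit

# diff3 takes the files in the order MINE OLDER YOURS; -m writes the
# merged text to standard output. The exit status is 0 when the merge
# has no conflicts and 1 when conflicts were found.
if diff3 -m mine.txt ancestor.txt yours.txt > merged.txt; then
    echo "merged cleanly"
else
    echo "conflicts (or an error); inspect merged.txt"
fi
cat merged.txt
```

When the two edits touch the same lines, diff3 cannot choose between them and marks the conflicting region in its output for manual resolution.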

More information about these options and commands can be found in the GNU manual "Comparing and Merging Files with GNU diff and patch" (ISBN 0-9541617-5-0). Printed copies are available from Network Theory Ltd.

This article is available under the GNU Free Documentation License: source code (.tar.gz)