Making and Applying Patches - A short introduction to GNU diff and patch
Ciaran O'Riordan, Stephen Compall, Brian Gough (Network Theory Ltd)
Introduction
The purpose of this article is to provide a tutorial introduction
to making and using patches with the GNU commands diff
and patch.
For more information see the printed manual "Comparing and Merging Files with GNU diff and patch" (ISBN 0-9541617-5-0).
A patch is a set of differences between two files or two sets of files, usually an original and a modified version of the original. The purpose of a patch is to allow others to modify the original file in the same way that you have modified it. Thus, anyone can apply the patch to their copy of the original, and will then have a file or set of files that is identical to your modified version.
Patches are the standard way for free software developers to share bug fixes and enhancements with other developers.
Applying Patches
Patches are applied with the patch command. To specify the
name of the patch to apply, patch can be invoked with the
switch --input, or -i, followed by the file name of
the patch. If no patch name is supplied, patch will read data
from standard input (which can be the output of another command sent
through a pipe |). Patches are usually stored with the extension
.diff and referred to as diff files, because they are
created with the diff command.
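The two ways of handing a patch to the patch command can be tried safely in a scratch directory. The following is a minimal sketch, not part of the Bash example: the file names greeting.txt, greeting.new, a.txt and b.txt are hypothetical, and the file to patch is named explicitly on the command line to keep the demonstration small.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical one-line file and an edited copy of it.
printf 'hello\n' > greeting.txt
printf 'hello, world\n' > greeting.new

# Record the difference (diff exits with status 1 when the files differ).
diff -c greeting.txt greeting.new > greeting.diff
gzip -c greeting.diff > greeting.diff.gz

# Two pristine copies to patch, one per invocation style.
cp greeting.txt a.txt
cp greeting.txt b.txt

# Style 1: name the patch file with -i (long form: --input).
patch -i greeting.diff a.txt

# Style 2: let patch read the patch from standard input through a pipe.
gunzip -c greeting.diff.gz | patch b.txt
```

Both copies should end up identical to greeting.new.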
When new versions of GNU packages are released, it is customary for the
maintainer to provide both a tar file of the new version and a patch
file that can be used to update the source-code of the previous version.
So, if you have the source code for GNU Bash 2.04, and want to
upgrade to Bash 2.05, instead of downloading the full tar file
bash-2.05.tar.gz, you can get the smaller patch
bash-2.04-2.05.diff.gz from ftp.gnu.org. For example, the
following commands show how to apply a patch to upgrade from Bash 2.04
to Bash 2.05, reading the patch directly from a compressed file through
a pipe:
$ ls
bash-2.04-2.05.diff.gz bash-2.04/
$ cd bash-2.04/
$ gunzip -c ../bash-2.04-2.05.diff.gz | patch -p1
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h
The same effect can be achieved with the -i switch to patch,
after uncompressing the patch file:
$ gunzip bash-2.04-2.05.diff.gz
$ cd bash-2.04/
$ patch -p1 -i ../bash-2.04-2.05.diff
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h
251 lines of "patching file XXX" messages are displayed. When the command
completes, your copy of bash will be identical to version 2.05. At
the end of the next section, I'll show a command that you can use to
verify this. After applying the patch, you'll probably want to rename
the directory from bash-2.04 to bash-2.05 to reflect the
changes.
The -p1 switch given to patch indicates that one directory
should be stripped from the start of each filename in the patch. The
filenames in the patch look like this,
$ more bash-2.04-2.05.diff
...
*** bash-2.04/CHANGES Tue Mar 14 11:40:08 2000
--- bash-2.05/CHANGES Tue Apr 3 10:33:50 2001
...
Since we started inside the directory bash-2.04 we need to strip
off one directory from these paths to give the correct relative location of
the file CHANGES in the current directory.
The need for this switch depends on how the patch was generated. Most
patches need -p1, but some don't, so if you get warnings or
errors when applying a patch, try applying it without the -p1
switch, or check the README file in the source distribution to
see if there is a mention of -p. If you're unsure of the
correct usage, be sure to keep a backup of any important files you are
working on.
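The effect of -p1 can be seen on a tiny pair of directories. This is only a sketch under assumed names: proj-1.0, proj-1.1 and the VERSION file are hypothetical stand-ins for the Bash trees.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Two hypothetical trees: an original and a modified copy.
mkdir proj-1.0 proj-1.1
printf 'version 1.0\n' > proj-1.0/VERSION
printf 'version 1.1\n' > proj-1.1/VERSION

# File names recorded in the patch carry the top-level directory,
# for example proj-1.0/VERSION and proj-1.1/VERSION.
diff -cr proj-1.0 proj-1.1 > proj-1.0-1.1.diff

# From inside the tree, -p1 strips the leading "proj-1.0/" component,
# so patch looks for VERSION in the current directory.
cd proj-1.0 || exit 1
patch -p1 -i ../proj-1.0-1.1.diff
cat VERSION
```

After the last command the file should read "version 1.1".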
If a patch is applied with an inappropriate -p option, parts of
it might be rejected, leaving you with a broken source-tree. You can
use the --dry-run option to make sure that the changes apply
without errors before modifying any of the files. Here, we try a dry
run on the file bash-2.04-2.05.diff:
$ patch --dry-run -p1 -i ../bash-2.04-2.05.diff
patching file CHANGES
patching file COMPAT
patching file CWRU/POSIX.NOTES
...
patching file y.tab.h
With the --dry-run option none of the files are actually
modified. The messages show that the patch can be applied without
problems. You can run the patch command without the --dry-run
option, and rest assured that all will go smoothly.
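A dry run can also be used as a guard in a script, since patch reports failure through its exit status. The sketch below assumes hypothetical files notes.txt and notes.new; it rehearses the patch, confirms the file is untouched, and only then applies it.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical file and an edited copy.
printf 'one\ntwo\nthree\n' > notes.txt
printf 'one\n2\nthree\n' > notes.new
diff -c notes.txt notes.new > notes.diff

# Record the contents, then rehearse the patch without writing anything.
before=$(cat notes.txt)
if patch --dry-run -i notes.diff notes.txt; then
    echo "dry run succeeded"
fi
after=$(cat notes.txt)

# The rehearsal should leave the file exactly as it was.
[ "$before" = "$after" ] && echo "notes.txt is unchanged"

# Only now apply the patch for real.
patch -i notes.diff notes.txt
```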
Making Patches
The next step is to make your own patches. This is incidental knowledge that developers pick up as they gain experience, so it is often assumed that everyone knows how to do it without being told.
Patches are made with the diff command. If you make a copy of
an existing file and edit it slightly, you can see how diff
works. For example, I've made a pointless one-line change in
mailcheck.c in the top-level of bash-2.05, and saved my modified
version as mymailcheck.c:
$ cd ~/src/bash-2.05
$ cp mailcheck.c mymailcheck.c
$ emacs mymailcheck.c (to edit the new file)
$ diff mailcheck.c mymailcheck.c > mychange.diff
$ cat mychange.diff
402c402
< message = "You have new mail in $_";
---
> message = "New mail found in $_";
The resulting output beginning with 402c402... means: go to
line 402 and take out the original line shown with <, and replace
it by the new version shown with >. I can then send the file
mychange.diff to other people and they can decide whether to make
the same changes or not.
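The same output format can be reproduced on a file small enough to check by eye. This sketch uses two hypothetical three-line files, old.txt and new.txt, that differ only on line 2.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

printf 'alpha\nbeta\ngamma\n' > old.txt
printf 'alpha\nBETA\ngamma\n' > new.txt

# diff exits with status 1 when the files differ; that is not an error.
diff old.txt new.txt > change.diff
cat change.diff
# prints:
# 2c2
# < beta
# ---
# > BETA
```

Here 2c2 says that line 2 of the first file is changed into line 2 of the second.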
The simple diff output shown above has a number of limitations
which prevent it being used as a patch file in practice. The first
problem is that it refers to a specific line number in the file. If
other people are modifying their source code, or applying other patches,
it's likely that the line numbers will have changed and the patch will
fail to apply.
Another problem is that the diff output does not mention the
name of the file, making it difficult to provide a patch that modifies
more than one file. In fact, attempting to use the simple diff output
as a patch will give the error message "can't find file to patch at
input line 1" from the patch command.
The correct way to make a working patch is by creating a context
diff, which uses an output format that includes the name of each
modified file and a few lines of context around the modified lines. The
additional lines allow the patch to locate the changed line
when other parts of the file have been added or deleted. Context diffs
are made by adding the -c switch to the diff command:
$ cd ~/src/bash-2.05
$ diff -c mailcheck.c mymailcheck.c > mypatch.diff
Here is what a context diff looks like,
$ cat mypatch.diff
*** mailcheck.c 2003-12-18 16:09:55.000000000 +0000
--- mymailcheck.c 2003-12-18 16:10:07.000000000 +0000
***************
*** 399,405 ****
  /* If the mod time is later than the access time and the file
  has grown, note the fact that this is *new* mail. */
  if (use_user_notification == 0 && (atime < mtime) && file_is_bigger)
! message = "You have new mail in $_";
  #undef atime
  #undef mtime
--- 399,405 ----
  /* If the mod time is later than the access time and the file
  has grown, note the fact that this is *new* mail. */
  if (use_user_notification == 0 && (atime < mtime) && file_is_bigger)
! message = "New mail found in $_";
  #undef atime
  #undef mtime
The patch made from a context diff is larger, and the additional lines
of context give the patch command a way to find the location
of modified lines if their position in a file has changed. The lines of
context also make it much simpler for the recipient to understand the
impact of the changes. These two reasons combined mean that patches
should always be distributed in context format.
Furthermore, in the context format the name of the file we're modifying
is in the patch (on the first line), so the patch command will
automatically know which file to update.
$ patch -i mypatch.diff
patching file mailcheck.c
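To see the context lines doing their job, a context diff can be applied to a copy of the file whose line numbers have shifted. This is a minimal sketch with hypothetical files (data.txt, data.new, theirs.txt); the file to patch is named explicitly because theirs.txt is not the name recorded in the patch.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical ten-line file and a copy with line 5 changed.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > data.txt
sed 's/^5$/five/' data.txt > data.new
diff -c data.txt data.new > data.diff

# Someone else's copy has gained two lines at the top, so the line
# to change now sits at line 7 rather than line 5.
{ printf 'header A\nheader B\n'; cat data.txt; } > theirs.txt

# patch uses the surrounding context to find the new position.
patch -i data.diff theirs.txt
sed -n '7p' theirs.txt
```

The final command should print "five", showing that the change landed on the right line even though the recorded line numbers no longer match.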
These examples have dealt with single files but most patches will need
to alter multiple files, so we need to be able to make a patch
containing all the differences between two directories. This is done
by passing the --recursive, or -r switch to
diff. So, if I were making some changes to Bash 2.05, I'd
start by making a copy of the directory, and make my changes in the
new directory so that I still have a pristine copy of Bash 2.05 that
I can use to generate my patch.
$ cd ~/src/
$ cp -a bash-2.05 bash-2.05-mychanges
(then alter the necessary files)
$ diff -cr bash-2.05 bash-2.05-mychanges > mychanges.diff
I will now have a context diff of all modifications I've made to Bash
2.05, and I can make mychanges.diff available for others to use.
There are several additional options to the diff command which
are useful when preparing patches for multiple files: these include the
ability to exclude certain directories or files, and the handling of
files which have been added or deleted.
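As an illustration of two such options in GNU diff, the sketch below uses -N (--new-file) so that a file present in only one tree is included in full, and -x (--exclude) to skip a directory of build output. The trees orig and mine, and the files in them, are hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical original tree and a modified copy.
mkdir orig mine mine/build
printf 'int main(void) { return 0; }\n' > orig/main.c
cp orig/main.c mine/main.c
printf 'generated\n' > mine/build/main.o    # build output, not wanted in the patch
printf 'New feature notes\n' > mine/NEWS    # exists only in the modified tree

# -r recurses, -N treats an absent file as empty so NEWS is included,
# -x build skips any file or directory whose base name is "build".
diff -crN -x build orig mine > mine.diff

# Applying the patch to a fresh copy of the original creates NEWS.
cp -R orig test
cd test || exit 1
patch -p1 -i ../mine.diff
```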
Before distributing a patch to other people you should always test it
first. You can use the diff command to see if a patch has
worked as expected. After applying mychanges.diff to a copy of
your bash-2.05 directory, you can use diff -r to
compare the newly patched version with your modified version.
Say you want to verify that mychanges.diff applied to
bash-2.05 definitely gives you the same tree as
bash-2.05-mychanges. You can test this by making a new copy of
the source code and applying the patch to it:
$ cp -a bash-2.05 bash-2.05-test
$ cd bash-2.05-test
$ patch -p1 -i ../mychanges.diff
patching file mailcheck.c
(and other modified files . . .)
$ cd ..
$ diff -cr bash-2.05-mychanges bash-2.05-test
If all went well with the patch command, there shouldn't be
any differences and the final diff command will not output
anything. If you're not interested in details about the differences,
just whether or not the directories are different, add the -q
or --brief option.
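Because diff exits with status 0 when nothing differs and 1 when something does, the check can be scripted. The following sketch compares two hypothetical directories, expected and patched, first when they match and then after a difference is introduced.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

mkdir expected patched
printf 'same\n' > expected/file.txt
printf 'same\n' > patched/file.txt

# Exit status 0 means the trees match, 1 means they differ.
if diff -qr expected patched; then
    result="identical"
else
    result="different"
fi
echo "$result"

# Introduce a difference: with -q (--brief) only the fact is reported.
printf 'changed\n' > patched/file.txt
diff -qr expected patched
# prints: Files expected/file.txt and patched/file.txt differ
```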
Further Info
You should now have a basic understanding of the principles of making
and applying patches. There are plenty of other options to the
diff and patch commands that are outside the scope
of this article -- these include options for ignoring spaces, tabs, and
blank-lines, merging files with preprocessor conditionals, controlling
patch "fuzz-factors" and selecting subsets of differences using
regular-expression matching.
There are also several related command-line tools such as cmp,
sdiff, and diff3 which can be used for more advanced
file-comparison. The diff3 command allows a 3-way comparison
of files sharing a common ancestor -- this is useful in the situation
where two people have independently modified a single file, and their
changes must be reconciled.
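A brief sketch of such a reconciliation with diff3 follows. The three files are hypothetical: ancestor.txt is the shared starting point, and alice.txt and bob.txt each change a different line, so the merge should complete without conflicts.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# The common ancestor and two independently edited copies.
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > ancestor.txt
printf 'LINE 1\nline 2\nline 3\nline 4\nline 5\n' > alice.txt
printf 'line 1\nline 2\nline 3\nline 4\nLINE 5\n' > bob.txt

# -m (--merge) writes the merged file to standard output.
# The argument order is: mine, older (the common ancestor), yours.
diff3 -m alice.txt ancestor.txt bob.txt > merged.txt
cat merged.txt
```

The merged file should carry both edits, "LINE 1" from the first copy and "LINE 5" from the second. When the two copies change the same lines, diff3 instead marks the conflicting region for manual resolution.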
More information about these options and commands can be found in the GNU manual "Comparing and Merging Files with GNU diff and patch" (ISBN 0-9541617-5-0). Printed copies are available from Network Theory Ltd.
This article is available under the GNU Free Documentation License.