Comparing and Merging Files with GNU diff and patch
by David MacKenzie, Paul Eggert, and Richard Stallman
Paperback (6"x9"), 120 pages
ISBN 0954161750
RRP £12.95 ($19.95)

"Well packaged... the quality of information is excellent" --- Linux User and Developer Magazine (Issue 36, Feb 2004)

1.1 Hunks

When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called hunks. Comparing two identical files yields one sequence of common lines and no hunks, because no lines differ. Comparing two entirely different files yields no common lines and one large hunk that contains all lines of both files. In general, there are many ways to match up lines between two given files. diff tries to minimize the total hunk size by finding large sequences of common lines interspersed with small hunks of differing lines.
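The two boundary cases above can be checked with a small sketch. It uses Python's standard difflib module rather than diff itself, and difflib's matching heuristic is not the one diff uses, so hunks may be grouped differently on other inputs. The helper name hunks is hypothetical and is introduced only for illustration.

```python
from difflib import SequenceMatcher


def hunks(old, new):
    """Return the groups of differing lines between two lists of lines.

    Each hunk is a pair (lines_from_old, lines_from_new). This relies on
    difflib's matcher, which is an assumption of this sketch and not the
    algorithm GNU diff implements.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # "equal" runs are the common lines, not hunks
    ]


# Identical files: one run of common lines and no hunks.
print(hunks(["a", "b", "c"], ["a", "b", "c"]))  # []

# Entirely different files: a single hunk holding every line of both.
print(hunks(["a", "b"], ["x", "y", "z"]))  # [(['a', 'b'], ['x', 'y', 'z'])]
```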

For example, suppose the file ‘F’ contains the three lines ‘a’, ‘b’, ‘c’, and the file ‘G’ contains the same three lines in reverse order ‘c’, ‘b’, ‘a’. If diff finds the line ‘c’ as common, then the command ‘diff F G’ produces this output:

1,2d0
< a
< b
3a2,3
> b
> a

But if diff notices the common line ‘b’ instead, it produces this output:

1c1
< a
---
> c
3c3
< c
---
> a

It is also possible to find ‘a’ as the common line. diff does not always find an optimal matching between the files; it takes shortcuts to run faster. But its output is usually close to the shortest possible. You can adjust this tradeoff with the --minimal option (see section 6 diff Performance Tradeoffs).
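To see why every choice of common line in the ‘F’ and ‘G’ example gives the same total hunk size, note that the fewest differing lines equals the combined length of both files minus twice the length of their longest common subsequence. The sketch below computes that bound with a textbook dynamic-programming routine. It is not the algorithm diff uses, and the function names lcs_length and minimal_hunk_lines are hypothetical.

```python
def lcs_length(old, new):
    """Length of the longest common subsequence of two lists of lines."""
    # prev[j] is the LCS length of the lines of old processed so far
    # and the first j lines of new.
    prev = [0] * (len(new) + 1)
    for line in old:
        cur = [0]
        for j, other in enumerate(new, start=1):
            if line == other:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def minimal_hunk_lines(old, new):
    """Fewest lines that must appear in hunks for any matching."""
    return len(old) + len(new) - 2 * lcs_length(old, new)


F = ["a", "b", "c"]
G = ["c", "b", "a"]
print(lcs_length(F, G))          # 1
print(minimal_hunk_lines(F, G))  # 4
```

Only one line can be common to ‘F’ and ‘G’ at a time, so each of the three possible matchings leaves four differing lines, and all three are equally short.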
