Re: parse unix-style difference reporting
From: Jonathan Leffler (jleffler_at_earthlink.net)
Date: 12/30/03
- In reply to: Liang: "parse unix-style difference reporting"
Date: Tue, 30 Dec 2003 07:53:19 GMT
Liang wrote:
> I want to diff two files or two versions of one file, and parse the output
> to find a summary of how many lines of replacement/addition/deletion in the
> two files.
>
> Known from diff/cleardiff, the output has a style like:
> 15a16, 15,17d3, 18c19,21 etc.
>
> Does anyone know how to parse this output to generate a summary?
It isn't very hard to work it out, is it?
Each item conceptually has four numbers and an operation code:
N1,N2 op N3,N4
When there is just one number on one side of the operation, the values
N1 and N2, or N3 and N4, are the same.
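As a rough sketch of that decomposition (Python is my choice here, and the names CHANGE_RE and parse_change are hypothetical, made up for illustration), something like the following would split a change command into its four numbers and operation code. It assumes the normal diff output format shown in the question, and fills in N2 or N4 when only one number is given on that side:

```python
import re

# Matches change commands such as "15a16", "15,17d3" or "18c19,21".
# The second number on each side is optional.
CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change(line):
    """Return (n1, n2, op, n3, n4), or None if the line is not a change command."""
    m = CHANGE_RE.match(line.strip())
    if m is None:
        return None  # content lines ("< ...", "> ...", "---") end up here
    n1, n2, op, n3, n4 = m.groups()
    n1 = int(n1)
    n3 = int(n3)
    # A single number on one side means the range starts and ends there.
    n2 = int(n2) if n2 is not None else n1
    n4 = int(n4) if n4 is not None else n3
    return n1, n2, op, n3, n4


for cmd in ["15a16", "15,17d3", "18c19,21"]:
    print(cmd, parse_change(cmd))
```

Lines that are not change commands return None, so the same function can be used to skip the text lines that diff prints between commands.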
Inserts are easy: there's always a single number on the LHS, and the
number of lines inserted is N4-N3+1.
Similarly, deletes are easy: there's always a single number on the RHS
of the operator, and the number of lines deleted is N2-N1+1.
The number of lines replaced has two parts: the number of lines
removed and the number of lines replacing them. Depending on your
viewpoint, you can either count the two values separately (number
removed NR = N2-N1+1, number inserted NI = N4-N3+1), or you can be
cleverer about the calculation and decide that when NR > NI, you have
NI changed lines and NR-NI deleted lines, and that when NR < NI, you
have NR changed lines and NI-NR inserted lines. When NR = NI, you have
NR (or NI) changed lines, of course.
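The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name summarize and the result keys are hypothetical, the input is assumed to be normal-format diff output, and it uses the "cleverer" viewpoint in which a change of NR lines into NI lines counts min(NR, NI) changed lines plus the surplus as deletions or insertions.

```python
import re

# Change commands look like "15a16", "15,17d3" or "18c19,21".
CHANGE_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def summarize(diff_lines):
    """Tally changed, inserted and deleted lines from normal-format diff output."""
    totals = {"changed": 0, "inserted": 0, "deleted": 0}
    for line in diff_lines:
        m = CHANGE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # skip content lines ("< ...", "> ...", "---")
        n1, n2, op, n3, n4 = m.groups()
        n1, n3 = int(n1), int(n3)
        n2 = int(n2) if n2 is not None else n1
        n4 = int(n4) if n4 is not None else n3
        nr = n2 - n1 + 1  # lines on the left-hand side
        ni = n4 - n3 + 1  # lines on the right-hand side
        if op == "a":
            totals["inserted"] += ni
        elif op == "d":
            totals["deleted"] += nr
        else:  # op == "c"
            common = min(nr, ni)
            totals["changed"] += common
            totals["deleted"] += nr - common
            totals["inserted"] += ni - common
    return totals


# Illustrative input built from the commands quoted in the question.
sample = [
    "15a16", "> x",
    "15,17d3", "< a", "< b", "< c",
    "18c19,21", "< old", "---", "> new1", "> new2", "> new3",
]
print(summarize(sample))  # {'changed': 1, 'inserted': 3, 'deleted': 3}
```

To count removed and inserted lines separately instead, the "c" branch would simply add nr to the deleted total and ni to the inserted total.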
That took me five minutes to think and type - how long would it have
taken you to do it? (And cross-posted too?)
-- 
Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler@earthlink.net, jleffler@us.ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/