Re: file and record formats

From: Bill Gunshannon (bill_at_cs.uofs.edu)
Date: 01/14/04


Date: 14 Jan 2004 02:57:23 GMT

In article <EkUjtH96u9K+@eisner.encompasserve.org>,
        briggs@encompasserve.org writes:
> In article <bu1lvv$cia25$1@ID-135708.news.uni-berlin.de>, bill@gw5.cs.uofs.edu (Bill Gunshannon) writes:
>> In article <9ZNzQcf2SLnu@eisner.encompasserve.org>,
>> briggs@encompasserve.org writes:
>>> If your Unix diff utility is looking at raw file data then your Unix
>>> diff utility is broken.
>>
>> No more than your hammer is broken if it doesn't properly put in screws.
>> All files in Unix are raw. diff is just the wrong tool if you are trying
>> to compare non-Unix text files.
>
> No. His diff utility is broken. It's treating a VMS file as a stream of
> bytes. VMS files are not streams of bytes. They are streams of records.

I agree, his diff is broken, but that has nothing to do with the Unix diff
command, which was designed to compare records in a text file in the format
Unix uses for text files. That means lines of ASCII text terminated with
the newline character. Anything else is not a text file and not food for
the diff command.

>
> The Unix diff utility is an interesting example of a record oriented
> utility acting on a byte stream data source.

All Unix files are byte streams. The only thing that makes a file a
text file is the character set it contains and (assuming it is supposed
to be record oriented) the presence of newline characters.
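
To make that concrete, here is a minimal sketch in Python of how a
line-oriented tool can derive "records" from a plain byte stream. The
function name newline_records and the sample data are my own illustration,
not anything taken from the diff sources, and the comparison is a naive
position-by-position check rather than a real diff algorithm.

```python
def newline_records(data: bytes) -> list[bytes]:
    """Split a raw byte stream into records, using newline as the only delimiter."""
    records = data.split(b"\n")
    # A stream that ends with a newline leaves one empty trailing piece; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


# Hypothetical sample data: two small "text files" held in memory.
old = b"alpha\nbeta\ngamma\n"
new = b"alpha\nBETA\ngamma\n"

# Naive comparison: report the 1-based numbers of records that differ.
changed = [
    number
    for number, (a, b) in enumerate(
        zip(newline_records(old), newline_records(new)), start=1
    )
    if a != b
]
print(changed)
```

Nothing in the stream itself marks a record. The structure exists only
because the reader agrees to treat the newline byte as a boundary.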

>
> On VMS, a properly written diff utility should look for record boundaries,
> not newline characters. But if you're going for a simple port, using
> the C RTL and examining the virtual data stream for newlines is a
> reasonable way to proceed. Examining the raw on-disk data for newlines
> is flat wrong.

True also, but that isn't Unix's fault. Using diff to try to compare
binary (non-ASCII) files on a Unix box will also produce some rather
strange results, especially depending on the terminal you're using. :-)
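
A rough sketch in Python of why scanning raw record data for newlines goes
wrong. The layout here is an assumption made purely for illustration: each
record is stored as a 2-byte little-endian length followed by its payload.
It is not meant to be an exact description of the VMS on-disk format, and
pack_records and unpack_records are hypothetical helper names.

```python
import struct


def pack_records(records):
    """Assumed layout: 2-byte little-endian length, then the record payload."""
    return b"".join(struct.pack("<H", len(r)) + r for r in records)


def unpack_records(raw):
    """Walk the length prefixes to recover the records."""
    records, pos = [], 0
    while pos < len(raw):
        (length,) = struct.unpack_from("<H", raw, pos)
        pos += 2
        records.append(raw[pos:pos + length])
        pos += length
    return records


# The third record is 10 bytes long, so its length prefix is b"\n\x00".
raw = pack_records([b"alpha", b"beta", b"0123456789"])

naive = raw.split(b"\n")      # treats the raw bytes as Unix text
proper = unpack_records(raw)  # honours the record boundaries

print(len(naive), naive)
print(len(proper), proper)
```

The naive split finds a "line break" inside a length field and leaves the
other length bytes embedded in the data, while the record-aware reader
gets the three records back intact.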
 
bill

-- 
Bill Gunshannon          |  de-moc-ra-cy (di mok' ra see) n.  Three wolves
bill@cs.scranton.edu     |  and a sheep voting on what's for dinner.
University of Scranton   |
Scranton, Pennsylvania   |         #include <std.disclaimer.h>   

