Re: Request for Comments: libarchive, bsdtar

From: Tim Kientzle (kientzle_at_acm.org)
Date: 01/15/04

  • Next message: Vladimir Dozen: "Re: Request for Comments: libarchive, bsdtar"
    Date: Wed, 14 Jan 2004 17:23:41 -0800
    To: Tim Robbins <tjr@freebsd.org>
    
    

    Tim Robbins wrote:
    > On Tue, Jan 13, 2004 at 09:31:49PM -0800, Tim Kientzle wrote:
    >
    >>Request for Comments: libarchive, bsdtar
    >>
    >>Add "libarchive" to the tree, prepare to change the system
    >>tar command to "bsdtar" once it is sufficiently stable.
    >
    > [...]
    >
    > Let me start by thanking you for working on replacing GNU utilities with
    > higher quality and less restrictively licensed alternatives. I haven't
    > had time to read over the code very thoroughly, but I have a few initial
    > comments:

    Thanks for the feedback. A lot of people rely on 'tar',
    so I want to make sure it's well-tested and does what
    people really need before it becomes the default. When
    you do have time to look over the code, please let me
    know what you think.

    > - Padding gzip'd tar archives (with bsdtar czf) causes gzip to report
    > "trailing garbage" and fail, and in turn this causes GNU tar to fail.

    Oddly, GNU tar does successfully and correctly extract the archive,
    and then exits with an error code. There's an easy one-line
    patch that fixes this bug in GNU tar, by the way. ;-)

    > BSD pax (-wzf) and GNU tar (czf) do not pad compressed archives.

    The issue here is correct blocking for devices that require it
    (e.g., tape drives, floppies). libarchive correctly blocks all
    output, regardless of whether or not it is compressed. Neither
    GNU tar nor BSD pax guarantees this.
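
To make "blocking" concrete, here is a minimal sketch in C of a reblocking writer: it accepts writes of any size, hands only full blocks to a sink, and zero-pads the last block on close. This is an illustration, not libarchive's actual code. The names `reblocker`, `block_sink`, and `count_sink` are hypothetical, and the 512-byte `BLOCK_SIZE` is chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512  /* illustrative value, not libarchive's setting */

/* Hypothetical sink: called with exactly one full block at a time. */
typedef int (*block_sink)(void *ctx, const unsigned char *block, size_t len);

struct reblocker {
    unsigned char buf[BLOCK_SIZE];
    size_t used;        /* bytes currently buffered */
    block_sink sink;
    void *ctx;
};

static void reblocker_init(struct reblocker *r, block_sink sink, void *ctx)
{
    r->used = 0;
    r->sink = sink;
    r->ctx = ctx;
}

/* Accept input of any size; emit only full blocks. Returns 0 on success. */
static int reblocker_write(struct reblocker *r, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(r->sink != NULL);
    while (len > 0) {
        size_t n = BLOCK_SIZE - r->used;    /* room left in the buffer */
        if (n > len)
            n = len;
        memcpy(r->buf + r->used, p, n);
        r->used += n;
        p += n;
        len -= n;
        if (r->used == BLOCK_SIZE) {
            if (r->sink(r->ctx, r->buf, BLOCK_SIZE) != 0)
                return -1;
            r->used = 0;
        }
    }
    return 0;
}

/* Zero-pad and emit the final partial block, if there is one. */
static int reblocker_close(struct reblocker *r)
{
    if (r->used == 0)
        return 0;
    memset(r->buf + r->used, 0, BLOCK_SIZE - r->used);
    r->used = 0;
    return r->sink(r->ctx, r->buf, BLOCK_SIZE);
}

/* Example sink: only totals the bytes it is handed. */
static int count_sink(void *ctx, const unsigned char *block, size_t len)
{
    (void)block;
    *(size_t *)ctx += len;
    return 0;
}
```

Because the padding is applied after compression in a scheme like this, the trailing zero bytes sit outside the gzip stream, which is what produces the "trailing garbage" report quoted above.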

    It goes a bit deeper in the case of libarchive. By design,
    libarchive knows nothing about the archive storage. This means
    there is no simple way for it to vary its operation depending
    on whether it's writing to a file or character device, unlike
    monolithic programs such as GNU tar or BSD pax.

    I have some ideas about how to change this by generalizing the
    blocking calculations within libarchive and providing some
    client hooks for finer control over the blocking, but I haven't
    decided whether or not it's worth the effort.

    Somehow, though, I doubt you'll be the last person to complain about
    this ;-), so I'll start looking for a good way to change this
    behavior.

    > - I would prefer it if compression was done by opening a pipe to gzip/bzip2
    > instead of using libz/libbz2. This would make things simpler, and make it
    > easier to support compress(1).

    Not really simpler for the library, and definitely not simpler
    for clients of the library.

    This is related to the blocking issue I mentioned just above.
    In order to correctly block the output, you need to collect the
    output of the compression program and reblock it. An early version
    of libarchive did exactly this, forking a three-stage pipeline with
    the compression/decompression program in the middle. Unfortunately,
    this created some odd problems, as the archive I/O then occurred
    in a separate process from the rest of the program. For example,
    this made it difficult for clients to monitor the I/O status
    from their mainline code, and hampered proper error reporting.

    It also seemed inappropriate for a library to be invoking
    client-provided callbacks in a different process.

    However, each compression type is handled in a cleanly-factored
    code module, and I do still have the code in my personal CVS repo to
    fork out the pipeline. I could resurrect this to fork compress(1)
    if there's real demand.

    > - I don't think the URL/libfetch support belongs in a library that deals
    > with archives. Perhaps the interface could be changed so that the
    > caller could pass a FILE * or a file descriptor instead of a filename.

    The libfetch tie-in (archive_read_open_url) is provided purely
    for the convenience of simple clients. If you don't like it,
    don't use it. It is completely optional. Generally, I've gone
    to a great deal of effort to minimize link pollution. For
    example, if you don't call the functions that handle gzip/bzip2
    compression, they won't be linked in and neither will libz/libbz2.
    Similar comments apply to the various format support functions.
    I've even carefully separated archive reading and writing
    in case you only want to use one of them.

    As for I/O interfaces, the core archive_read_open and
    archive_write_open functions accept a collection of function
    pointers that the library will invoke for open/read/write/close
    operations on the archive. This is considerably more
    flexible than FILE * or file descriptors.
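
As a rough sketch of the callback style described above, the C fragment below defines a set of open/read/close function pointers and a stand-in "library" routine that drains a source through them. Everything here is hypothetical and for illustration only: `io_callbacks`, `mem_source`, and `consume_all` are invented names, and the signatures are not claimed to match archive_read_open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback set; names and signatures are illustrative. */
struct io_callbacks {
    int  (*open)(void *client_data);
    /* Returns bytes read, 0 at end of input, negative on error. */
    long (*read)(void *client_data, void *buf, size_t len);
    int  (*close)(void *client_data);
};

/* Example client: an in-memory source instead of a file or socket. */
struct mem_source {
    const char *data;
    size_t size;
    size_t pos;
};

static int mem_open(void *client_data)
{
    ((struct mem_source *)client_data)->pos = 0;
    return 0;
}

static long mem_read(void *client_data, void *buf, size_t len)
{
    struct mem_source *s = client_data;
    size_t left = s->size - s->pos;

    if (len > left)
        len = left;
    memcpy(buf, s->data + s->pos, len);
    s->pos += len;
    return (long)len;
}

static int mem_close(void *client_data)
{
    (void)client_data;
    return 0;
}

/* Stand-in for the library side: it knows nothing about the storage
 * and reaches it only through the callbacks. Returns the total number
 * of bytes read, or -1 on error. */
static long consume_all(const struct io_callbacks *cb, void *client_data)
{
    char buf[16];
    long total = 0, n;

    assert(cb && cb->open && cb->read && cb->close);
    if (cb->open(client_data) != 0)
        return -1;
    while ((n = cb->read(client_data, buf, sizeof buf)) > 0)
        total += n;
    /* Always close, even if the last read reported an error. */
    if (cb->close(client_data) != 0 || n < 0)
        return -1;
    return total;
}
```

A client that wants FILE * or file-descriptor behavior can supply callbacks that wrap fread(3) or read(2); the reverse is not possible, which is the sense in which the callback form is the more general one.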

    Not to mention that passing file descriptors has
    some tricky implications if the library forks to run
    archive I/O in a separate process. FILE * is simply
    a bad idea because the stdio interface doesn't provide client
    control over blocking. (Yes, the libfetch convenience
    hooks do use FILE *, but blocking is unimportant
    for sockets, so that's okay.)

    > - Filenames are too long :-)

    Take a typing class. ;-)

    _______________________________________________
    freebsd-arch@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-arch
    To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"

