Re: diff mishandling symlinks



On Tue, 22 May 2007 00:54:03 +0200 Stefaan A Eeckels <hoendech@xxxxxx> wrote:
| On 21 May 2007 12:50:09 GMT
| phil-news-nospam@xxxxxxxx wrote:
|
|> On Mon, 21 May 2007 00:55:42 +0200 Stefaan A Eeckels
|> <hoendech@xxxxxx> wrote:
|>
|> | Your understanding of symbolic links is incorrect. Unless
|> | specifically requested, references to a symbolic link return the
|> | data and metadata of the file it points to. A command like "ls"
|> | distinguishes between symbolic and hard links because it is, after
|> | all, a directory lister.
|>
|> I don't think so. I believe I understand symlinks exactly.
|
| Of course you would :)
|
|> BTW, a symlink reference to a nonexistent file, that may, or may not,
|> come to exist later on, is a valid symlink.
|
| Not germane to the case. We're not talking about whether symbolic links
| are valid or not, but if one can derive useful information from
| comparing the metadata of the symbolic links.

The metadata of symlinks is not an issue I brought up. Who thinks I
want to get things like the date a symlink was created? My interest
is in the path it references.


| The OS doesn't allow you to manipulate the metadata of the symbolic
| link to the same extent as that of other files. As I did show,
| duplicating a directory that contains symbolic links changes the
| metadata of the link because the OS doesn't allow you to set the time
| fields as you can with other files. Hence, the only valid metadata that
| can usefully be compared is that of the files the symbolic links point
| to.

Again, why are you focusing on time fields? What about the link
reference string?
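
To make concrete what I mean by the reference string, here is a minimal
Python sketch. The file names are made up for illustration. It shows two
symlinks whose stored strings differ even though both resolve to the same
file, so comparing the strings is a different question from comparing
what the links resolve to.

```python
import os
import tempfile


def link_reference(path):
    """Return the reference string stored in the symlink itself."""
    return os.readlink(path)


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target.txt")   # hypothetical file name
    with open(target, "w") as f:
        f.write("data\n")

    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    os.symlink("target.txt", a)      # relative reference
    os.symlink("./target.txt", b)    # different string, same file

    # The strings stored in the two links are not equal ...
    refs_differ = link_reference(a) != link_reference(b)
    # ... yet following both links reaches the same file.
    same_file = os.path.samefile(a, b)
    print(refs_differ, same_file)
```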


|> Symlinks also can form a valid file selection strategy by pointing to
|> places where other files or directories may, or may not exist. In the
|> case of directories, they can contain more symlinks, creating a complex
|> strategy that allows someone who did not create the original tree of
|> files to easily insert variations on the strategy without modifying
|> the original pristine tree.
|
| Granted, but that does not mean that the metadata of the symbolic links
| (whether pointing to existing or non-existing files) conveys similar
| information to that of other files. The size of the file is relevant as
| it indicates the length of the reference. Still, different size
| references can point to the same file. The mtime reflects the creation
| time of the symbolic link, but unlike other files, the contents of the
| link cannot be modified (that is, you cannot open() the symbolic link
| file and modify the pointer, as all open() calls concern the target of
| the link).

You can replace the link with one that has the changed reference. This
can even be done atomically by first creating a temporary symlink with a
different name, and calling rename(2) to replace the target symlink with
the new one.

It will have the current date, but so what?
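
A rough sketch of that temporary-symlink-plus-rename approach, in Python.
The helper name replace_symlink and the temporary naming scheme are my
own inventions for illustration. A careful version would pick a
collision-proof temporary name.

```python
import os
import tempfile


def replace_symlink(link_path, new_target):
    """Repoint link_path at new_target via a temporary symlink and rename(2)."""
    directory = os.path.dirname(link_path) or "."
    # Hypothetical temp-name scheme; it must live on the same filesystem
    # as link_path for rename(2) to work.
    tmp_path = os.path.join(
        directory, ".tmp-%d-%s" % (os.getpid(), os.path.basename(link_path))
    )
    os.symlink(new_target, tmp_path)
    try:
        # rename(2) replaces the old directory entry in one step, so other
        # processes see either the old link or the new one, never neither.
        os.rename(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as d:
    link = os.path.join(d, "current")
    os.symlink("release-1", link)          # targets need not exist
    replace_symlink(link, "release-2")
    result = os.readlink(link)
    print(result)
```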


|> Symlinks can show relationships between files. For example, consider
|> a program that when compiled would have alternative command names as
|> symlinks in the resultant installed binaries. This can be coded as
|> a series of steps to install it in a Makefile. But with a dynamic
|> Makefile generator, that information has to be stored elsewhere in
|> some other form. Storing it as symlinks to source files is one such
|> way that makes sense and can work (it does work, in fact, when using
|> a Makefile generator that is designed to understand it). For example
|> if the desired results are:
|>
|> /usr/local/bin/bar (regular file with an executable binary)
|> /usr/local/bin/foo -> b (alternate way to execute same binary)
|>
|> This could be "defined" in the source file space like:
|>
|> bar.c (regular file with the program main function)
|> foo.c -> bar.c (alternate command name for same program)
|>
|> The semantics would be different than the semantics of having two
|> regular files named bar.c and foo.c with identical contents.
|
| But this could equally be implemented with a hard link - as a matter of
| fact, some commands are (or were) implemented to use the name by which
| they were invoked to present different functionality (mv and cp come to
| mind). In this case the metadata of the symbolic link (which in any
| case is restricted to the date it was created and the user) do not
| contribute to the functionality you describe. In almost all cases the
| use of one or other type of link is determined by whim or the fact that
| one has to cross file systems, or create links to directories (things
| that cannot be done with hard links).

Show me how to make a dangling hardlink that refers to a place where
an optional file may, or may not, exist ... across different filesystems.

And you're still stuck on the symlink date.
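
For anyone who wants to try the dangling case, a small Python sketch
follows. The path names are hypothetical. It stays on one filesystem, so
it only shows the "may, or may not, exist" half of the challenge and not
the cross-filesystem half.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    missing = os.path.join(d, "optional.conf")   # does not exist yet

    # A symlink only stores the path string, so this succeeds.
    sym = os.path.join(d, "sym")
    os.symlink(missing, sym)
    dangling = os.path.islink(sym) and not os.path.exists(sym)

    # A hard link needs an existing inode to attach to, so this fails.
    try:
        os.link(missing, os.path.join(d, "hard"))
        hard_link_created = True
    except FileNotFoundError:
        hard_link_created = False

    # Once the optional file appears, the same symlink resolves to it.
    with open(missing, "w") as f:
        f.write("present\n")
    resolves_now = os.path.exists(sym)

    print(dangling, hard_link_created, resolves_now)
```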


|> | "touch" OTOH doesn't have an option to make it affect the symbolic
|> | link instead of the file pointed to, so there is no easy way to
|> | change the metadata of the symbolic link itself. In the same vein,
|> | you cannot use "chmod" to change the privileges of the symbolic
|> | link itself (i.e. you cannot create a link that either restricts or
|> | augments the access rights of the file it points to). The same
|> | applies to access control lists.
|>
|> I haven't needed to do that. I am aware of that limitation, and I do
|> think it could potentially be a logistic problem, but to date I have
|> not encountered any such problem.
|
| The point is that on a Unix system you _cannot_ do this, due to the way
| symbolic links are implemented.

Cannot do what? I'm not the one asking to "touch" a symlink to bump it
to the current date (but that can be faked) or some other date (cannot
be faked w/o internal f/s access).


|> | but there is no way in which tar can recreate the "same" symbolic
|> | link (at least not on Solaris and Linux).
|>
|> So. It's simply as if a symlink has no time metadata, other than as
|> residual metadata usable perhaps for auditing purposes.
|
| It has time metadata, but unless you would change the system time
| before (and set it back after) the creation of the link, you cannot
| modify it as you can with other types of files. Neither can you modify
| the permission fields, or assign acls - doing so changes the
| permissions and acls on the file the symbolic link points to.

I'm not wanting to modify the date of a symlink.


|> | A symbolic link behaves to all intents and purposes and as far as
|> | possible, the same as a hard link to a file.
|>
|> No it doesn't. If I swap names on the target file and some other, the
|> symlink still refers to whatever has the same name (now a different
|> file). A hard link would have maintained the association with the same
|> object itself (which after such a swap would have a different other
|> name).
|
| That is what I meant by "as far as possible". The objective of the
| symbolic link is to point to a file _by name_, so you describe intended
| behaviour. It also is a separate file (and hence an inode) and not
| merely a directory entry like a hard link. But the symbolic link, just
| as a hard link, gives you nothing more than another name to access a
| file. Neither type of link adds metadata to the original file. And
| because the metadata of the symbolic link cannot be changed, it cannot
| convey the same information as other types of files.

It does give you a referencer/referee relationship (a hardlink does not).
You can rename(2) other files to and from the referenced name, and the
symlink keeps referring to whatever is at that name. A symlink does have
different semantics than a hardlink. Each has its uses.
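
The name-swap case can be sketched in Python as below. The file names and
contents are placeholders. After the swap, the symlink reads whatever now
sits at the name, while the hard link still reads the original object.

```python
import os
import tempfile


def read(path):
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    with open(a, "w") as f:
        f.write("A")
    with open(b, "w") as f:
        f.write("B")

    sym = os.path.join(d, "sym")
    hard = os.path.join(d, "hard")
    os.symlink("a", sym)    # refers to the *name* "a" in this directory
    os.link(a, hard)        # refers to the *object* currently named "a"

    # Swap the names of the two files with three rename(2) calls.
    tmp = os.path.join(d, "tmp")
    os.rename(a, tmp)
    os.rename(b, a)
    os.rename(tmp, b)

    via_symlink = read(sym)     # follows the name "a", now the other file
    via_hardlink = read(hard)   # still the original object
    print(via_symlink, via_hardlink)
```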


|> | You will notice (if you care to try) that "diff" equally doesn't
|> | show a difference if you replace a symbolic link by a hard link. It
|> | deals with content only.
|>
|> I did notice. It clearly is not a tool that goes beyond what content
|> of regular files is. It effectively loses some metadata. Its
|> usability therefore is limited to only a set of regular files.
|
| Which is what it was designed for. You could have it manipulate
| metadata, but not that of symbolic links because that is not supported
| by the OS. So your original premise - that diff/patch are "mishandling"
| symbolic links - is incorrect. What is the case is that you expect too
| much from Unix symbolic links.

I know what I am getting from symlinks and I do get what I need.
Maybe I am expecting too much from diff and patch.
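
What I would want from such a tool can be approximated outside of diff.
The Python sketch below is only an illustration, not anything diff or
patch does. The function names symlink_map and diff_symlinks are
made up. It compares the reference strings of symlinks in two trees and
ignores dates entirely. A path that is a symlink in one tree and a
regular file in the other simply shows up as None on the non-symlink
side.

```python
import os
import tempfile


def symlink_map(root):
    """Map each symlink's path (relative to root) to its reference string."""
    links = {}
    # os.walk does not descend into symlinked directories by default;
    # such links are listed in dirnames, dangling ones in filenames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                links[os.path.relpath(full, root)] = os.readlink(full)
    return links


def diff_symlinks(old_root, new_root):
    """Return (path, old_reference, new_reference) for every differing symlink."""
    old, new = symlink_map(old_root), symlink_map(new_root)
    changes = []
    for rel in sorted(old.keys() | new.keys()):
        if old.get(rel) != new.get(rel):
            changes.append((rel, old.get(rel), new.get(rel)))
    return changes


with tempfile.TemporaryDirectory() as d:
    old_root = os.path.join(d, "old")
    new_root = os.path.join(d, "new")
    os.mkdir(old_root)
    os.mkdir(new_root)
    os.symlink("bar.c", os.path.join(old_root, "foo.c"))
    os.symlink("baz.c", os.path.join(new_root, "foo.c"))   # retargeted
    os.symlink("bar.c", os.path.join(new_root, "qux.c"))   # added
    changes = diff_symlinks(old_root, new_root)
    for change in changes:
        print(change)
```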


|> | As to patch - given that the underlying OS makes it impossible to
|> | manipulate the symbolic link metadata, there is no reason patch (or
|> | tar, for that matter) should attempt to do so.
|>
|> You can change what the symlink points to. That may be implemented
|> as an unlink and re-create. But it is doable and works.
|
| You can create and delete symbolic links. You can (through the -h
| option on Solaris at least) change the owner of the symbolic link. You
| cannot change its contents (deleting and creating another symbolic link
| doesn't qualify for "change what the symlink points to" as it is a
| different file). They are not regular files.

Creating a new symlink accomplishes the goal of changing where a symlink
points to. If that (change where it points to) is what needs to be done,
it can be done. It might not be the most elegant way but it can be done.
And with rename(2) an atomic change can also be accomplished where that
might be needed.


| In short, comparing metadata of symbolic links in different directories
| will only confirm that they are different directories.

So.


|> BTW, you can also create hardlinks between symlinks. Ever tried that?
|
| A symbolic link is an inode (with special contents). You can create
| hard links to inodes, simply because the original name is also a hard
| link. This is the way the Unix file system works.

Symlinks can point to directories, too. Try that with a hardlink (hint:
there are ways on some systems, but it tends to make a mess of things).
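
For the curious, here is a Python sketch of both remarks: a hardlink made
to a symlink itself, and a symlink versus a hardlink to a directory. It
assumes a Linux-like system where os.link accepts follow_symlinks=False.
On platforms without that support the call raises NotImplementedError.
All names are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # A hard link to the symlink's own inode, not to what it points at.
    sym = os.path.join(d, "sym")
    os.symlink("some/target", sym)                # target need not exist
    alias = os.path.join(d, "alias")
    os.link(sym, alias, follow_symlinks=False)

    alias_is_symlink = os.path.islink(alias)
    same_inode = os.lstat(sym).st_ino == os.lstat(alias).st_ino
    link_count = os.lstat(sym).st_nlink
    alias_ref = os.readlink(alias)

    # A symlink to a directory works; a hard link to one is refused.
    subdir = os.path.join(d, "dir")
    os.mkdir(subdir)
    dir_sym = os.path.join(d, "dir_sym")
    os.symlink("dir", dir_sym)
    dir_symlink_ok = os.path.isdir(dir_sym)
    try:
        os.link(subdir, os.path.join(d, "dir_hard"))
        dir_hardlink_created = True
    except OSError:
        dir_hardlink_created = False

    print(alias_is_symlink, same_inode, link_count, alias_ref)
    print(dir_symlink_ok, dir_hardlink_created)
```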

--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2007-05-21-1927@xxxxxxxx |
|------------------------------------/-------------------------------------|