Re: "-o logging": important? Why defaulted OFF by Sun?

From: Logan Shaw (lshaw-usenet_at_austin.rr.com)
Date: 01/29/04


Date: Thu, 29 Jan 2004 07:39:00 GMT

David Combs wrote:
> (Before I go to google) What's "fastfs"? No manpage in s9.
>
> Is fastfs something that *lots* (even most of the "knowledgeable" (sp?)
> solaris admins) use? Or at least know about?)

It's less useful these days, now that logging exists.
But it still sometimes can be useful.

Here's the scoop: filesystems need to write certain data in
a certain order to improve the odds of things being
readable if the power is lost, etc. Also, APIs guarantee
that at certain points, things will have been written to
disk, placing further restrictions on when things have to be
written.

But disks perform better if you don't have to seek around
from place to place. If you could bunch up a whole bunch
of writes and then do them in optimal order for the disk
head's purposes, you'd get much better performance. But
you would have a less resilient filesystem and you wouldn't
be making the proper guarantees for applications that want
data written to the disk.
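
To put rough numbers on the seeking argument, here is a toy
sketch in C. It is only an illustration, not how ufs actually
schedules anything: the function names and the block numbers in
the usage note are made up, and "distance" just means the
difference between block numbers. It adds up how far the head
travels when writes are done in the order they arrive versus
in one sorted sweep.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Total distance the head moves when it visits the blocks in the
 * order given, starting from block `start`. */
long head_travel(const long *blocks, size_t n, long start)
{
    long pos = start;
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += labs(blocks[i] - pos);
        pos = blocks[i];
    }
    return total;
}

/* qsort comparator for ascending block numbers. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Same writes, but bunched up and done in ascending block order. */
long head_travel_sorted(const long *blocks, size_t n, long start)
{
    if (n == 0)
        return 0;
    long *copy = malloc(n * sizeof *copy);
    assert(copy != NULL);
    memcpy(copy, blocks, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_long);
    long total = head_travel(copy, n, start);
    free(copy);
    return total;
}
```

With the made-up block list {900, 10, 850, 20, 800, 30} and the
head starting at block 0, head_travel() gives 5010 and
head_travel_sorted() gives 900. The catch is the one described
above: the sorted order is no longer the order the filesystem
needed for safety.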

Still, in some cases (such as when copying from disk #1
to disk #2 or when restoring from backup tape), you don't
care about corrupting the filesystem if power is lost.
You care about writing as much to disk as possible in the
shortest possible time. And what's worse, operations
that affect the structure of the filesystem (not
just the contents of a single file), like creating files,
tend to require the disk head to seek around a lot to
do things in the proper order.

So anyway, Solaris has an ioctl() call that you can make
which tells a given (ufs) filesystem to put itself in
(I believe it's called) "delayed i/o" mode, meaning it
can write stuff to disk absolutely whenever it feels
like it. There is a little program called fastfs.c
that's available that will let you make the ioctl()
call to turn this flag on and off.
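
For the curious, the core of such a tool is tiny. What follows
is a minimal sketch, not the real fastfs.c. It assumes the
request code is _FIOSDIO from <sys/filio.h> (that is my
recollection of the Solaris header, so check yours) and that the
argument is a pointer to an int flag. The function name
set_delayed_io is made up for illustration. On a system without
that ioctl it simply fails with ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __sun
#include <sys/filio.h>   /* assumed home of _FIOSDIO on Solaris */
#endif

/* Ask the filesystem mounted at mount_point to turn delayed i/o
 * on (enable != 0) or off (enable == 0).
 * Returns 0 on success, -1 on failure with errno set. */
int set_delayed_io(const char *mount_point, int enable)
{
    assert(mount_point != NULL);
#ifdef _FIOSDIO
    int fd = open(mount_point, O_RDONLY);
    if (fd < 0)
        return -1;
    int flag = enable ? 1 : 0;
    int rc = ioctl(fd, _FIOSDIO, &flag);   /* assumed request code */
    int saved = errno;                     /* close() may clobber errno */
    close(fd);
    errno = saved;
    return rc < 0 ? -1 : 0;
#else
    /* No such ioctl on this platform. */
    (void)enable;
    errno = ENOTSUP;
    return -1;
#endif
}
```

You would call it as something like set_delayed_io("/mnt", 1)
before the restore and set_delayed_io("/mnt", 0) afterwards,
where "/mnt" is just a placeholder mount point. Expect to need
root for this.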

When you put a filesystem in "fast" mode, the difference
is quite dramatic. If creating lots of small files, it can
go several times faster than in "slow" (normal) mode. If
you're trying to bring up a server that just had a disk
crash in the middle of the day, that can be important.
While you're using it, the disk behavior is a little
odd. Normally, if you type "sync", you'll see maybe
one or two seconds of I/O at the most happening. But
if the filesystem is in "fast" mode, it can be 30
seconds or something.

Anyway, just to complete the picture, you can't
have a filesystem in "fast" mode and have logging
turned on at the same time. And earlier I mentioned
that fastfs is no longer as useful as it once was,
back before logging existed in ufs. The reason is that
when a filesystem has logging turned on, it still must
commit things to disk in the proper order, but because
it writes metadata updates into a log first and only
later copies them from the log to their final places
on disk (at its convenience), the performance is better.
The log is (hopefully) contiguous, so the filesystem can
write a whole bunch of metadata updates to it all in one
nice fast sequential I/O, instead of having to go all over
the disk to do them. And when the log starts to
fill up, it can flush those metadata changes to the
other parts of the disk more efficiently too. Partly
because this can be done when the filesystem is idle
(if it ever is idle), partly because even if the
filesystem isn't idle, the system calls depending
on the metadata writes can return earlier (allowing
the processes that made them to block for less time
and have a better turnaround time overall), and
partly -- I think -- because the log can be flushed
in one big batch, and batch mode processing is
more efficient.
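
A back-of-the-envelope version of that argument, as a C sketch.
The costs are invented round numbers (8 ms per seek, 0.1 ms per
small write), not measurements of any real disk, and the function
names are hypothetical. It only models how long the caller waits,
and ignores the later work of moving changes out of the log.

```c
#include <assert.h>

/* Illustrative costs in microseconds; assumptions, not measurements. */
enum { SEEK_US = 8000, WRITE_US = 100 };

/* No logging: every metadata update is written in place, so the
 * caller waits for a seek plus a small write per update. */
long inplace_wait_us(int updates)
{
    assert(updates >= 0);
    return (long)updates * (SEEK_US + WRITE_US);
}

/* Logging: one seek to reach the log, then the updates are
 * appended back to back in a single sequential run. */
long logged_wait_us(int updates)
{
    assert(updates >= 0);
    if (updates == 0)
        return 0;
    return SEEK_US + (long)updates * WRITE_US;
}
```

For 100 updates this toy model gives 810000 microseconds in
place versus 18000 through the log. The real gap depends
entirely on the disk and the workload, but the shape of the
argument is the same: the log trades many seeks for one.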

Anyway, the point to all this is that in the old
days, if you were restoring from tape or copying
a filesystem from another disk, and if you wanted
it to finish quickly, it was really worth it to
go get fastfs.c and turn that on, because it made
things go twice as fast or even faster. Today,
though, mounting with logging turned on still gets
you a performance boost (although it seems like not
quite as much of one) and instead of making the
filesystem less resilient in the case of power
failures, etc., it makes it MORE resilient.

Still, if I were doing a big ufsrestore onto a blank
250 GB filesystem, I'd use fastfs. Logging is
easier, but I think fastfs would be faster, and
if the power is lost in the middle of the ufsrestore,
I'm going to have to start over either way anyway. :-)

   - Logan


