Re: dd(1) performance when copying a disk to another

From: Tulio Guimarães da Silva (tuliogs_at_pgt.mpt.gov.br)
Date: 10/03/05

    Date: Mon, 03 Oct 2005 12:08:31 -0300
    To: freebsd-performance@freebsd.org
    
    
    

      Phew, thanks for that. :) This seems to answer my question in the
    other "leg" of the thread, though it hadn't yet reached me when I
    wrote that message.
      Now THAT's quite a good explanation. ;) Thanks again,

    Tulio G. da Silva

    Bruce Evans wrote:

    > On Mon, 3 Oct 2005, Patrick Proniewski wrote:
    >
    >>>>> # dd if=/dev/ad4 of=/dev/null bs=1m count=1000
    >>>>> 1000+0 records in
    >>>>> 1000+0 records out
    >>>>> 1048576000 bytes transferred in 17.647464 secs (59417943 bytes/sec)
    >>>>
    >
    > Many wrong answers to the original question have been given. dd with
    > a block size of 1m between (separate) disk devices is much slower
    > just because that block size is far too large...
    >
    > The above is a fairly normal speed. The expected speed depends mainly
    > on the disk technology generation and the placement of the sectors being
    > read. I get the following speeds for _sequential_ _reading_ from the
    > outer (fastest) tracks of 6- and 3-year old drives which are about 2
    > generations apart:
    >
    > %%%
    > Sep 25 21:52:35 besplex kernel: ad0: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-master UDMA100
    > Sep 25 21:52:35 besplex kernel: ad2: 58644MB <IC35L060AVV207-0> [119150/16/63] at ata1-master UDMA100
    > ad0 bs 512: 16777216 bytes transferred in 2.788209 secs (6017201 bytes/sec)
    > ad0 bs 1024: 16777216 bytes transferred in 1.433675 secs (11702245 bytes/sec)
    > ad0 bs 2048: 16777216 bytes transferred in 0.787466 secs (21305320 bytes/sec)
    > ad0 bs 4096: 16777216 bytes transferred in 0.479757 secs (34970249 bytes/sec)
    > ad0 bs 8192: 16777216 bytes transferred in 0.477803 secs (35113250 bytes/sec)
    > ad0 bs 16384: 16777216 bytes transferred in 0.462006 secs (36313842 bytes/sec)
    > ad0 bs 32768: 16777216 bytes transferred in 0.462038 secs (36311331 bytes/sec)
    > ad0 bs 65536: 16777216 bytes transferred in 0.486850 secs (34460748 bytes/sec)
    > ad0 bs 131072: 16777216 bytes transferred in 0.462046 secs (36310693 bytes/sec)
    > ad0 bs 262144: 16777216 bytes transferred in 0.469866 secs (35706382 bytes/sec)
    > ad0 bs 524288: 16777216 bytes transferred in 0.462035 secs (36311555 bytes/sec)
    > ad0 bs 1048576: 16777216 bytes transferred in 0.478534 secs (35059612 bytes/sec)
    > ad2 bs 512: 16777216 bytes transferred in 4.115675 secs (4076419 bytes/sec)
    > ad2 bs 1024: 16777216 bytes transferred in 2.105451 secs (7968466 bytes/sec)
    > ad2 bs 2048: 16777216 bytes transferred in 1.132157 secs (14818809 bytes/sec)
    > ad2 bs 4096: 16777216 bytes transferred in 0.662452 secs (25325935 bytes/sec)
    > ad2 bs 8192: 16777216 bytes transferred in 0.454654 secs (36901065 bytes/sec)
    > ad2 bs 16384: 16777216 bytes transferred in 0.304761 secs (55050416 bytes/sec)
    > ad2 bs 32768: 16777216 bytes transferred in 0.304761 secs (55050416 bytes/sec)
    > ad2 bs 65536: 16777216 bytes transferred in 0.304765 secs (55049683 bytes/sec)
    > ad2 bs 131072: 16777216 bytes transferred in 0.304762 secs (55050200 bytes/sec)
    > ad2 bs 262144: 16777216 bytes transferred in 0.304760 secs (55050588 bytes/sec)
    > ad2 bs 524288: 16777216 bytes transferred in 0.304762 secs (55050200 bytes/sec)
    > ad2 bs 1048576: 16777216 bytes transferred in 0.304757 secs (55051148 bytes/sec)
    > %%%
    >
    > Drive technology hit a speed plateau a few years ago so newer single
    > drives aren't much faster unless they are more expensive and/or smaller.
    >
    > The speed is low for small block sizes because the device has to be
    > talked to too much and the protocol and firmware are not very good.
    > (Another drive, a WDC 120GB with more cache (8MB instead of 2), ramps
    > up to about half speed (26MB/sec) for a block size of 4K but sticks
    > at that speed for block sizes 8K and 16K, then jumps up to full speed
    > for block sizes of 32K and larger. This indicates some firmware
    > stupidness). Most drives ramp up almost linearly (doubling
    > the block size almost doubles the speed). This behaviour is especially
    > evident on slow SCSI drives like some (most?) ZIP and dvd/cd. The
    > command overhead can be 20 msec, so you had better not do only 512 bytes
    > of i/o per command or you will get a speed of 25K/sec. The command
    > overhead of a new ATA drive is more like 50 usec, but that is still
    > far too much for high speed with a block size of 512 bytes.
    >
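The overhead arithmetic in the paragraph above can be checked with a small model. This is only a sketch: it assumes each command costs a fixed overhead plus the time to move the data at the drive's plateau rate, and the function name `throughput` and the 55e6 bytes/sec plateau (taken from the ad2 figures) are illustrative choices, not anything from the original message.

```python
def throughput(block_size, overhead_s, media_rate):
    """Bytes/sec when every command pays a fixed overhead plus transfer time.

    Assumed model: time per command = overhead_s + block_size / media_rate.
    """
    return block_size / (overhead_s + block_size / media_rate)


# Ceiling set by the overhead alone (transfer time ignored).
slow_ceiling = 512 / 20e-3   # 20 msec per command, 512 bytes each
ata_ceiling = 512 / 50e-6    # 50 usec per command, 512 bytes each

print(f"20 msec overhead, 512-byte blocks: {slow_ceiling / 1024:.0f}K/sec at best")
print(f"50 usec overhead, 512-byte blocks: {ata_ceiling / 1e6:.2f}MB/sec at best")

# How the modelled rate approaches the plateau as the block size grows.
for bs in (512, 4096, 16384, 65536, 1048576):
    rate = throughput(bs, 50e-6, 55e6)
    print(f"bs {bs:>7}: {rate / 1e6:5.1f}MB/sec (model)")
```

The 20 msec case reproduces the 25K/sec figure. The measured 512-byte rates (about 6.0 and 4.1 MB/sec) sit below the 50 usec ceiling, so the real per-command cost is evidently somewhat higher than this simple model allows for.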
    > The speed is insignificantly different for block sizes larger than a
    > limit because the drive's physical limits dominate except possibly
    > with old (slow) CPUs.
    >
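One way to turn the plateau into a concrete choice is to pick the smallest block size whose rate is within some tolerance of the best rate. The sketch below does that for the ad2 read figures quoted above; the 1% tolerance and the helper name `smallest_near_peak` are assumptions made for illustration.

```python
# Read rates in bytes/sec for ad2, copied from the measurements quoted above.
AD2_RATES = {
    512: 4076419,
    1024: 7968466,
    2048: 14818809,
    4096: 25325935,
    8192: 36901065,
    16384: 55050416,
    32768: 55050416,
    65536: 55049683,
    131072: 55050200,
    262144: 55050588,
    524288: 55050200,
    1048576: 55051148,
}


def smallest_near_peak(rates, tolerance=0.01):
    """Return the smallest block size whose rate is within tolerance of the peak."""
    peak = max(rates.values())
    return min(bs for bs, rate in rates.items() if rate >= (1 - tolerance) * peak)


best = smallest_near_peak(AD2_RATES)
print(f"smallest near-peak block size for ad2: {best}")
```

For these figures the answer is 16384, since the 8192-byte rate is still about a third below the plateau.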
    >>>> That seems to be about 2 times faster than the disc->disc
    >>>> transfer... But still slower than I would have expected...
    >>>> SATA150 sounds like the drive can do 150MB/sec...
    >>>
    >>
    >> As Eric pointed out, you just can't reach 150 MB/s with one disk,
    >> it's a technological maximum for the bus, but real world performance
    >> is well below this max.
    >> In fact, I thought I would reach about 50 to 60 MB/s.
    >
    >
    > 50-60 MB/s is about right. I haven't benchmarked any SATA or very new
    > drives. Apparently they are not much faster. ISTR that WDC Raptors are
    > speced for 70-80MB/sec. You pay twice as much to get a tiny drive with
    > only 25% more throughput plus faster seeks.
    >
    >>>>>> (Maybe you could find a way to copy /dev/zero to /dev/ad6
    >>>>>> without destroying the previous work... :-))
    >>>>>
    >>>>>
    >>>>> well, not very easy, both disks are the same size ;)
    >>>>
    >>
    >>>> I thought of the first 1000 1MB blocks... :-)
    >>>
    >>
    >> damn, I misread this one... :)
    >> I'm gonna try this asap.
    >
    >
    > I divide disks into equally sized (fairly small, or half the disk size)
    > partitions, and cp between them. dd is too hard to use for me ;-). cp
    > is easier to type and automatically picks a reasonable block size. Of
    > course I use dd if the block size needs to be controlled, but mostly I
    > only use it in preference to cp to get its timing info.
    >
    >> ...
    >>
    >>> Have you tried a smaller block size? What does 8k, 16k, or 512k do
    >>> for you? There really isn't much room for improvement here on a
    >>> single device.
    >>
    >>
    >> nop, I'll try one of them, but I can't do many experiments, the box
    >> is in my living room, it's a 1U rack, and it's VERY VERY noisy. My
    >> girlfriend will kill me if it's running more than an hour a day :))
    >
    >
    > Smaller block sizes will go much faster, except for copying from a
    > disk to itself. Large block sizes are normally a pessimization and the
    > pessimization is especially noticeable for dd. Just use the smallest
    > block size that gives an almost-maximal throughput (e.g., 16K for
    > reading ad2 above, possibly different for writing). Large block sizes
    > are pessimal for synchronous i/o like dd does. The timing for dd'ing
    > blocks of size N MB at R MB/sec between ad0 and ad2 is something like:
    >
    > time in secs       activity on ad0      activity on ad2
    > ------------       ---------------      ---------------
    > 0                  start read of 1MB    idle
    > N/R                finish read; idle    start write of 1MB
    > 2*N/R-epsilon      start read of 1MB    pretend to complete write
    > 2*N/R              continue read        complete write
    > 3*N/R-epsilon      finish read; idle    start write of 1MB
    > 4*N/R-2*epsilon    ...                  ...
    >
    > After the first block (which takes a little longer), it takes
    > 2*N/R-epsilon seconds to copy 1 block, where epsilon is the time between
    > the writer's pretending to complete the write and actually completing it.
    > This time is obviously not very dependent on the block size since it is
    > limited by drive resources and policies (in particular, if the drive
    > doesn't do write caching, perhaps because write caching is not enabled,
    > then epsilon is 0, and if our block size is large compared with the
    > drive's cache then the drive won't be able to signal completion until no
    > more than the drive's cache size is left to do). Thus epsilon becomes
    > small relative to the N/R term when N is large. Apparently, in your case
    > the speed drops from 59MB/sec to 35MB/sec, so with N == 1 and R == 59,
    > epsilon is about 1/200.
    >
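The epsilon estimate can be reproduced by taking the per-block cycle as one read plus one write minus the overlap, that is 2*N/R - epsilon, which is the reading under which 59MB/sec, 35MB/sec and "about 1/200" agree. The sketch below works that through; the function names are hypothetical, and capping epsilon at the write time N/R for small blocks is an added assumption (the overlap cannot be longer than the write itself).

```python
def epsilon_from_speeds(n_mb, rate, copy_speed):
    """Solve n_mb / copy_speed = 2 * n_mb / rate - epsilon for epsilon (seconds)."""
    return 2 * n_mb / rate - n_mb / copy_speed


def copy_speed(n_mb, rate, epsilon):
    """Modelled steady-state copy speed in MB/sec for blocks of n_mb MB.

    Assumption: the overlap is capped at the duration of one write (n_mb / rate).
    """
    overlap = min(epsilon, n_mb / rate)
    return n_mb / (2 * n_mb / rate - overlap)


# Figures from the thread: 1MB blocks, 59MB/sec per drive, 35MB/sec when copying.
eps = epsilon_from_speeds(1, 59, 35)
print(f"epsilon = {eps:.4f} secs")

# Larger blocks drift toward half the single-drive rate; small ones stay near it.
for n_mb in (1 / 64, 1 / 4, 1, 4, 64):
    print(f"N = {n_mb:>8.4f} MB: {copy_speed(n_mb, 59, eps):5.1f}MB/sec (model)")
```

This gives an epsilon of roughly 0.0053 secs, in line with the estimate above. The model ignores per-command overhead, so it says nothing about how small a block is too small.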
    > With large block sizes, the speed can be increased using asynchronous
    > output. There is a utility (in ports) named team that fakes async
    > output using separate processes. I have never used it. Something as
    > simple as 2 dd's in a pipe should work OK.
    >
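The idea of overlapping reads with writes can be sketched in a few lines of Python. This is not the team utility and not dd; it is a stand-in that uses a reader thread, a writer thread and a bounded queue, and the names `overlapped_copy`, `block_size` and `depth` are made up for the example.

```python
import queue
import threading


def overlapped_copy(src_path, dst_path, block_size=64 * 1024, depth=4):
    """Copy src to dst while a second thread writes, so reads and writes overlap."""
    blocks = queue.Queue(maxsize=depth)  # bounded: the reader cannot run far ahead
    errors = []

    def writer():
        try:
            with open(dst_path, "wb") as dst:
                while True:
                    block = blocks.get()
                    if block is None:  # sentinel: the reader is done
                        break
                    dst.write(block)
        except OSError as exc:
            errors.append(exc)
            # Keep draining so the reader never blocks on a full queue.
            while blocks.get() is not None:
                pass

    thread = threading.Thread(target=writer)
    thread.start()
    copied = 0
    try:
        with open(src_path, "rb") as src:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                blocks.put(block)
                copied += len(block)
    finally:
        blocks.put(None)  # always release the writer, even if reading failed
        thread.join()
    if errors:
        raise errors[0]
    return copied
```

Calling `overlapped_copy("/path/to/source", "/path/to/target")` returns the number of bytes copied. The bounded queue plays the role of the pipe between two processes: the reader can fetch the next block while the previous one is still being written, but only up to `depth` blocks ahead.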
    > For copying from a disk to itself, a large block size is needed to
    > limit the number of seeks, and concurrent reads and writes are exactly
    > what is not needed (since they would give competing seeks). The i/o
    > must be sequentialized, and dd does the right things for this, though
    > the drive might not (you would prefer epsilon == 0, since if the drive
    > signals write completion early then it might get confused when you
    > flood it with the next read and seek to start the read before it
    > completes the write, then thrash back and forth between writing and
    > reading).
    >
    > It is interesting that writing large sequential files to at least the
    > ffs file system (not mounted with -sync) in FreeBSD is slightly faster
    > than writing directly to the raw disk using write(2), even if the
    > device driver sees almost the same block sizes for these different
    > operations. This is because write(2) is synchronous and sync writes
    > always cause idle periods (the idle periods are just much smaller for
    > writing data that is already in memory), while the kernel uses async
    > writes for data.
    >
    > Bruce
    > _______________________________________________
    > freebsd-performance@freebsd.org mailing list
    > http://lists.freebsd.org/mailman/listinfo/freebsd-performance
    > To unsubscribe, send any mail to
    > "freebsd-performance-unsubscribe@freebsd.org"
    >
    >

    
    
