Re: OT: wget bug



On Sat, 18 Jul 2009, Andrew Brampton wrote:

Date: Sat, 18 Jul 2009 18:09:54 +0100
From: Andrew Brampton <brampton+freebsd@xxxxxxxxx>
To: Joe R. Jah <jjah@xxxxxxxxxxxxxxxxxxx>
Cc: freebsd-questions@xxxxxxxxxxx
Subject: Re: OT: wget bug

2009/7/18 Joe R. Jah <jjah@xxxxxxxxxxxxxxxxxxx>:
Thank you Andrew.  Yes, the server is truly returning 401.  I have already
reconfigured wget to download every file regardless of its timestamp,
but that is a waste of bandwidth, because most of the site is unchanged.

Do you know of any workaround in wget, or an alternative tool to ONLY
download newer files by http?


Joe,
There are two ways to check whether a file has changed: one, read
the time the file was last modified, or two, download the file and
compare it to an old copy. Wget was evidently trying option 1, but the
remote server denies that request. You could most likely get it to do
option 2, but then you waste bandwidth downloading unchanged files
just to check whether they have changed.
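
As a rough illustration of option 2, here is a minimal Python sketch that
compares a downloaded body against the old local copy by hash. The names
sha256_hex and has_changed are made up for this example and are not part
of wget; the body still has to be downloaded in full, which is exactly
the bandwidth cost described above.

```python
import hashlib
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def has_changed(local_path: str, remote_body: bytes) -> bool:
    """Option 2: compare freshly downloaded bytes with the old copy.

    Hypothetical helper for illustration. A missing local file counts
    as changed, so it would be saved on the first run.
    """
    old = Path(local_path)
    if not old.exists():
        return True
    return sha256_hex(old.read_bytes()) != sha256_hex(remote_body)
```

This only saves disk writes, not transfer: the caller must already hold
remote_body before the comparison can run.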

If you have control over the remote webserver, then the simplest way
to solve this problem is to configure the webserver not to return 401
when wget sends the If-Modified-Since HTTP header. A better solution,
again assuming you have control of the remote server, is to use
"rsync" as it is designed for this kind of task.
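
For reference, a conditional request of the kind described above might
look like the following Python sketch, using only the standard library.
The function names are invented for this example, and it assumes the
server honours If-Modified-Since and answers 304 for unchanged files. A
server that answers 401, as in this thread, would still raise an error
here.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def if_modified_since_value(mtime: float) -> str:
    """Format a Unix timestamp as an HTTP date, always in GMT."""
    return formatdate(mtime, usegmt=True)


def fetch_if_newer(url: str, local_path: str) -> bool:
    """Download url to local_path only if the server reports it changed.

    Hypothetical helper: returns True if the file was (re)written,
    False if the server answered 304 Not Modified.
    """
    headers = {}
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        headers["If-Modified-Since"] = if_modified_since_value(mtime)
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # unchanged, nothing was transferred
        raise  # 401 and other errors are passed to the caller
    with open(local_path, "wb") as out:
        out.write(body)
    return True
```

With a cooperating server, only changed files cost any real bandwidth,
since a 304 response carries no body.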

If you don't have control over the remote server, then you are stuck
with your current solution.

Andrew

Thank you Andrew.

Regards,

Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jjah@xxxxxxxxxxxxxxxxxxx
_______________________________________________
freebsd-questions@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@xxxxxxxxxxx"