Wget is looking for a maintainer (gnu.org)
82 points by obsaysditto on April 22, 2010 | 12 comments


One issue that I have with wget (and IIRC curl too) is that when downloading a file from a URL like "http://example.com/file.php?id=1234", it will save the file as 'file.php?id=1234'. This fails in two cases: the HTTP response headers may specify the filename when returning the data (which wget ignores), or that URL may be a redirect to the actual URL which contains the filename (but wget blindly uses the first URL supplied). I understand that this behavior is probably desired for wget's mirroring functions (since the src= and href= values won't point to the redirect URL or the actual filename), but there is no option to parse out the original filename even if all you are doing is providing a list of URLs to download (not mirror).
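For what it's worth, both tools have opt-in flags for the header case; here's a minimal sketch of the default naming behavior plus the flags (flag names are from the wget and curl manuals, so check your versions):

```shell
url='http://example.com/file.php?id=1234'

# By default both tools derive the local filename from the URL itself:
# everything after the last slash, query string and all.
echo "${url##*/}"

# Opt-in workarounds for the Content-Disposition filename header:
#   wget --content-disposition "$url"   # wget honors the header's filename
#   curl -O -J "$url"                   # -J (--remote-header-name) does the same
```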

{edit} To be fair, this is a pain in the ass to do w/ LWP::UserAgent in Perl too:

  # Runs when response headers arrive: let LWP follow redirects,
  # then re-fetch the final URL straight to disk under the name
  # suggested by the server.
  sub download_file_callback {
      my ($response, $useragent, $h) = @_;
      return undef if $response->code >= 300 and $response->code < 400;
      my $fname = $response->filename();  # from Content-Disposition, else the URL
      $useragent->remove_handler('response_header', owner => 'billy');
      return $useragent->get($response->request()->uri(), ':content_file' => $fname);
  }

  $ua->add_handler( response_header => \&download_file_callback, owner => 'billy');
  my $response = $ua->get($url);


cURL ignores the filename header, but Wget actually gets it right by default.


lwp-download handles this correctly, though; no need to write your own LWP::UserAgent program.


A 'maintainer' is a single point of failure.

Build a community around the project instead. Look at wget's main competitor, curl, for an example: http://curl.haxx.se/mail/list.cgi?list=curl-library (299 messages so far in April!)


Some of the issues they cite ("doesn't currently handle HTTP authentication as well as it might", "no support for HTTP/1.1") look like they could be solved by using an HTTP library like libcurl instead of rolling their own. I wonder if there's political opposition to that (since cURL isn't GNU).


Yes, but that is what makes for good project design: a clean separation of the UI from what it's doing. It's why libcurl has been adopted by dozens of other projects as their HTTP library. Because wget doesn't have a properly written separate library for third-party use, it has significantly reduced the group of people who would contribute.

In addition, because curl is under a BSD/MIT/Apache-style license, you are in a better position to develop a diverse community.


It seems that the best situation combines the two: a BDFL supported by an active community.


Strange to read about this here: Micah is a childhood friend of mine.

I'd still rather use wget than curl, with its clunky syntax, but most places have curl and not wget, so I'm stuck with it.


Most places? I rather doubt that. Maybe most Mac places. I think most Linux distros and other unixes tend to favor wget.


What are the advantages of wget over curl, warm fuzzy feeling aside?



I use both, but maybe you can save me a couple-minute trip to the documentation. What's the equivalent curl command for "wget --mirror"?
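As far as I know there isn't one: curl has no recursive retrieval mode at all. For reference, the wget manual describes --mirror as shorthand for a bundle of options; a quick sketch (example.com is just a placeholder):

```shell
# Per the wget manual, --mirror is currently equivalent to:
#   -r      recursive retrieval
#   -N      timestamping (re-fetch only files newer than the local copy)
#   -l inf  no limit on recursion depth
#   --no-remove-listing  keep FTP .listing files between runs
mirror_opts='-r -N -l inf --no-remove-listing'
echo "wget $mirror_opts http://example.com/"
```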



