Here’s a snippet I used to download some images from Wikipedia.
Usage:
perl younameit.pl WikipediaFilePage LocalName
where WikipediaFilePage is a URL like « https://en.wikipedia.org/wiki/File:Bournemouth,_The_Square.jpg »
which describes the image you want to download at full resolution, and LocalName is the name you want to give it on your hard disk. You can give as many « WikipediaFilePage LocalName » pairs as you like.
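For instance, assuming you saved the script as wpget.pl (a hypothetical name, as are the local file names here), fetching two images in one run looks like:

```shell
perl wpget.pl \
  "https://en.wikipedia.org/wiki/File:Bournemouth,_The_Square.jpg" square.jpg \
  "https://en.wikipedia.org/wiki/File:Example.jpg" example.jpg
```

Quoting the URLs matters: commas and other characters in the File: page name are best kept away from the shell.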
KNOWN BUGS:
– Any error will stop everything (in case there are multiple pairs on the command line).
– HTML::TreeBuilder trees leak memory unless you free them with $tree->delete after each iteration (thanks to feldspath for deeply analyzing my code :D)
– This list is very much incomplete.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # allow UTF-8 inside the code
use WWW::Mechanize;
use HTML::TreeBuilder;
use URI;

binmode STDOUT, ':utf8';   # spit UTF-8 to the terminal

while (defined(my $url = shift)) {
    my $name = shift;      # user supplied; arguments come in pairs
    die "Missing local name for $url\n" unless defined $name;

    my $mech = WWW::Mechanize->new;   # autocheck is on by default: get() dies on HTTP errors
    my $tree = HTML::TreeBuilder->new;

    print "Downloading $url...";
    $mech->get($url);
    print " done!\n";

    $tree->parse_content($mech->content);

    # The full-resolution image is the <a> link inside the div of class "fullImageLink".
    my $div = $tree->look_down(class => 'fullImageLink')
        or die "No fullImageLink element found on $url\n";
    my $href = $div->look_down(_tag => 'a')->attr('href');

    # Wikipedia serves protocol-relative hrefs ("//upload.wikimedia.org/...");
    # resolve against the page URL to get something fetchable.
    $url = URI->new_abs($href, $mech->uri);

    print "Downloading $url to $name...";
    $mech->get($url, ':content_file' => $name);
    print " done!\n";

    $tree->delete;   # free the tree explicitly; otherwise it leaks memory
}