6
$\begingroup$

I'm using URLDownload, a function that is new in V11.


I have a URL I'd like to save to a local file, but the URL is missing a file extension (or has an incorrect one), so I don't know how to construct the filename properly. Here's an example:

url = "http://t0.gstatic.com/images?q=tbn:ANd9GcRf1EqE2kyW12HSb9gZZ8eTIPqNgVkjFis4GkTTYONIpoQtkIde4zybZ4iAqGlIHQ_pnEX499Oa";
Import@url
URLDownload[url, "~/Downloads/" <> ToString@RandomInteger[1000000]]


I'd like to save it with the proper extension though. Any thoughts?

Also, URLDownload doesn't seem to be multithreaded, and it is far too slow for my purposes.

$\endgroup$
2
  • $\begingroup$ For a faster way to download files, check out URLSaveAsynchronous. Whether or not the file type appears in the URL is immaterial; just append the right file extension to your file name. $\endgroup$ Commented Sep 27, 2016 at 16:06
  • $\begingroup$ @C.E. and what is the right extension (and how do you detect that)? $\endgroup$ Commented Sep 27, 2016 at 16:35

1 Answer

5
$\begingroup$

If I were downloading different types of images and couldn't determine their file types a priori, I'd do something like this:

url = "http://t0.gstatic.com/images?q=tbn:\
ANd9GcRf1EqE2kyW12HSb9gZZ8eTIPqNgVkjFis4GkTTYONIpoQtkIde4zybZ4iAqGlIHQ\
_pnEX499Oa";

callback[file_][_, "data", _] := (
  If[FileFormat[file] == "JPEG", CopyFile[file, "path/to/new/image.jpg"]];
  DeleteFile[file];
  )

file = CreateFile[];
URLSaveAsynchronous[url, file, callback[file]];

URLSaveAsynchronous is an asynchronous operation, as the name suggests, which means that you can download many files simultaneously. I use CreateFile to create an empty file in my operating system's directory for temporary files, and I pass URLSaveAsynchronous a custom callback function that is bound to that file. When URLSaveAsynchronous has downloaded the file, it calls the callback function. The callback detects the file format using FileFormat, copies the file to wherever it's supposed to go with the right file extension, and then deletes the file in the temporary directory.
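The callback above only handles JPEG. As a rough sketch of how the same idea might extend to several formats and several URLs, something like the following could work. The names extensionFor, saveWithExtension and downloadAll, the "image" file name prefix, and the choice of formats in the lookup table are all hypothetical and not part of the original code; the callback keeps the same argument pattern as above.

```wolfram
(* Hypothetical lookup from FileFormat results to file extensions; extend as needed *)
extensionFor = <|"JPEG" -> "jpg", "PNG" -> "png", "GIF" -> "gif"|>;

(* Hypothetical helper: copy a temporary download into dir under name plus the
   detected extension, then delete the temporary file.
   Files whose format is not in the table are deleted without being copied. *)
saveWithExtension[file_String, dir_String, name_String] :=
  Module[{ext = Lookup[extensionFor, FileFormat[file], None]},
    If[StringQ[ext],
      CopyFile[file, FileNameJoin[{dir, name <> "." <> ext}]]];
    DeleteFile[file]]

(* Same callback shape as above, parameterized by target directory and base name *)
callback[file_, dir_, name_][_, "data", _] := saveWithExtension[file, dir, name]

(* Hypothetical driver: start one asynchronous download per URL *)
downloadAll[urls_List, dir_String] :=
  MapIndexed[
    With[{file = CreateFile[]},
      URLSaveAsynchronous[#1, file,
        callback[file, dir, "image" <> ToString[First[#2]]]]] &,
    urls]
```

Calling downloadAll[{url}, "path/to/new"] would then start the download and, once it finishes, leave a file such as image1.jpg in that directory, assuming the directory exists and the format is one of those in the table.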

$\endgroup$
4
  • $\begingroup$ you can get an idea of the extension via the following: Last@StringSplit[(Association@@URLRead[url,"Headers"])["content-type"],"/"] $\endgroup$ Commented Sep 27, 2016 at 18:58
  • $\begingroup$ @chuy ok, but why would I want to do that though? He says he needs to be able to download files in parallel, and URLRead is not asynchronous. $\endgroup$ Commented Sep 27, 2016 at 19:07
  • $\begingroup$ URLSubmit is asynchronous and can tell you the "ContentType" directly. I've never used it though, nor am I familiar with how content types work. $\endgroup$ Commented Oct 9, 2016 at 13:04
  • 1
    $\begingroup$ FileFormat is nice though because it looks at the magic bytes. $\endgroup$ Commented Oct 9, 2016 at 13:05
