Downloading a single file
Sometimes, you just want to download a single file (caching it locally and
checking the hash to make sure you have the right one). In these cases,
pooch.Pooch is overkill and requires too much setup. For that, pooch.retrieve
is what you want:
from pooch import retrieve

# Download the file and save it locally. Will check the MD5 checksum of
# the downloaded file against the given value to make sure it's the right
# file. You can use other hashes by specifying different algorithm
# names (sha256, sha1, etc).
fname = retrieve(
    # URL to one of Pooch's test files
    url="https://github.com/fatiando/pooch/raw/v1.0.0/data/tiny-data.txt",
    known_hash="md5:70e2afd3fd7e336ae478b1e740a5f08e",
)
The file is stored locally, by default in a pooch folder in the default
cache location of your operating system (see pooch.os_cache). Running this
code a second time will not trigger a download, same as with
pooch.Pooch.fetch.
If you don’t know the hash of the file, you can set known_hash=None to
bypass the check. In that case, retrieve will show a log message with the
SHA256 hash of the downloaded file. It’s highly recommended that you copy and
paste this hash into your code and use it as the known_hash. That way, the
next time your code is run (by you or someone else), you can be sure that the
exact same file is downloaded. This helps keep the results of your code
reproducible.
The retrieve function supports all of Pooch’s custom downloaders and
post-processing hooks, so you can use HTTP and FTP (with or without
authentication), decompress files, unpack archives, and print progress bars
with a bit of configuration.
Note
This function is meant for downloading single files. If you need to
manage the download and caching of several files, with versioning, use
pooch.create and pooch.Pooch instead. See Training your Pooch.