Using Ensaio in your project
Contents
Using Ensaio in your project¶
One of the main use cases of Ensaio is to provide reproducible and easy-to-access data for the documentation of other Python projects. These are a few tips and tricks for using Ensaio in your own project.
Explicitly set data versions¶
New version of each dataset may be included in new Ensaio releases. We’ll do our very best to always keep the older data versions available as well to avoid breaking existing tutorials and documentation.
We recommend always explicitly setting the data version when fetching a dataset:
import ensaio
fname = ensaio.fetch_southern_africa_gravity(version=1)
This way, your documentation/tutorial should still use the same data (and hopefully still produce the same result) even if new versions of Ensaio are installed. Otherwise, people going through older examples with newer versions of Ensaio could get different results (or worse, broken code).
Tip
We still recommend updating to the latest data versions in new tutorials and documentation whenever you can.
Download from GitHub on CI¶
By default, the data sources for Ensaio are the archives with the given DOIs
for each dataset (usually
Zenodo).
Alternatively, you can ask Ensaio to download from the GitHub release of each
dataset by setting the environment variable ENSAIO_DATA_FROM_GITHUB=true
.
We recommend using the environment variable when running on continuous integration (CI). This will minimize the load that is placed on public data servers like Zenodo. When using GitHub Actions, this may even make the downloads much faster since the data source is likely physically closer to the CI infrastructure.