.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "tutorials/chain.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_tutorials_chain.py: Chaining Operations =================== Often, a data processing pipeline looks like the following: #. Apply a blocked mean or median to the data #. Remove a trend from the blocked data #. Fit a spline to the residual of the trend #. Grid using the spline and restore the trend The :class:`verde.Chain` class allows us to created gridders that perform multiple operations on data. Each step in the chain filters the input and passes the result along to the next step. For gridders and trend estimators, filtering means fitting the model and passing along the residuals (input data minus predicted data). When predicting data, the predictions of each step are added together. Other operations, like :class:`verde.BlockReduce` and :class:`verde.BlockMean` change the input data values and the coordinates but don't impact the predictions because they don't implement the :meth:`~verde.base.BaseGridder.predict` method. .. note:: The :class:`~verde.Chain` class was inspired by the :class:`sklearn.pipeline.Pipeline` class, which doesn't serve our purposes because it only affects the feature matrix, not what we would call *data* (the target vector). For example, let's create a pipeline to grid our sample bathymetry data. .. GENERATED FROM PYTHON SOURCE LINES 39-68 .. code-block:: default import cartopy.crs as ccrs import matplotlib.pyplot as plt import numpy as np import pyproj import verde as vd data = vd.datasets.fetch_baja_bathymetry() region = vd.get_region((data.longitude, data.latitude)) # The desired grid spacing in degrees spacing = 10 / 60 # Use Mercator projection because Spline is a Cartesian gridder projection = pyproj.Proj(proj="merc", lat_ts=data.latitude.mean()) proj_coords = projection(data.longitude.values, data.latitude.values) plt.figure(figsize=(7, 6)) ax = plt.axes(projection=ccrs.Mercator()) ax.set_title("Bathymetry from Baja California") plt.scatter( data.longitude, data.latitude, c=data.bathymetry_m, s=0.1, transform=ccrs.PlateCarree(), ) plt.colorbar().set_label("meters") vd.datasets.setup_baja_bathymetry_map(ax) plt.show() .. image-sg:: /tutorials/images/sphx_glr_chain_001.png :alt: Bathymetry from Baja California :srcset: /tutorials/images/sphx_glr_chain_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 69-71 We'll create a chain that applies a blocked median to the data, fits a polynomial trend, and then fits a standard gridder to the trend residuals. .. GENERATED FROM PYTHON SOURCE LINES 71-81 .. code-block:: default chain = vd.Chain( [ ("reduce", vd.BlockReduce(np.median, spacing * 111e3)), ("trend", vd.Trend(degree=1)), ("spline", vd.Spline()), ] ) print(chain) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none Chain(steps=[('reduce', BlockReduce(reduction=, spacing=18500.0)), ('trend', Trend(degree=1)), ('spline', Spline())]) .. GENERATED FROM PYTHON SOURCE LINES 82-88 Calling :meth:`verde.Chain.fit` will automatically run the data through the chain: #. Apply the blocked median to the input data #. Fit a trend to the blocked data and output the residuals #. Fit the spline to the trend residuals .. GENERATED FROM PYTHON SOURCE LINES 88-91 .. code-block:: default chain.fit(proj_coords, data.bathymetry_m) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none Chain(steps=[('reduce', BlockReduce(reduction=, spacing=18500.0)), ('trend', Trend(degree=1)), ('spline', Spline())]) .. GENERATED FROM PYTHON SOURCE LINES 92-101 Now that the data has been through the chain, calling :meth:`verde.Chain.predict` will sum the results of every step in the chain that has a ``predict`` method. In our case, that will be only the :class:`~verde.Trend` and :class:`~verde.Spline`. We can verify the quality of the fit by inspecting a histogram of the residuals with respect to the original data. Remember that our spline and trend were fit on decimated data, not the original data, so the fit won't be perfect. .. GENERATED FROM PYTHON SOURCE LINES 101-111 .. code-block:: default residuals = data.bathymetry_m - chain.predict(proj_coords) plt.figure() plt.title("Histogram of fit residuals") plt.hist(residuals, bins="auto", density=True) plt.xlabel("residuals (m)") plt.xlim(-1500, 1500) plt.show() .. image-sg:: /tutorials/images/sphx_glr_chain_002.png :alt: Histogram of fit residuals :srcset: /tutorials/images/sphx_glr_chain_002.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 112-115 Likewise, :meth:`verde.Chain.grid` creates a grid of the combined trend and spline predictions. This is equivalent to a *remove-compute-restore* procedure that should be familiar to the geodesists among us. .. GENERATED FROM PYTHON SOURCE LINES 115-125 .. code-block:: default grid = chain.grid( region=region, spacing=spacing, projection=projection, dims=["latitude", "longitude"], data_names="bathymetry", ) print(grid) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /usr/share/miniconda3/envs/test/lib/python3.9/site-packages/verde/base/base_classes.py:463: FutureWarning: The 'spacing', 'shape' and 'region' arguments will be removed in Verde v2.0.0. Please use the 'verde.grid_coordinates' function to define grid coordinates and pass them as the 'coordinates' argument. warnings.warn( Dimensions: (latitude: 61, longitude: 59) Coordinates: * longitude (longitude) float64 245.0 245.2 245.3 ... 254.4 254.5 254.7 * latitude (latitude) float64 20.0 20.17 20.33 20.5 ... 29.66 29.82 29.99 Data variables: bathymetry (latitude, longitude) float64 -3.621e+03 ... 8.798e+03 Attributes: metadata: Generated by Chain(steps=[('reduce',\n BlockReduc... .. GENERATED FROM PYTHON SOURCE LINES 126-127 Finally, we can plot the resulting grid: .. GENERATED FROM PYTHON SOURCE LINES 127-138 .. code-block:: default plt.figure(figsize=(7, 6)) ax = plt.axes(projection=ccrs.Mercator()) ax.set_title("Gridded result of the chain") pc = grid.bathymetry.plot.pcolormesh( ax=ax, transform=ccrs.PlateCarree(), vmax=0, zorder=-1, add_colorbar=False ) plt.colorbar(pc).set_label("meters") vd.datasets.setup_baja_bathymetry_map(ax) plt.show() .. image-sg:: /tutorials/images/sphx_glr_chain_003.png :alt: chain :srcset: /tutorials/images/sphx_glr_chain_003.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 139-142 Each component of the chain can be accessed separately using the ``named_steps`` attribute. It's a dictionary with keys and values matching the inputs given to the :class:`~verde.Chain`. .. GENERATED FROM PYTHON SOURCE LINES 142-146 .. code-block:: default print(chain.named_steps["trend"]) print(chain.named_steps["spline"]) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none Trend(degree=1) Spline() .. GENERATED FROM PYTHON SOURCE LINES 147-150 All gridders and estimators in the chain have been fitted and can be used to generate grids and predictions. For example, we can get a grid of the estimated trend: .. GENERATED FROM PYTHON SOURCE LINES 150-169 .. code-block:: default grid_trend = chain.named_steps["trend"].grid( region=region, spacing=spacing, projection=projection, dims=["latitude", "longitude"], data_names="bathymetry", ) print(grid_trend) plt.figure(figsize=(7, 6)) ax = plt.axes(projection=ccrs.Mercator()) ax.set_title("Gridded trend") pc = grid_trend.bathymetry.plot.pcolormesh( ax=ax, transform=ccrs.PlateCarree(), zorder=-1, add_colorbar=False ) plt.colorbar(pc).set_label("meters") vd.datasets.setup_baja_bathymetry_map(ax) plt.show() .. image-sg:: /tutorials/images/sphx_glr_chain_004.png :alt: chain :srcset: /tutorials/images/sphx_glr_chain_004.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /usr/share/miniconda3/envs/test/lib/python3.9/site-packages/verde/base/base_classes.py:463: FutureWarning: The 'spacing', 'shape' and 'region' arguments will be removed in Verde v2.0.0. Please use the 'verde.grid_coordinates' function to define grid coordinates and pass them as the 'coordinates' argument. warnings.warn( Dimensions: (latitude: 61, longitude: 59) Coordinates: * longitude (longitude) float64 245.0 245.2 245.3 ... 254.4 254.5 254.7 * latitude (latitude) float64 20.0 20.17 20.33 20.5 ... 29.66 29.82 29.99 Data variables: bathymetry (latitude, longitude) float64 -4.911e+03 ... 2.264e+03 Attributes: metadata: Generated by Trend(degree=1) .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 5.580 seconds) .. _sphx_glr_download_tutorials_chain.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: chain.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: chain.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_