ICESat-2 interpolation workflow#

Summary Producing the ICESat-2 interpolated data variables followed a series of steps: using linear interpolation to fill in missing data, smoothing the data using gaussian smoothing, and then restricting the resulting interpolated/smoothed data using a KDTree method. Below, we detail the methods undertaken and demonstrate the effect each step has on the raw sea ice thickness data.

Version history: Version 1 (01/01/2022)

import xarray as xr # For working with gridded climate data 
import pandas as pd
import numpy as np
from utils.read_data_utils import read_book_data # Helper function for reading the data from the bucket
from utils.plotting_utils import interactiveArcticMaps

# Interpolating/smoothing packages 
from scipy.interpolate import griddata
from scipy.spatial import KDTree
from astropy.convolution import convolve
from astropy.convolution import Gaussian2DKernel

# Plotting dependencies
import cartopy.crs as ccrs
import holoviews as hv
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from cartopy.mpl.geoaxes import GeoAxes
GeoAxes._pcolormesh_patched = Axes.pcolormesh # Helps avoid some weird issues with the polar projection 
%config InlineBackend.figure_format = 'retina'
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 150 # Sets figure size in the notebook

# Remove warnings to improve display
import warnings 
warnings.filterwarnings('ignore') 

Read in the data#

Here, we’ll read in the data from the book dataset, and then grab the data variables and their associated grid coordinated (xgrid and ygrid) as numpy arrays.

For simplicity, we’ll just use data from a single month in this notebook, but this method has been applied across all months in the interpolated data variables in the book dataset.

# Read in data 
book_ds = read_book_data() 
cdr_da = book_ds["sea_ice_conc"]
is2_da = book_ds["ice_thickness"]

# Just grab one month 
date = "Apr 2019"
cdr_da = cdr_da.sel(time=date)[0]
is2_da = is2_da.sel(time=date)[0]

# Grab data as numpy arrays 
xgrid = is2_da['xgrid'].values
ygrid = is2_da['ygrid'].values
is2_np = is2_da.copy().values
cdr_np = cdr_da.copy().values

Step 1: Linear interpolation#

First, we use the scipy.griddata package to perform a simple linear interpolation of the data. We’ll also use the CDR sea ice concentration data to assign all sea ice concentration < 15% to 0 thickness before the interpolation, an assumption allowing us to improve the interpolation results. Following the interpolation, we’ll set all interpolated grid cells with < 50% concentration to 0 thickness, following the fact that ICESat2 should not produce results for ice thickness where concentration is less that 50%. We’ll also set all interpolated grid cells to nan where the CDR data is nan.

# Interpolation settings 
method = "linear" 

# Perform linear interpolation 
is2_to_interp = is2_np.copy()
is2_to_interp[np.where(cdr_np<0.15)] = 0 # Set 15% conc or less to 0 thickness
np_interpolated = griddata((xgrid[(np.isfinite(is2_to_interp))], 
                            ygrid[(np.isfinite(is2_to_interp))]), 
                            is2_to_interp[(np.isfinite(is2_to_interp))].flatten(),
                            (xgrid, ygrid), 
                            fill_value=np.nan,
                            method=method)
np_interpolated[~(np.isfinite(cdr_np))]=np.nan # Remove thickness data where cdr data is nan 
np_interpolated[np.where(cdr_np<0.5)]=np.nan # Remove thickness data where cdr data < 50% concentration

Next, we’ll convert the numpy array np_interpolated to an xarray DataArray object, allowing us to easily map the results using the interactiveArcticMaps function defined in the plotting_utils module.

step1_interp_da = xr.DataArray(data=np_interpolated, 
                               dims=is2_da.dims, 
                               coords=is2_da.coords, 
                               attrs={**is2_da.attrs}, 
                               name=is2_da.name+"_interp")

Lastly, we’ll plot the raw and interpolated data on a map to show the effect of linear interpolation. You’ll see how all regions with sea ice thickness > 50% concentration have been filled with a best guess from the linear interpolation, including regions like the Laptev Sea and the Canadian Archipielago, which didn’t have much nearby data to work from. The KDTree implemented in step 3 will help resolve this issue.

pl_raw = interactiveArcticMaps(is2_da, title = "IS2 raw ice thickness data ("+date+")", vmin=0, vmax=4, frame_width=350, colorbar=True)
pl_step1_interp = interactiveArcticMaps(step1_interp_da, title="Step 1: "+method+" interpolation", vmin=0, vmax=4, frame_width=350, colorbar=False)
display(pl_step1_interp+pl_raw)

Step 2: Gaussian smoothing#

Next, we’ll smooth the interpolated data using the Gaussian2DKernel function from the astropy python package.

x_stddev=0.5
kernel = Gaussian2DKernel(x_stddev=x_stddev)
np_interpolated_gauss = convolve(np_interpolated, kernel)
np_interpolated_gauss[~(np.isfinite(cdr_np))]=np.nan # Remove thickness data where cdr data is nan 
np_interpolated_gauss[np.where(cdr_np<0.5)]=np.nan # Remove thickness data where cdr data < 50% concentration

Below, we’ll plot the gaussian smoothed data (step 2) and the interpolated data without smoothing (step 1). We’ve used a low standard deviation in our smoothing, so the differences are somewhat subtle.

step2_gauss_da = xr.DataArray(data=np_interpolated_gauss, 
                              dims=is2_da.dims, 
                              coords=is2_da.coords, 
                              attrs={**is2_da.attrs}, 
                              name=is2_da.name+"_smoothed")
pl_step1_interp = interactiveArcticMaps(step1_interp_da, title="Step 1: "+method+" interpolation", vmin=0, vmax=4, frame_width=350, colorbar=True)
pl_step2_gauss = interactiveArcticMaps(step2_gauss_da, title="Step 2: gaussian smoothing", vmin=0, vmax=4, frame_width=350, colorbar=False)
display(pl_step2_gauss+pl_step1_interp)

Step 3: KDTree#

Lastly, we’ll use a KDTree to compute the distances between every grid cell and the nearest grid cell with data. We’ll set a limitation on the distance that a grid cell should be from the nearest cell with data to remove cells that were interpolated using data from far away. We’ve set the threshold at 50km.

distance_m = 50*10**3 #50km --> 50*10^3 m
polehole_lat = 88.25
xS = xgrid[np.where((np.isfinite(is2_np)))]
yS = ygrid[np.where((np.isfinite(is2_np)))]
grid_points = np.c_[xgrid.ravel(), ygrid.ravel()]
tree = KDTree(np.c_[xS, yS])
dist, _ = tree.query(grid_points, k=1)
dist = dist.reshape(xgrid.shape)
is2_kdtree = np_interpolated_gauss.copy()
is2_kdtree[np.where((dist>distance_m)&(is2_da.latitude.values<polehole_lat))] = np.nan # Set any values > distance_m from the next grid cell to 0. Ignore pole hole
is2_kdtree[~(np.isfinite(cdr_np))]=np.nan # Remove thickness data where cdr data is nan 
is2_kdtree[np.where(cdr_np<0.5)]=np.nan # Remove thickness data where cdr data < 50% concentration

Below, we’ll show the distance matrix computed by the KDTree function

dist_da = xr.DataArray(data=dist, dims=is2_da.dims, coords=is2_da.coords, attrs={"long_name":"distance from closest non-nan cell"}, name="distance")
pl_dist = interactiveArcticMaps(dist_da, title="Distance from closest cell with data", frame_width=350, cmap="coolwarm", vmax=1.5*10**6, colorbar=True)
display(pl_dist)

Next, we’ll plot the results of the KD Tree and show the data that was removed in the process.

step3_kdtree_da = xr.DataArray(data=is2_kdtree, 
                               dims=is2_da.dims, 
                               coords=is2_da.coords, 
                               attrs={**is2_da.attrs}, 
                               name=is2_da.name)
diff_da = step2_gauss_da.where(xr.apply_ufunc(np.isnan, step3_kdtree_da))
pl_step3_kdtree = interactiveArcticMaps(step3_kdtree_da, title="Step 3: KD Tree", vmin=0, vmax=4, frame_width=350, colorbar=False)
pl_diff = interactiveArcticMaps(diff_da, title="Step 3: Data removed by KDTree", vmin=0, vmax=4, frame_width=350, colorbar=True)
display(pl_step3_kdtree+pl_diff)

Compare final interpolated thickness to raw data#

Above, we detailed all the steps taken to arrive at the final interpolated product: we performed a linear interpolation (step 1), we used gaussian smoothing to smooth the data (step 2), and we used a KDTree method to remove data at certain gridcells (step 3). Below, we show the final interpolated ice thickness for one month compared to the raw data from the same month.

pl_final = interactiveArcticMaps(step3_kdtree_da, title="Interpolated/smoothed sea ice thickness", vmin=0, vmax=4, frame_width=350, colorbar=False)
display(pl_final+pl_raw)