climakitae.core package


Submodules#

climakitae.core.boundaries module#

class climakitae.core.boundaries.Boundaries(boundary_catalog)#

Bases: object

Get geospatial polygon data from the S3 stored parquet catalog. Used to access boundaries for subsetting data by state, county, etc.

Parameters:
  • _us_states (DataFrame) – Table of US state names and geometries

  • _ca_counties (DataFrame) – Table of California county names and geometries, sorted alphabetically by county name

  • _ca_watersheds (DataFrame) – Table of California watershed names and geometries, sorted alphabetically by watershed name

  • _ca_utilities (DataFrame) – Table of California investor-owned utilities (IOUs) and publicly owned utilities (POUs), with names and geometries

  • _ca_forecast_zones (DataFrame) – Table of California Demand Forecast Zones

  • _ca_electric_balancing_areas (DataFrame) – Table of Electric Balancing Areas

boundary_dict()#

Returns a dictionary of lookup dictionaries, one for each set of geoparquet files the user can choose from. It is used to repopulate the selector object dynamically whenever the category in ‘_LocSelectorArea.area_subset’ changes.

Returns:

dict
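As a rough illustration of the "dictionary of lookup dictionaries" shape described above, here is a minimal sketch. The category names, entry names, and integer values are all placeholders invented for this example, not the actual contents returned by boundary_dict(); the helper options_for is likewise hypothetical.

```python
# Hypothetical stand-in for the nested structure: one lookup dictionary
# per boundary category. Every key and value here is a placeholder.
boundary_lookups = {
    "states": {"CA": 0, "NV": 1, "OR": 2},
    "CA counties": {"Alameda County": 0, "Alpine County": 1},
    "CA watersheds": {"Example Watershed": 0},
}


def options_for(category, lookups):
    """Return the selectable names for a category, or [] if it is unknown."""
    return list(lookups.get(category, {}))


# When the selected category changes, the selector options can be
# rebuilt from the matching inner dictionary.
county_options = options_for("CA counties", boundary_lookups)
```

A nested dictionary keeps the category switch to a single lookup, which is presumably why it suits a selector whose options depend on another widget.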

load()#

climakitae.core.data_export module#

Backend functions for exporting data.

climakitae.core.data_export.export(data, filename='dataexport', format='NetCDF', mode='auto')#

Save xarray data as either a NetCDF or CSV file in the current working directory, or stream the export file to an AWS S3 scratch bucket and provide a download URL. By default, the output destination is chosen automatically based on whether the file is small enough to fit in the HUB user partition; this can be overridden with the mode parameter.

Parameters:
  • data (DataArray or Dataset) – Data to export, as output by e.g. climakitae.Select().retrieve().

  • filename (str, optional) – Output file name (without file extension, i.e. “my_filename” instead of “my_filename.nc”). The default is “dataexport”.

  • format (str, optional) – File format (“NetCDF” or “CSV”). The default is “NetCDF”.

  • mode (str, optional) – Save location logic for NetCDF file (“auto”, “local”, “s3”). The default is “auto”.
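The "auto" destination logic can be pictured with the sketch below. It is an assumption about how such a decision might be made, not the package's implementation: the function name choose_destination, its free_bytes argument, and the use of shutil.disk_usage are all introduced here for illustration.

```python
import shutil


def choose_destination(file_bytes, mode="auto", path=".", free_bytes=None):
    """Pick "local" or "s3" for an export of the given size (hypothetical helper)."""
    if mode not in ("auto", "local", "s3"):
        raise ValueError(f"mode must be 'auto', 'local' or 's3', got {mode!r}")
    if mode != "auto":
        # An explicit mode overrides the automatic choice.
        return mode
    if free_bytes is None:
        # Assumption: free space on the working partition decides the outcome.
        free_bytes = shutil.disk_usage(path).free
    return "local" if file_bytes < free_bytes else "s3"
```

Passing free_bytes explicitly makes the decision easy to exercise without depending on the state of the disk.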

climakitae.core.data_export.write_tmy_file(filename_to_export, df, location_name, station_code, stn_lat, stn_lon, stn_state, stn_elev=0.0, file_ext='tmy')#

Exports TMY data either as .epw or .tmy file

Parameters:
  • filename_to_export (str) – Filename string, constructed with station name and simulation

  • df (pd.DataFrame) – Dataframe of TMY data to export

  • location_name (str) – Location name string, often station name

  • station_code (int) – Station code

  • stn_lat (float) – Station latitude

  • stn_lon (float) – Station longitude

  • stn_state (str) – State of station location

  • stn_elev (float, optional) – Elevation of station. The default is 0.0.

  • file_ext (str, optional) – File extension for export (“tmy” or “epw”). The default is “tmy”.

Returns:

None

climakitae.core.data_interface module#

class climakitae.core.data_interface.DataInterface#

Bases: object

Load data connections into memory once

This is a singleton class called by the various Param classes to connect to the local data, the intake data catalog, and the parquet boundary catalog. The class attributes are read-only so that the data does not get changed accidentally.

property boundary_catalog#
property data_catalog#
property geographies#
property stations#
property stations_gdf#
property variable_descriptions#
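
The singleton-with-read-only-properties pattern described for DataInterface can be sketched generically as follows. The class name CatalogConnection and its placeholder station tuple are hypothetical; this shows the general technique, not the class's actual code.

```python
class CatalogConnection:
    """Hypothetical singleton exposing a read-only data handle."""

    _instance = None

    def __new__(cls):
        # Build the instance only on first use; later calls reuse it.
        if cls._instance is None:
            instance = super().__new__(cls)
            instance._stations = ("station A", "station B")  # placeholder data
            cls._instance = instance
        return cls._instance

    @property
    def stations(self):
        # No setter is defined, so assignment raises AttributeError.
        return self._stations


first = CatalogConnection()
second = CatalogConnection()
same_object = first is second

try:
    first.stations = ()
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True
```

Because the property has no setter, callers can read the shared data but cannot rebind it by accident.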
class climakitae.core.data_interface.DataParameters(**params)#

Bases: Parameterized

params(_data_warning=String, _station_data_info=String, area_average=Selector, area_subset=Selector, cached_area=ListSelector, data_type=Selector, downscaling_method=Selector, extended_description=Selector, latitude=Range, longitude=Range, resolution=Selector, scenario_historical=ListSelector, scenario_ssp=ListSelector, simulation=ListSelector, station=ListSelector, time_slice=Range, timescale=Selector, units=Selector, variable=Selector, variable_id=ListSelector, variable_type=Selector, name=String)

Python param object to hold data parameters for use in panel GUI.

Parameters of 'DataParameters':

Parameters changed from their default values are marked in red. Soft bound values are marked in cyan. C/V = Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN = Allow None.

Name                  Value                   Type          Bounds                    Mode
area_subset           None                    Selector                                V RW
cached_area           None                    ListSelector                            V RW
latitude              (32.5, 42)              Range         (10, 67)                  V RW
longitude             (-125.5, -114)          Range         (-156.82317, -84.18701)   V RW
variable_type         'Variable'              Selector                                V RW
time_slice            (1980, 2015)            Range         (1950, 2100)              V RW
resolution            '9 km'                  Selector                                V RW
timescale             'monthly'               Selector                                V RW
scenario_historical   ['Historical Climate']  ListSelector                            V RW
area_average          'No'                    Selector                                V RW
downscaling_method    'Dynamical'             Selector                                V RW
data_type             'Gridded'               Selector                                V RW
station               None                    ListSelector                            V RW
_station_data_info    ''                      String                                  V RW
scenario_ssp          None                    ListSelector                            V RW
simulation            None                    ListSelector                            V RW
variable              None                    Selector                                V RW
units                 None                    Selector                                V RW
extended_description  None                    Selector                                V RW
variable_id           None                    ListSelector                            V RW
_data_warning         ''                      String                                  V RW

Parameter docstrings:

  • variable_type: Choose between variable or AE derived index

  • area_average: Compute an area average?

  • _station_data_info: Information about the bias correction process and resolution

  • _data_warning: Warning if user has made a bad selection

  • All other parameters: < No docstring available >

area_average = 'No'#
area_subset = None#
cached_area = None#
data_type = 'Gridded'#
default_variable = 'Air Temperature at 2m'#
downscaling_method = 'Dynamical'#
extended_description = None#
historical_climate_range_loca = (1950, 2015)#
historical_climate_range_wrf = (1980, 2015)#
historical_climate_range_wrf_and_loca = (1981, 2015)#
historical_reconstruction_range = (1950, 2022)#
latitude = (32.5, 42)#
longitude = (-125.5, -114)#
name = 'DataParameters'#
resolution = '9 km'#
retrieve(config=None, merge=True)#

Retrieve data from catalog

By default, DataParameters determines the data retrieved. To retrieve data using the settings in a configuration csv file instead, set config to the local filepath of the csv. The data is read from the AWS S3 bucket and returned as a lazily loaded dask array. This is a user-facing wrapper for read_catalog_from_csv and read_catalog_from_select.

Parameters:
  • config (str, optional) – Local filepath to a configuration csv file. Defaults to None, in which case the settings in selections are used.

  • merge (bool, optional) – If config is provided and multiple datasets are desired, merge them to form a single object? Defaults to True.

Returns:

  • DataArray – Lazily loaded dask array. Default if no config file is provided.

  • Dataset – If multiple rows are in the csv, each row is a data_variable. Only an option if a config file is provided.

  • list of DataArray – If multiple rows are in the csv and merge=False, multiple DataArrays are returned in a single list. Only an option if a config file is provided.

scenario_historical = ['Historical Climate']#
scenario_ssp = None#
simulation = None#
ssp_range = (2015, 2100)#
station = None#
time_slice = (1980, 2015)#
timescale = 'monthly'#
unit_options_dict = {'K': ['K', 'degC', 'degF'], 'Pa': ['Pa', 'hPa', 'mb', 'inHg'], '[0 to 100]': ['[0 to 100]', 'fraction'], 'degC': ['K', 'degC', 'degF'], 'degF': ['K', 'degC', 'degF'], 'g/kg': ['g/kg', 'kg/kg'], 'hPa': ['Pa', 'hPa', 'mb', 'inHg'], 'kg kg-1': ['kg kg-1', 'g kg-1'], 'kg m-2 s-1': ['kg m-2 s-1', 'mm', 'inches'], 'kg/kg': ['kg/kg', 'g/kg'], 'm s-1': ['m s-1', 'mph', 'knots'], 'm/s': ['m/s', 'mph', 'knots'], 'mm': ['mm', 'inches']}#
units = None#
variable = None#
variable_id = None#
variable_type = 'Variable'#
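
To show how unit_options_dict above might be consulted, the sketch below copies a few of its entries and looks up the units offered for a variable's native unit. The helper unit_choices and its fallback to the native unit alone are assumptions made for this example.

```python
# A subset of the unit_options_dict entries listed above.
unit_options_dict = {
    "K": ["K", "degC", "degF"],
    "Pa": ["Pa", "hPa", "mb", "inHg"],
    "m s-1": ["m s-1", "mph", "knots"],
    "mm": ["mm", "inches"],
}


def unit_choices(native_unit):
    """Return the units selectable for a native unit (hypothetical helper).

    Assumption: a unit with no entry offers only itself.
    """
    return unit_options_dict.get(native_unit, [native_unit])


temperature_units = unit_choices("K")
```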
class climakitae.core.data_interface.DataParametersWithPanes(**params)#

Bases: DataParameters

params(_data_warning=String, _station_data_info=String, area_average=Selector, area_subset=Selector, cached_area=ListSelector, data_type=Selector, downscaling_method=Selector, extended_description=Selector, latitude=Range, longitude=Range, resolution=Selector, scenario_historical=ListSelector, scenario_ssp=ListSelector, simulation=ListSelector, station=ListSelector, time_slice=Range, timescale=Selector, units=Selector, variable=Selector, variable_id=ListSelector, variable_type=Selector, name=String)

Extends DataParameters class to include panel widgets that display the time scale and a map overview.

Parameters of 'DataParametersWithPanes':

Parameters changed from their default values are marked in red. Soft bound values are marked in cyan. C/V = Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN = Allow None.

Name                  Value                   Type          Bounds                    Mode
area_subset           None                    Selector                                V RW
cached_area           None                    ListSelector                            V RW
latitude              (32.5, 42)              Range         (10, 67)                  V RW
longitude             (-125.5, -114)          Range         (-156.82317, -84.18701)   V RW
variable_type         'Variable'              Selector                                V RW
time_slice            (1980, 2015)            Range         (1950, 2100)              V RW
resolution            '9 km'                  Selector                                V RW
timescale             'monthly'               Selector                                V RW
scenario_historical   ['Historical Climate']  ListSelector                            V RW
area_average          'No'                    Selector                                V RW
downscaling_method    'Dynamical'             Selector                                V RW
data_type             'Gridded'               Selector                                V RW
station               None                    ListSelector                            V RW
_station_data_info    ''                      String                                  V RW
scenario_ssp          None                    ListSelector                            V RW
simulation            None                    ListSelector                            V RW
variable              None                    Selector                                V RW
units                 None                    Selector                                V RW
extended_description  None                    Selector                                V RW
variable_id           None                    ListSelector                            V RW
_data_warning         ''                      String                                  V RW

Parameter docstrings:

  • variable_type: Choose between variable or AE derived index

  • area_average: Compute an area average?

  • _station_data_info: Information about the bias correction process and resolution

  • _data_warning: Warning if user has made a bad selection

  • All other parameters: < No docstring available >

map_view()#

Create a map of the location selections

name = 'DataParametersWithPanes'#
scenario_view()#

Displays a timeline to help the user visualize the time ranges available, and the subset of time slice selected.

class climakitae.core.data_interface.Select(**params)#

Bases: DataParametersWithPanes

params(_data_warning=String, _station_data_info=String, area_average=Selector, area_subset=Selector, cached_area=ListSelector, data_type=Selector, downscaling_method=Selector, extended_description=Selector, latitude=Range, longitude=Range, resolution=Selector, scenario_historical=ListSelector, scenario_ssp=ListSelector, simulation=ListSelector, station=ListSelector, time_slice=Range, timescale=Selector, units=Selector, variable=Selector, variable_id=ListSelector, variable_type=Selector, name=String)

Parameters of 'Select':

Parameters changed from their default values are marked in red. Soft bound values are marked in cyan. C/V = Constant/Variable, RO/RW = ReadOnly/ReadWrite, AN = Allow None.

Name                  Value                   Type          Bounds                    Mode
area_subset           None                    Selector                                V RW
cached_area           None                    ListSelector                            V RW
latitude              (32.5, 42)              Range         (10, 67)                  V RW
longitude             (-125.5, -114)          Range         (-156.82317, -84.18701)   V RW
variable_type         'Variable'              Selector                                V RW
time_slice            (1980, 2015)            Range         (1950, 2100)              V RW
resolution            '9 km'                  Selector                                V RW
timescale             'monthly'               Selector                                V RW
scenario_historical   ['Historical Climate']  ListSelector                            V RW
area_average          'No'                    Selector                                V RW
downscaling_method    'Dynamical'             Selector                                V RW
data_type             'Gridded'               Selector                                V RW
station               None                    ListSelector                            V RW
_station_data_info    ''                      String                                  V RW
scenario_ssp          None                    ListSelector                            V RW
simulation            None                    ListSelector                            V RW
variable              None                    Selector                                V RW
units                 None                    Selector                                V RW
extended_description  None                    Selector                                V RW
variable_id           None                    ListSelector                            V RW
_data_warning         ''                      String                                  V RW

Parameter docstrings:

  • variable_type: Choose between variable or AE derived index

  • area_average: Compute an area average?

  • _station_data_info: Information about the bias correction process and resolution

  • _data_warning: Warning if user has made a bad selection

  • All other parameters: < No docstring available >

name = 'Select'#
show()#
class climakitae.core.data_interface.VariableDescriptions#

Bases: object

Load Variable Descriptions CSV only once

This is a singleton class that needs to be called separately from DataInterface because variable descriptions are used without DataInterface in ck.view. Also ck.view is loaded on package load so this avoids loading boundary data when not needed.

load()#

climakitae.core.data_load module#

Backend functions for retrieving and subsetting data from the AE catalog

climakitae.core.data_load.area_subset_geometry(selections)#

Get geometry to perform area subsetting with.

Parameters:

selections (DataParameters) – object holding user’s selections

Returns:

ds_region (shapely.geometry) – geometry to use for subsetting

climakitae.core.data_load.downscaling_method_to_activity_id(downscaling_method, reverse=False)#

Convert downscaling method to activity id to match catalog names. Set reverse=True to get downscaling method from input activity_id.

climakitae.core.data_load.load(xr_da)#

Read data into memory

Parameters:

xr_da (xarray.DataArray)

Returns:

da_computed (xarray.DataArray)

climakitae.core.data_load.read_catalog_from_csv(selections, csv, merge=True)#

Retrieve user data selections from csv input.

Allows user to bypass ck.Select() GUI and allows developers to pre-set inputs in a csv file for ease of use in a notebook.

Parameters:
  • selections (DataParameters) – Data settings (variable, unit, timescale, etc).

  • csv (str) – Filepath to local csv file.

  • merge (bool, optional) – If multiple datasets are desired, merge them to form a single object? Defaults to True.

Returns:

One of the following, depending on csv input and merge:

  • xr_ds (Dataset) – if multiple rows are in the csv, each row is a data_variable

  • xr_da (DataArray) – if csv only has one row

  • xr_list (list of xr.DataArrays) – if multiple rows are in the csv and merge=False, multiple DataArrays are returned in a single list.

climakitae.core.data_load.read_catalog_from_select(selections)#

The primary and first data loading method, called by core.Application.retrieve. It returns a DataArray (which can be quite large) containing everything requested by the user, as stored in ‘selections’.

Parameters:

selections (DataParameters) – object holding user’s selections

Returns:

da (DataArray) – output data

climakitae.core.data_load.resolution_to_gridlabel(resolution, reverse=False)#

Convert resolution format to grid_label format matching catalog names. Set reverse=True to get resolution format from input grid_label.

climakitae.core.data_load.timescale_to_table_id(timescale, reverse=False)#

Convert timescale format to table_id format matching catalog names. Set reverse=True to get timescale format from input table_id.
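
The three converters above share one idea: a forward lookup that can be inverted with reverse=True. A minimal sketch of that pattern follows. The generic helper convert and the mapping values ("mon", "day", "1hr") are illustrative assumptions, not values confirmed by this page.

```python
def convert(value, mapping, reverse=False):
    """Look up value in mapping, or in its inverse when reverse=True."""
    if reverse:
        # Invert the mapping; this assumes its values are unique.
        mapping = {target: source for source, target in mapping.items()}
    return mapping[value]


# Illustrative mapping only; the real catalog names may differ.
TIMESCALE_TO_TABLE_ID = {"monthly": "mon", "daily": "day", "hourly": "1hr"}

table_id = convert("monthly", TIMESCALE_TO_TABLE_ID)
timescale = convert("day", TIMESCALE_TO_TABLE_ID, reverse=True)
```

Keeping a single mapping and inverting it on demand avoids maintaining two tables that could drift apart.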

climakitae.core.data_view module#

Backend function for creating generic visualizations of xarray DataArray.

climakitae.core.data_view.compute_vmin_vmax(da_min, da_max)#

Compute min, max, and center for plotting

climakitae.core.data_view.view(data, lat_lon=True, width=None, height=None, cmap=None)#

Create a generic visualization of the data

Visualization will depend on the shape of the input data. Works much faster if the data has already been loaded into memory.

Parameters:
  • data (DataArray) – Input data

  • lat_lon (bool, optional) – Reproject to lat/lon coords? Defaults to True.

  • width (int, optional) – Width of plot. Defaults to the hvplot default.

  • height (int, optional) – Height of plot. Defaults to the hvplot.image default.

  • cmap (matplotlib colormap name or AE colormap names) – Colormap to apply to data. Defaults to “ae_orange” for mapped data or the color-blind friendly “categorical_cb” for timeseries data.

Returns:

Raises:

UserWarning – Warn user that the function will be slow if data has not been loaded into memory
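
The warning behaviour described above can be sketched with the standard warnings module. The function warn_if_not_loaded and its in_memory flag are hypothetical; how view actually detects lazily loaded data is not stated here.

```python
import warnings


def warn_if_not_loaded(in_memory):
    """Emit a UserWarning when data is still lazy; return True if warned."""
    if in_memory:
        return False
    warnings.warn(
        "Data is not loaded into memory; plotting may be slow.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

A warning rather than an exception lets the plot proceed while still telling the user why it is slow.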

climakitae.core.paths module#

This module defines package level paths

Module contents#