climakitae.explore package
Submodules
climakitae.explore.agnostic module
Backend for agnostic tools.
- climakitae.explore.agnostic.agg_area_subset_sims(area_subset, cached_area, downscaling_method, variable, agg_func, units, years, months=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], wrf_timescale='monthly')
Combines all available WRF or LOCA simulation data filtered by area_subset (a string from the existing keys in Boundaries.boundary_dict()) and by one specific area within that subset (cached_area). It then extracts this data across all SSP pathways for the given years and months, and runs the passed-in agg_func on all of it. Three values are returned: a dict mapping statistic names to single-simulation xr.DataArray objects (e.g. the median), a dict mapping statistic names to xr.DataArray objects made up of multiple simulations (e.g. the middle 10%), and an xr.DataArray of the simulations’ aggregated values sorted in ascending order.
- Parameters:
area_subset (str) – Describes the category of the boundaries of interest (e.g. “CA Electric Load Serving Entities (IOU & POU)”)
cached_area (str) – Describes the specific area of interest (e.g. “Southern California Edison”)
agg_func (str) – The metric to aggregate the simulations by.
years (tuple) – The lower and upper year bounds (inclusive) to extract simulation data by.
months (list, optional) – Specific months of interest. The default is all months.
- Returns:
single_stats (dict of str: DataArray) – Dictionary mapping string names of statistics to single simulation xr.DataArray objects.
multiple_stats (dict of str: DataArray) – Dictionary mapping string names of statistics to multiple simulations xr.DataArray objects.
results (DataArray) – Aggregated results of running the given aggregation function on the area of interest. Results are also sorted in ascending order.
- climakitae.explore.agnostic.agg_lat_lon_sims(lat: float | Tuple[float, float], lon: float | Tuple[float, float], downscaling_method, variable, agg_func, units, years, months=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], wrf_timescale='monthly')
Gets aggregated WRF or LOCA simulation data for a lat/lon coordinate or lat/lon range for a given metric and timeframe (years, months). It combines all selected simulation data that is filtered by lat/lon, years, and specific months across SSP pathways and runs the passed in metric on all of the data. The results are then returned in ascending order, along with dictionaries mapping specific statistic names to the simulation objects themselves.
- Parameters:
lat (float) – Latitude for specific location of interest.
lon (float) – Longitude for specific location of interest.
agg_func (str) – The function to aggregate the simulations by.
years (tuple) – The lower and upper year bounds (inclusive) to subset simulation data by.
months (list, optional) – Specific months of interest. The default is all months.
- Returns:
single_stats (dict of str: DataArray) – Dictionary mapping string names of statistics to single simulation xr.DataArray objects.
multiple_stats (dict of str: DataArray) – Dictionary mapping string names of statistics to multiple simulations xr.DataArray objects.
results (DataArray) – Aggregated results of running the given aggregation function on the lat/lon gridcell of interest. Results are also sorted in ascending order.
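The three return values described for the two aggregation functions above can be illustrated with a minimal, library-free sketch. The simulation names, the sample values, and the helper `summarize_sims` are hypothetical; the real functions operate on xr.DataArray objects, and their statistic names and the way they select the middle 10% may differ.

```python
from statistics import mean

def summarize_sims(sim_values, agg_func=mean):
    """Aggregate each simulation, sort ascending, and pick summary statistics.

    sim_values: dict mapping a (hypothetical) simulation name to a list of values.
    """
    # One aggregated value per simulation, sorted in ascending order.
    results = sorted(
        ((name, agg_func(vals)) for name, vals in sim_values.items()),
        key=lambda item: item[1],
    )
    n = len(results)
    # Single-simulation statistics: min, max, and the (lower) median simulation.
    single_stats = {
        "min": results[0],
        "median": results[(n - 1) // 2],
        "max": results[-1],
    }
    # Multiple-simulation statistic: the simulations in the middle 10% of the
    # ranking (always at least one simulation).
    k = max(1, round(0.10 * n))
    start = (n - k) // 2
    multiple_stats = {"middle 10%": results[start:start + k]}
    return single_stats, multiple_stats, results

sims = {
    "sim_a": [20.0, 22.0],  # mean 21.0
    "sim_b": [18.0, 19.0],  # mean 18.5
    "sim_c": [25.0, 27.0],  # mean 26.0
}
single_stats, multiple_stats, results = summarize_sims(sims)
```

Passing a different `agg_func` (for example `max`) changes the per-simulation value but not the sorting or selection logic.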
- climakitae.explore.agnostic.create_lookup_tables()
Create lookup tables for converting between warming level and time.
- Returns:
dict of pandas.DataFrame – A dictionary containing two dataframes: “time lookup table”, which maps warming levels to their occurrence times for each GCM simulation we catalog, and “warming level lookup table”, which contains yearly warming levels for those simulations.
- climakitae.explore.agnostic.get_available_units(variable, downscaling_method, wrf_timescale='monthly')
Get the other units available for the given variable.
- climakitae.explore.agnostic.show_available_vars(downscaling_method, wrf_timescale='monthly')
Shows the available variables for the given downscaling method.
- climakitae.explore.agnostic.warm_level_to_month(time_df, scenario, warming_level)
Given a warming level, return the corresponding month.
- climakitae.explore.agnostic.year_to_warm_levels(warm_df, scenario, year)
Given a year, return the warming levels and their median.
climakitae.explore.amy module
Calculates the Average Meteorological Year (AMY) and Severe Meteorological Year (SMY) for the Cal-Adapt: Analytics Engine. The historical baseline uses a standard climatological period (1981-2010). The SSP3-7.0 future scenario uses a 30-year window around the point at which a designated warming level (1.5°C, 2°C, or 3°C) is exceeded. The AMY is comparable to a typical meteorological year, but does not follow the same full methodology.
- climakitae.explore.amy.compute_amy(data, days_in_year=366, show_pbar=False)
Calculates the average meteorological year based on a designated period of time.
Applicable for both the historical and future periods.
- Parameters:
data (DataArray) – Hourly data for one variable
days_in_year (int, optional) – Either 366 or 365, depending on whether or not the year is a leap year. Defaults to 366 days (leap year).
show_pbar (bool, optional) – Whether to show a progress bar. Defaults to False. The progress bar is useful when calling this function within a notebook.
- Returns:
DataFrame – Average meteorological year table, with days of year as the index and hour of day as the columns.
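As a rough illustration of the table layout described above (days of year as the index, hour of day as the columns), the sketch below averages hourly values across years for each (day of year, hour) pair. This is a simplified assumption about the averaging step, not the climakitae implementation; the helper `mean_meteo_year` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def mean_meteo_year(records):
    """records: iterable of (datetime, value) hourly samples spanning several years.

    Returns {day_of_year: {hour: mean value across years}}.
    """
    # Collect every value that falls on the same day of year and hour of day.
    buckets = defaultdict(list)
    for stamp, value in records:
        day_of_year = stamp.timetuple().tm_yday
        buckets[(day_of_year, stamp.hour)].append(value)
    # Reduce each bucket to its mean and arrange as a day-by-hour table.
    table = defaultdict(dict)
    for (day_of_year, hour), values in buckets.items():
        table[day_of_year][hour] = mean(values)
    return dict(table)

records = [
    (datetime(1981, 1, 1, 0), 10.0),
    (datetime(1982, 1, 1, 0), 12.0),
    (datetime(1981, 1, 1, 1), 9.0),
    (datetime(1982, 1, 1, 1), 11.0),
]
amy_table = mean_meteo_year(records)
```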
- climakitae.explore.amy.compute_mean_monthly_meteo_yr(tmy_df, col_name='mean_value')
Compute mean monthly values for input meteorological year data.
- Parameters:
- Returns:
DataFrame – Table with month as index and monthly mean as column
- climakitae.explore.amy.compute_severe_yr(data, days_in_year=366, show_pbar=False)
Calculate the severe meteorological year based on the 90th percentile of data.
Applicable for both the historical and future periods.
- Parameters:
data (DataArray) – Hourly data for one variable
days_in_year (int, optional) – Either 366 or 365, depending on whether or not the year is a leap year. Defaults to 366 days (leap year).
show_pbar (bool, optional) – Whether to show a progress bar. Defaults to False. The progress bar is useful when calling this function within a notebook.
- Returns:
DataFrame – Severe meteorological year table, with days of year as the index and hour of day as the columns.
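The severe year is based on the 90th percentile rather than an average. Below is a minimal sketch of a 90th-percentile reduction over the values collected for one (day of year, hour) cell, using linear interpolation between order statistics. The interpolation rule and the helper `percentile` are assumptions and may not match what `compute_severe_yr` does internally.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Values for one (day of year, hour) cell across 11 hypothetical years.
cell_values = [float(v) for v in range(20, 31)]  # 20.0, 21.0, ..., 30.0
severe_value = percentile(cell_values, 90)
```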
- climakitae.explore.amy.retrieve_meteo_yr_data(self, ssp=None, year_start=2015, year_end=None)
Backend function for retrieving data needed for computing a meteorological year.
Reads in the hourly ensemble means instead of the hourly data. Reads in future SSP data, historical climate data, or a combination of both, depending on year_start and year_end
- Parameters:
self (AverageMetYearParameters)
ssp (str) – Shared Socioeconomic Pathway; one of “SSP 2-4.5 – Middle of the Road”, “SSP 3-7.0 – Business as Usual”, “SSP 5-8.5 – Burn it All”. Defaults to “SSP 3-7.0 – Business as Usual”.
year_start (int, optional) – Year between 1980-2095. Defaults to 2015.
year_end (int, optional) – Year between 1985-2100. Defaults to year_start+30.
- Returns:
DataArray – Hourly ensemble means from year_start-year_end for the ssp specified.
climakitae.explore.threshold_tools module
Helper functions for performing analyses related to thresholds
- climakitae.explore.threshold_tools.calculate_ess(data, nlags=None)
Function for calculating the effective sample size (ESS) of the provided data.
- Parameters:
- Returns:
DataArray – Effective sample size. Returned as a DataArray object so it can be utilized by xr.groupby and xr.resample.
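For intuition, one common textbook definition of effective sample size discounts the number of samples by the lag autocorrelations: n_eff = n / (1 + 2 Σ ρ_k). The sketch below implements that formula in plain Python, truncating the sum at the first non-positive autocorrelation. Whether `calculate_ess` uses this exact estimator is an assumption, and the helper names are hypothetical.

```python
from statistics import mean

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    mu = mean(x)
    denom = sum((v - mu) ** 2 for v in x)
    num = sum((x[i] - mu) * (x[i + lag] - mu) for i in range(len(x) - lag))
    return num / denom

def effective_sample_size(x, nlags=None):
    """n / (1 + 2 * sum of the leading positive autocorrelations)."""
    n = len(x)
    nlags = n - 1 if nlags is None else min(nlags, n - 1)
    total = 0.0
    for lag in range(1, nlags + 1):
        rho = autocorr(x, lag)
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        total += rho
    return n / (1 + 2 * total)

alternating = [1.0, -1.0] * 10            # negatively correlated at lag 1
trending = [float(i) for i in range(20)]  # strongly positively correlated

ess_alternating = effective_sample_size(alternating)
ess_trending = effective_sample_size(trending)
```

A strongly autocorrelated series carries less independent information, so its effective sample size falls below the raw sample count.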
- climakitae.explore.threshold_tools.exceedance_plot_subtitle(exceedance_count)
Function to build the exceedance plot subtitle.
Helper function for making the subtitle for exceedance plots.
- Parameters:
exceedance_count (xarray.DataArray)
- Returns:
Examples
‘Number of hours per year’
‘Number of 4-hour events per 3-months’
‘Number of days per year with conditions lasting at least 4-hours’
- climakitae.explore.threshold_tools.exceedance_plot_title(exceedance_count)
Function to build the title for exceedance plots.
Helper function for making the title for exceedance plots.
- Parameters:
exceedance_count (xarray.DataArray)
- Returns:
Examples
‘Air Temperature at 2m: events above 35C’
‘Precipitation (total): events below 10mm’
- climakitae.explore.threshold_tools.get_block_maxima(da_series, extremes_type='max', duration=None, groupby=None, grouped_duration=None, check_ess=True, block_size=1)
Function that converts data into block maxima, defaulting to annual maxima (default block size = 1 year).
Takes the input array and resamples it by taking the maximum value over the specified block size.
Optional arguments duration, groupby, and grouped_duration define the type of event to find the annual maxima of. These correspond to the event types defined in the get_exceedance_count function.
- Parameters:
da_series (xarray.DataArray) – DataArray from retrieve
extremes_type (str) – Option for max or min. Defaults to max.
duration (tuple) – Length of extreme event, specified as (4, ‘hour’)
groupby (tuple) – Group over which to look for max occurrence, specified as (1, ‘day’)
grouped_duration (tuple) – Length of event after grouping, specified as (5, ‘day’)
check_ess (boolean) – Optional flag specifying whether to check the effective sample size (ESS) within the blocks of data, and throw a warning if the average ESS is too small. Can be silenced with check_ess=False.
block_size (int) – Block size in years. Default is 1 year.
- Returns:
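The core resampling idea behind block maxima can be sketched without xarray: group the samples into blocks of `block_size` years and keep the extreme value of each block. The helper `block_maxima` and the sample series are hypothetical, and the duration and grouping options of `get_block_maxima` are deliberately left out.

```python
from collections import defaultdict
from datetime import date

def block_maxima(series, extremes_type="max", block_size=1):
    """series: list of (date, value). Returns {block_start_year: extreme value}."""
    if extremes_type not in ("max", "min"):
        raise ValueError("extremes_type must be 'max' or 'min'")
    pick = max if extremes_type == "max" else min
    first_year = min(day.year for day, _ in series)
    blocks = defaultdict(list)
    for day, value in series:
        # Assign each sample to the block of `block_size` years it falls in.
        offset = (day.year - first_year) // block_size
        blocks[first_year + offset * block_size].append(value)
    return {start: pick(values) for start, values in sorted(blocks.items())}

series = [
    (date(2000, 1, 15), 30.1),
    (date(2000, 7, 20), 41.3),
    (date(2001, 2, 3), 28.4),
    (date(2001, 8, 9), 39.9),
]
annual_max = block_maxima(series)
two_year_min = block_maxima(series, extremes_type="min", block_size=2)
```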
- climakitae.explore.threshold_tools.get_exceedance_count(da, threshold_value, duration1=None, period=(1, 'year'), threshold_direction='above', duration2=None, groupby=None, smoothing=None)
Calculate the number of occurrences of exceeding the specified threshold within each period.
Returns an xarray.DataArray with the same coordinates as the input data except for the time dimension, which will be collapsed to one value per period (equal to the number of event occurrences in each period).
- Parameters:
da (xarray.DataArray) – Array of some climate variable. Can have multiple scenarios, simulations, or x and y coordinates.
threshold_value (float) – Value against which to test exceedance
period (tuple) – Amount of time across which to sum the number of occurrences, default is (1, “year”). Specified as a tuple: (x, time) where x is an integer, and time is one of: [“day”, “month”, “year”]
threshold_direction (str) – Either “above” or “below”, default is above.
duration1 (tuple) – Length of exceedance in order to qualify as an event (before grouping)
groupby (tuple) – See examples for explanation. Typical grouping could be (1, “day”)
duration2 (tuple) – Length of exceedance in order to qualify as an event (after grouping)
smoothing (int) – Option to average the result across multiple periods with a rolling average; value is either None or the number of timesteps to use as the window size
- Returns:
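In the simplest case (no duration, grouping, or smoothing), counting exceedances reduces to tallying the samples beyond the threshold within each period. The sketch below does this per calendar year in plain Python. The helper `exceedance_count`, the strict inequality, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

def exceedance_count(series, threshold_value, threshold_direction="above"):
    """Count samples per calendar year that lie beyond the threshold."""
    if threshold_direction == "above":
        hit = lambda v: v > threshold_value
    elif threshold_direction == "below":
        hit = lambda v: v < threshold_value
    else:
        raise ValueError("threshold_direction must be 'above' or 'below'")
    counts = Counter()
    for day, value in series:
        # Adding 0 still registers the year, so event-free years report 0.
        counts[day.year] += int(hit(value))
    return dict(counts)

series = [
    (date(2000, 7, 1), 36.0),
    (date(2000, 7, 2), 34.0),
    (date(2000, 7, 3), 38.0),
    (date(2001, 7, 1), 30.0),
    (date(2001, 7, 2), 33.0),
]
hot_days = exceedance_count(series, 35.0)
cool_days = exceedance_count(series, 34.0, threshold_direction="below")
```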
- climakitae.explore.threshold_tools.get_ks_stat(bms, distr='gev', multiple_points=True)
Function to perform a KS test on the input DataArray.
Creates a dataset of KS test d-statistics and p-values from an input maximum series.
- Parameters:
bms (xarray.DataArray) – Block maximum series, can be output from the function get_block_maxima()
distr (str)
multiple_points (boolean)
- Returns:
- climakitae.explore.threshold_tools.get_return_period(bms, return_value, distr='gev', bootstrap_runs=100, conf_int_lower_bound=2.5, conf_int_upper_bound=97.5, multiple_points=True)
Creates xarray Dataset with return periods and confidence intervals from maximum series.
- Parameters:
bms (xarray.DataArray) – Block maximum series, can be output from the function get_block_maxima()
return_value (float) – The threshold value for which to calculate the return period of occurrence
distr (str) – The type of extreme value distribution to fit
bootstrap_runs (int) – Number of bootstrap samples
conf_int_lower_bound (float) – Confidence interval lower bound
conf_int_upper_bound (float) – Confidence interval upper bound
multiple_points (boolean) – Whether or not the data contains multiple points (has x, y dimensions)
- Returns:
xarray.Dataset – Dataset with return periods and confidence intervals
- climakitae.explore.threshold_tools.get_return_prob(bms, threshold, distr='gev', bootstrap_runs=100, conf_int_lower_bound=2.5, conf_int_upper_bound=97.5, multiple_points=True)
Creates xarray Dataset with return probabilities and confidence intervals from maximum series.
- Parameters:
bms (xarray.DataArray) – Block maximum series, can be output from the function get_block_maxima()
threshold (float) – The threshold value for which to calculate the probability of exceedance
distr (str) – The type of extreme value distribution to fit
bootstrap_runs (int) – Number of bootstrap samples
conf_int_lower_bound (float) – Confidence interval lower bound
conf_int_upper_bound (float) – Confidence interval upper bound
multiple_points (boolean) – Whether or not the data contains multiple points (has x, y dimensions)
- Returns:
xarray.Dataset – Dataset with return probabilities and confidence intervals
- climakitae.explore.threshold_tools.get_return_value(bms, return_period=10, distr='gev', bootstrap_runs=100, conf_int_lower_bound=2.5, conf_int_upper_bound=97.5, multiple_points=True)
Creates xarray Dataset with return values and confidence intervals from maximum series.
- Parameters:
bms (xarray.DataArray) – Block maximum series, can be output from the function get_block_maxima()
return_period (float) – The recurrence interval (in years) for which to calculate the return value
distr (str) – The type of extreme value distribution to fit
bootstrap_runs (int) – Number of bootstrap samples
conf_int_lower_bound (float) – Confidence interval lower bound
conf_int_upper_bound (float) – Confidence interval upper bound
multiple_points (boolean) – Whether or not the data contains multiple points (has x, y dimensions)
- Returns:
xarray.Dataset – Dataset with return values and confidence intervals
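Return values, return periods, and return probabilities are three views of the same fitted distribution. The sketch below shows how they relate for a GEV distribution with already-known parameters, using the standard parameterization with location `mu`, scale `sigma`, and shape `xi`. The parameter values and function names are hypothetical, the fitting and bootstrap steps are omitted, and sign conventions for the shape parameter vary between libraries.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (xi is the shape parameter)."""
    if xi == 0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1 + xi * (x - mu) / sigma
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1 / xi))

def gev_return_value(period, mu, sigma, xi):
    """Value exceeded on average once every `period` blocks (e.g. years)."""
    p = 1 - 1 / period  # non-exceedance probability
    if xi == 0:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1)

def gev_return_prob(threshold, mu, sigma, xi):
    """Probability that a block maximum exceeds `threshold`."""
    return 1 - gev_cdf(threshold, mu, sigma, xi)

def gev_return_period(value, mu, sigma, xi):
    """Average number of blocks between exceedances of `value`."""
    return 1 / gev_return_prob(value, mu, sigma, xi)

params = dict(mu=40.0, sigma=2.0, xi=0.1)  # hypothetical fitted parameters
x10 = gev_return_value(10, **params)       # 1-in-10 return value
x100 = gev_return_value(100, **params)     # 1-in-100 return value
period_back = gev_return_period(x10, **params)
prob_10 = gev_return_prob(x10, **params)
```

Feeding a return value back through `gev_return_period` recovers the period it was computed for, which is a useful consistency check on any implementation.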
climakitae.explore.thresholds module
- climakitae.explore.thresholds.get_threshold_data(self)
This function pulls data from the catalog and reads it into memory.
- Parameters:
selections (DataParameters) – Object holding the user’s selections
cat (intake_esm.core.esm_datastore) – Data catalog
- Returns:
data (DataArray) – Data to use for creating postage stamp data
climakitae.explore.timeseries module
- class climakitae.explore.timeseries.TimeSeries(data)
Bases: object
Holds the instance of TimeSeriesParameters that is used for the following purposes: 1) to display a panel that previews various time-series transforms (explore), and 2) to save the transform represented by the current state of that preview into a new variable (output_current).
- class climakitae.explore.timeseries.TimeSeriesParameters(dataset, **params)
Bases: Parameterized
Python Param class that holds the parameters for Time Series.
- anomaly = True
- extremes = []
- name = 'TimeSeriesParameters'
- num_timesteps = 0
- percentile = 0
- reference_range = (datetime.datetime(1981, 1, 1, 0, 0), datetime.datetime(2010, 12, 31, 0, 0))
- remove_seasonal_cycle = False
- resample_period = 'years'
- resample_window = 1
- separate_seasons = False
- smoothing = 'None'
- transform_data()
Returns a dataset that has been transformed in the ways that the params indicate, ready to plot in the preview window (“view” method of this class), or be saved out.
- update_anom()
- update_seasonal_cycle()
climakitae.explore.uncertainty module
- class climakitae.explore.uncertainty.CmipOpt(variable='tas', area_subset='states', location='California', timescale='monthly', area_average=True)
Bases: object
A class for holding relevant data options for CMIP preprocessing
- Parameters:
- _cmip_clip()
CMIP6-specific subsetting
- climakitae.explore.uncertainty.calc_anom(ds_yr, base_start, base_end)
Calculates the difference relative to a historical baseline.
First calculates a baseline per simulation using input (base_start, base_end). Then calculates the anomaly from baseline per simulation.
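The two steps just described (a per-simulation baseline, then the anomaly relative to it) can be sketched with plain dictionaries. The helper `calc_anomaly`, the data layout, and the inclusive treatment of the baseline years are assumptions; the real function works on xarray objects.

```python
from statistics import mean

def calc_anomaly(yearly, base_start, base_end):
    """yearly: {simulation: {year: value}}. Baseline years are inclusive."""
    anomalies = {}
    for sim, series in yearly.items():
        # Step 1: baseline for this simulation only.
        baseline = mean(
            v for yr, v in series.items() if base_start <= yr <= base_end
        )
        # Step 2: difference of every year from that simulation's baseline.
        anomalies[sim] = {yr: v - baseline for yr, v in series.items()}
    return anomalies

yearly = {
    "sim_a": {1981: 14.0, 1982: 16.0, 2050: 18.0},
    "sim_b": {1981: 10.0, 1982: 10.0, 2050: 13.5},
}
anoms = calc_anomaly(yearly, 1981, 1982)
```

Computing the baseline per simulation, rather than across simulations, removes each model's own bias before the anomalies are compared.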
- climakitae.explore.uncertainty.cmip_mmm(ds)
Calculate the CMIP6 multi-model mean by collapsing across simulations.
- climakitae.explore.uncertainty.compute_vmin_vmax(da_min, da_max)
Computes min, max, and center for plotting.
- climakitae.explore.uncertainty.get_ensemble_data(variable, selections, cmip_names, warm_level=3.0)
Returns processed data from multiple CMIP6 models for uncertainty analysis.
Searches the CMIP6 data catalog for data from models that have a specific ensemble member id in the historical and ssp370 runs. Preprocessing includes subsetting for a specific location and dropping the member_id for easier analysis.
Gets future data at the warming level range. Slices the historical period to 1981-2010.
- climakitae.explore.uncertainty.get_ks_pval_df(sample1, sample2, sig_lvl=0.05)
Performs a Kolmogorov-Smirnov test at all lat, lon points
- climakitae.explore.uncertainty.get_warm_level(warm_level, ds, multi_ens=False, ipcc=True)
Subsets projected data centered on the year that the selected warming level is reached for a particular simulation/member_id
- Parameters:
- Returns:
Dataset – Subset of projected data -14/+15 years from warming level threshold
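The -14/+15 year subset gives a 30-year window around the year in which the warming level is reached. A minimal sketch of that arithmetic follows. The helper `warming_window` is hypothetical, and returning None when the window would run past the last available year (assumed here to be 2100) is an illustrative choice rather than documented behavior of `get_warm_level`.

```python
def warming_window(crossing_year, years_before=14, years_after=15, last_year=2100):
    """Inclusive (start, end) years around the crossing year.

    Returns None if the window would extend past `last_year` (an assumption).
    """
    start = crossing_year - years_before
    end = crossing_year + years_after
    if end > last_year:
        return None  # window would run past the end of the projections
    return start, end

window = warming_window(2030)
n_years = window[1] - window[0] + 1  # inclusive year count
late = warming_window(2090)
```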
- climakitae.explore.uncertainty.grab_multimodel_data(copt, alpha_sort=False)
Returns processed data from multiple CMIP6 models for uncertainty analysis.
Searches the CMIP6 data catalog for data from models that have specific ensemble member id in the historical and ssp370 runs. Preprocessing includes subsetting for specific location and dropping the member_id for easier analysis.
- climakitae.explore.uncertainty.weighted_temporal_mean(ds)
Weight by days in each month.
Function for calculating annual averages, adapted from NCAR: https://ncar.github.io/esds/posts/2021/yearly-averages-xarray/
- Parameters:
ds (xarray.DataArray)
- Returns:
obs_sum / ones_out (xarray.Dataset)
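Weighting by days in each month matters because a plain average of twelve monthly means over-weights short months. The sketch below computes a day-weighted annual mean for one year in plain Python. The helper `weighted_annual_mean` and the sample values are hypothetical; the real function operates on xarray objects and handles missing data, which this sketch does not.

```python
import calendar

def weighted_annual_mean(monthly, year):
    """monthly: 12 monthly means (Jan..Dec) for `year`, weighted by days per month."""
    days = [calendar.monthrange(year, m)[1] for m in range(1, 13)]
    return sum(d * v for d, v in zip(days, monthly)) / sum(days)

monthly = [0.0] * 12
monthly[1] = 29.0  # February
monthly[6] = 31.0  # July
leap = weighted_annual_mean(monthly, 2000)      # February has 29 days
non_leap = weighted_annual_mean(monthly, 2001)  # February has 28 days
constant = weighted_annual_mean([5.0] * 12, 2001)
```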
climakitae.explore.vulnerability module
Tools for CAVA vulnerability assessment pilot
- class climakitae.explore.vulnerability.CavaParams(**params)
Bases: Parameterized
Validates parameters for the cava_data function, and returns transformed parameters, if needed. Transformed parameters returned: ssp_selected, variable_type.
- approach = 'Time'
- batch_mode = False
- distr = 'gev'
- downscaling_method = 'Dynamical'
- export_method = 'both'
- file_format = 'NetCDF'
- get_names()
- heat_idx_threshold = None
- historical_data = 'Historical Climate'
- input_locations = None
- metric_calc = 'max'
- name = 'CavaParams'
- one_in_x = None
- percentile = None
- season = 'all'
- separate_files = False
- ssp_data = ['SSP3-7.0']
- time_end_year = 2010
- time_start_year = 1981
- units = 'Celsius'
- validate_params()
- variable = 'Air Temperature at 2m'
- warming_level = 1.5
- wrf_bias_adjust = True
- climakitae.explore.vulnerability.cava_data(input_locations, variable, units=None, approach='Time', downscaling_method='Dynamical', time_start_year=1981, time_end_year=2010, historical_data='Historical Climate', ssp_data=['SSP3-7.0'], warming_level=1.5, metric_calc='max', heat_idx_threshold=None, one_in_x=None, event_duration=(1, 'day'), percentile=None, season='all', wrf_bias_adjust=True, export_method='both', separate_files=True, file_format='NetCDF', batch_mode=False, distr='gev')
Retrieve, process, and export climate data based on inputs.
Designed for CAVA reports.
- Parameters:
input_locations (pandas.DataFrame) – Input locations containing ‘lat’ and ‘lon’ columns.
variable (str) – Type of climate variable to retrieve and calculate.
approach (str) – Approach to follow, default is “Time”.
downscaling_method (str) – Method of downscaling, default is “Dynamical”.
time_start_year (int, optional) – Starting year for data selection.
time_end_year (int, optional) – Ending year for data selection.
historical_data (str, optional) – Type of historical data, default is “Historical Climate”.
ssp_data (str, optional) – Shared Socioeconomic Pathway data, default is “SSP3-7.0”.
metric_calc (str, optional) – Metric calculation type (e.g., ‘mean’, ‘max’, ‘min’) for supported metrics. Default is “max”.
heat_idx_threshold (float) – Heat index threshold for counting events.
one_in_x (int, optional) – Return period for 1-in-X events.
percentile (int, optional) – Percentile for calculating “likely” event occurrence.
season (str, optional) – Season to subset time dimension on (e.g., ‘summer’, ‘winter’, ‘all’). Default is ‘all’.
units (str, optional) – Units for the retrieved data.
wrf_bias_adjust (bool, optional) – Flag to subset the WRF data for the bias-adjusted models. Default is True.
export_method (str, optional) – Export method, options are ‘raw’, ‘calculate’, ‘both’, default is ‘both’.
file_format (str, optional) – Export file format options.
separate_files (bool, optional) – Whether to separate climate variable information into separate files, default is True.
- Returns:
metric_data (xarray.DataArray) – Computed climate metrics for input locations.
- Raises:
ValueError – If input coordinates lack ‘lat’ and ‘lon’ columns or if ‘lat’/‘lon’ columns are not of type float64 or int64.
climakitae.explore.vulnerability_table module
- climakitae.explore.vulnerability_table.create_vul_table(example_loc, percentile, heat_idx_threshold, one_in_x)
Creates a vulnerability assessment table and exports the table to CSV.
climakitae.explore.warming module
Helper functions for performing analyses related to global warming levels, along with backend code for building the warming levels GUI
- class climakitae.explore.warming.WarmingLevelChoose(*args, **params)
Bases: DataParameters
- anom = 'Yes'
- name = 'WarmingLevelChoose'
- window = 15
- class climakitae.explore.warming.WarmingLevels(**params)
Bases: object
A container for all of the warming levels-related functionality: a pared-down Select panel, under “choose_data”; a “calculate” step where most of the waiting occurs; an optional “visualize” panel, as an instance of WarmingLevelVisualize; postage stamps from the visualize “main” tab, accessible via “gwl_snapshots”; and data sliced around the GWL window, retrieved from “sliced_data”.
- calculate()
- catalog_data = <xarray.DataArray ()> array(nan)
- find_warming_slice(level, gwl_times)
Find the warming slice data for the current level from the catalog data.
- gwl_snapshots = <xarray.DataArray ()> array(nan)
- sliced_data = <xarray.DataArray ()> array(nan)
- climakitae.explore.warming.clean_list(data, gwl_times)
- climakitae.explore.warming.clean_warm_data(warm_data)
Cleans the warming levels data in 3 parts:
1) Removing simulations where this warming level is not crossed. (centered_year)
2) Removing timestamps at the end to account for leap years (time)
3) Removing simulations that go past 2100 for its warming level window (all_sims)
- climakitae.explore.warming.get_sliced_data(y, level, years, months=array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]), window=15, anom='Yes')
Calculating warming level anomalies.
- Parameters:
y (DataArray) – Data to compute warming level anomalies, one simulation at a time via groupby
level (str) – Warming level amount
years (DataFrame) – Lookup table for the date a given simulation reaches each warming level.
months (np.ndarray) – Months to include in a warming level slice.
window (int, optional) – Number of years on each side of the central warming level date to include in the time window. Defaults to 15 years. For example, a 15-year window spans 15 years before the central warming level date and 15 years after it: if a warming level is reached in 2030, the window would be (2015, 2045).
scenario (str, one of "ssp370", "ssp585", "ssp245") – Shared Socioeconomic Pathway. Defaults to SSP 3-7.0.
- Returns:
anomaly_da (DataArray)
- climakitae.explore.warming.process_item(y)
- climakitae.explore.warming.relabel_axis(all_sims_dim)
Module contents
- climakitae.explore.warming_levels()
Top level alias for the WarmingLevels class. Typical way to call class.